TensorFlow estimator #840

Merged: 9 commits, Sep 7, 2018
7 changes: 6 additions & 1 deletion src/Microsoft.ML.Data/DataLoadSave/TrivialEstimator.cs
@@ -28,7 +28,12 @@ protected TrivialEstimator(IHost host, TTransformer transformer)
Transformer = transformer;
}

public TTransformer Fit(IDataView input) => Transformer;
public TTransformer Fit(IDataView input)
{
// Validate input schema.
Transformer.GetOutputSchema(input.Schema);
return Transformer;
}

public abstract SchemaShape GetOutputSchema(SchemaShape inputSchema);
}
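
Review note: the new `Fit` body still trains nothing, but it now fails fast when the input schema is unacceptable. Below is a minimal, self-contained sketch of that pattern. It assumes that `GetOutputSchema` throws on an invalid schema, as the `// Validate input schema.` comment suggests. The types `ISketchTransformer`, `RequireColumnTransformer`, and `SketchEstimator` are hypothetical stand-ins, not ML.NET types, and a schema is modeled as a plain collection of column names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a transformer: it can check a schema (modeled as column names).
public interface ISketchTransformer
{
    // Assumed contract: throws if the schema is unacceptable, otherwise returns the output columns.
    IReadOnlyCollection<string> GetOutputSchema(IReadOnlyCollection<string> inputSchema);
}

// Hypothetical transformer that requires one input column and appends one output column.
public sealed class RequireColumnTransformer : ISketchTransformer
{
    private readonly string _input;
    private readonly string _output;

    public RequireColumnTransformer(string input, string output)
    {
        _input = input;
        _output = output;
    }

    public IReadOnlyCollection<string> GetOutputSchema(IReadOnlyCollection<string> inputSchema)
    {
        var columns = new List<string>(inputSchema);
        if (!columns.Contains(_input))
            throw new ArgumentException($"Input column '{_input}' not found.", nameof(inputSchema));
        columns.Add(_output);
        return columns;
    }
}

// Hypothetical analogue of a trivial estimator: nothing is trained, but Fit validates first.
public sealed class SketchEstimator<TTransformer> where TTransformer : ISketchTransformer
{
    private readonly TTransformer _transformer;

    public SketchEstimator(TTransformer transformer) => _transformer = transformer;

    public TTransformer Fit(IReadOnlyCollection<string> inputSchema)
    {
        // The returned schema is discarded; only the side effect (throwing on bad input) matters.
        _transformer.GetOutputSchema(inputSchema);
        return _transformer;
    }
}

public static class Program
{
    public static void Main()
    {
        var estimator = new SketchEstimator<RequireColumnTransformer>(
            new RequireColumnTransformer("a", "c"));

        estimator.Fit(new[] { "a", "b" });      // succeeds: column "a" is present

        try
        {
            estimator.Fit(new[] { "A", "B" });  // rejected at Fit time rather than at Transform time
        }
        catch (ArgumentException e)
        {
            Console.WriteLine($"Rejected at Fit: {e.Message}");
        }
    }
}
```

The design point is that callers who only call `Fit` (for example, pipeline builders) see schema errors at the same stage they would with a trainable estimator.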
2 changes: 1 addition & 1 deletion src/Microsoft.ML.Data/Transforms/CopyColumnsTransform.cs
@@ -72,7 +72,7 @@ public SchemaShape GetOutputSchema(SchemaShape inputSchema)
var col = new SchemaShape.Column(column.Name, originalColumn.Kind, originalColumn.ItemType, originalColumn.IsKey, originalColumn.Metadata);
resultDic[column.Name] = col;
}
return new SchemaShape(resultDic.Values.ToArray());
return new SchemaShape(resultDic.Values);
}
}

664 changes: 375 additions & 289 deletions src/Microsoft.ML.TensorFlow/TensorflowTransform.cs

Large diffs are not rendered by default.

87 changes: 51 additions & 36 deletions src/Microsoft.ML.TensorFlow/doc.xml
@@ -7,26 +7,41 @@
Extracts hidden layers' values from a pre-trained Tensorflow model.
</summary>
<remarks>
The TensorflowTransform extracts the specified outputs from the operations computed on the graph (given the input(s)) using a pre-trained <a href="https://www.tensorflow.org">Tensorflow</a> model.
The transform takes as input the Tensorflow model together with the names of the inputs to the model and names of the operations for which output values will be extracted from the model.
<para>
The TensorflowTransform extracts the specified outputs from the operations computed on the graph (given the input(s)) using a pre-trained <a href="https://www.tensorflow.org">Tensorflow</a> model.
The transform takes as input the Tensorflow model together with the names of the inputs to the model and names of the operations for which output values will be extracted from the model.
</para>

This transform requires the <a href="https://dotnet.myget.org/feed/dotnet-core/package/nuget/Microsoft.ML.TensorFlow/0.5.0-preview-26830-5">Microsoft.ML.TensorFlow</a> nuget to be installed.

The TensorflowTransform makes the following assumptions regarding the input, output and processing of data.
<para>
This transform requires the <a href="https://dotnet.myget.org/feed/dotnet-core/package/nuget/Microsoft.ML.TensorFlow/0.5.0-preview-26830-5">Microsoft.ML.TensorFlow</a> NuGet package to be installed.
The TensorflowTransform makes the following assumptions regarding the input, output and processing of data.
</para>
<list type="number">
<item>
The transform currently accepts the <a href="https://www.tensorflow.org/mobile/prepare_models">frozen TensorFlow model</a> file as input.
<description>
The transform currently accepts the <a href="https://www.tensorflow.org/mobile/prepare_models">frozen TensorFlow model</a> file as input.
</description>
</item>
<item>
<description>The transform supports scoring only one example at a time.</description>
</item>
<item>
<description>The name of input column(s) should match the name of input(s) in Tensorflow model.</description>
</item>
<item>
<description>The name of each output column should match one of the operations in the Tensorflow graph.</description>
</item>
<item>
<description>Currently, float and double are the only acceptable data types for input/output.</description>
</item>
<item>The transform supports scoring only one example at a time.</item>
<item>The name of input column(s) should match the name of input(s) in Tensorflow model.</item>
<item>The name of each output column should match one of the operations in the Tensorflow graph.</item>
<item>Currently, float and double are the only acceptable data types for input/output.</item>
<item>
Upon success, the transform will introduce a new column in <see cref="IDataView"/> corresponding to each output column specified.
</item>
</list>

The inputs and outputs of a TensorFlow model can be obtained using the <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms/README.md#inspecting-graphs"><code>summarize_graph</code> tool</a>.
The inputs and outputs of a TensorFlow model can be obtained using the <a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms/README.md#inspecting-graphs">
<code>summarize_graph</code> tool
</a>.

</remarks>
</member>
@@ -39,41 +54,41 @@
{
ModelFile = model_location,
InputColumns = new[] { &quot;Input&quot; },
OutputColumn = &quot;Output&quot;
OutputColumns = new[] { &quot;Output&quot; }
}
</code>
</example>
<example>
<code language="csharp">
var pipeline = new LearningPipeline(seed: 1);
pipeline.Add(new TextLoader(dataFile).CreateFrom&lt;CifarData&gt;(useHeader: false));
pipeline.Add(new ImageLoader((&quot;ImagePath&quot;, &quot;ImageReal&quot;))
{
ImageFolder = imageFolder
});
var pipeline = new LearningPipeline(seed: 1);
pipeline.Add(new TextLoader(dataFile).CreateFrom&lt;CifarData&gt;(useHeader: false));
pipeline.Add(new ImageLoader((&quot;ImagePath&quot;, &quot;ImageReal&quot;))
{
ImageFolder = imageFolder
});

pipeline.Add(new ImageResizer((&quot;ImageReal&quot;, &quot;ImageCropped&quot;))
{
ImageHeight = imageHeight,
ImageWidth = imageWidth,
Resizing = ImageResizerTransformResizingKind.IsoCrop
});
pipeline.Add(new ImageResizer((&quot;ImageReal&quot;, &quot;ImageCropped&quot;))
{
ImageHeight = imageHeight,
ImageWidth = imageWidth,
Resizing = ImageResizerTransformResizingKind.IsoCrop
});

pipeline.Add(new ImagePixelExtractor((&quot;ImageCropped&quot;, &quot;Input&quot;))
{
UseAlpha = false,
InterleaveArgb = true
});
pipeline.Add(new ImagePixelExtractor((&quot;ImageCropped&quot;, &quot;Input&quot;))
{
UseAlpha = false,
InterleaveArgb = true
});

pipeline.Add(new TensorFlowScorer()
{
ModelFile = model_location,
InputColumns = new[] { &quot;Input&quot; },
OutputColumn = &quot;Output&quot;
});
pipeline.Add(new TensorFlowScorer()
{
ModelFile = model_location,
InputColumns = new[] { &quot;Input&quot; },
OutputColumns = new[] { &quot;Output&quot; }
});
</code>
</example>
</example>

</members>
</doc>
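
Review note: the remarks above list three checkable assumptions (input column names match model inputs, output column names match graph operations, only float and double are accepted). The sketch below shows one way such checks could be expressed. It is not the transform's actual validation code; `ColumnCheckSketch` and `FindProblems` are hypothetical names, and the set of graph operation names is assumed to be known already (for example from the `summarize_graph` tool).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnCheckSketch
{
    // Hypothetical helper: returns human-readable problems; the list is empty when all checks pass.
    public static List<string> FindProblems(
        ISet<string> graphOperations,           // operation names in the graph (assumed known)
        IDictionary<string, Type> dataColumns,  // column name -> element type of the input data
        IEnumerable<string> inputColumns,
        IEnumerable<string> outputColumns)
    {
        var problems = new List<string>();
        foreach (var name in inputColumns)
        {
            // Assumption: model inputs are themselves nodes in the graph.
            if (!graphOperations.Contains(name))
                problems.Add($"Input '{name}' is not a node in the graph.");

            if (!dataColumns.TryGetValue(name, out var type))
                problems.Add($"Input '{name}' is not a column in the data.");
            else if (type != typeof(float) && type != typeof(double))
                problems.Add($"Input '{name}' has unsupported type {type.Name}.");
        }

        foreach (var name in outputColumns.Where(n => !graphOperations.Contains(n)))
            problems.Add($"Output '{name}' is not an operation in the graph.");

        return problems;
    }

    public static void Main()
    {
        var ops = new HashSet<string> { "a", "b", "c" };
        var data = new Dictionary<string, Type> { ["a"] = typeof(float), ["b"] = typeof(string) };

        // Reports the string-typed input "b" and the unknown output "d".
        foreach (var problem in FindProblems(ops, data, new[] { "a", "b" }, new[] { "c", "d" }))
            Console.WriteLine(problem);
    }
}
```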
2 changes: 1 addition & 1 deletion src/Microsoft.ML/CSharpApi.cs
@@ -15795,7 +15795,7 @@ public sealed partial class TensorFlowScorer : Microsoft.ML.Runtime.EntryPoints.
public string[] InputColumns { get; set; }

/// <summary>
/// The name of the output
/// The name of the outputs
/// </summary>
public string[] OutputColumns { get; set; }

6 changes: 3 additions & 3 deletions test/BaselineOutput/Common/EntryPoints/core_manifest.json
@@ -21720,7 +21720,7 @@
"Type": "String",
"Desc": "This is the frozen protobuf model file. Please see https://www.tensorflow.org/mobile/prepare_models for more details.",
"Aliases": [
"ModelDir"
"model"
],
"Required": true,
"SortOrder": 0.0,
@@ -21754,9 +21754,9 @@
"Kind": "Array",
"ItemType": "String"
},
"Desc": "The name of the output",
"Desc": "The name of the outputs",
"Aliases": [
"output"
"outputs"
],
"Required": true,
"SortOrder": 2.0,
Original file line number Diff line number Diff line change
@@ -87,19 +87,9 @@ public void TensorFlowTransformMNISTConvTest()
HasHeader = true,
Column = new[]
{
new TextLoader.Column()
{
Name = "Label",
Source = new [] { new TextLoader.Range() { Min=0, Max=0} },
Type = DataKind.Num
},

new TextLoader.Column()
{
Name = "Placeholder",
Source = new [] { new TextLoader.Range() { Min=1, Max=784} },
Type = DataKind.Num
}
new TextLoader.Column("Label", DataKind.Num,0),
new TextLoader.Column("Placeholder", DataKind.Num,new []{new TextLoader.Range(1, 784) })

}
}, new MultiFileSource(dataPath));

@@ -149,7 +139,7 @@ public void TensorFlowTransformMNISTConvTest()

float max = -1;
int maxIndex = -1;
for(int i=0;i<prediction.PredictedLabels.Length; i++)
for (int i = 0; i < prediction.PredictedLabels.Length; i++)
{
if (prediction.PredictedLabels[i] > max)
{
167 changes: 167 additions & 0 deletions test/Microsoft.ML.Tests/TensorFlowEstimatorTests.cs
@@ -0,0 +1,167 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using Microsoft.ML.Core.Data;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Runtime.Data;
using Microsoft.ML.Runtime.Model;
using Microsoft.ML.Runtime.RunTests;
using Microsoft.ML.Runtime.Tools;
using Microsoft.ML.Transforms;
using System;
using System.Collections.Generic;
using System.IO;
using Xunit;
using Xunit.Abstractions;

namespace Microsoft.ML.Tests
{
public class TensorFlowEstimatorTests : TestDataPipeBase
{
private class TestData
{
[VectorType(4)]
public float[] a;
[VectorType(4)]
public float[] b;
}
private class TestDataSize
{
[VectorType(2)]
public float[] a;
[VectorType(2)]
public float[] b;
}
private class TestDataXY
{
[VectorType(4)]
public float[] A;
[VectorType(4)]
public float[] B;
}
private class TestDataDifferntType
{
[VectorType(4)]
public string[] a;
[VectorType(4)]
public string[] b;
}

public TensorFlowEstimatorTests(ITestOutputHelper output) : base(output)
{
}

[Fact]
void TestSimpleCase()
{
var modelFile = "model_matmul/frozen_saved_model.pb";

var dataView = ComponentCreation.CreateDataView(Env,
new List<TestData>(new TestData[] {
new TestData()
{
a = new[] { 1.0f, 2.0f,3.0f, 4.0f },
b = new[] { 1.0f, 2.0f,3.0f, 4.0f }
},
new TestData()
{
a = new[] { 2.0f, 2.0f,2.0f, 2.0f },
b = new[] { 3.0f, 3.0f,3.0f, 3.0f }
}
}));

var xyData = new List<TestDataXY> { new TestDataXY() { A = new float[4], B = new float[4] } };
var stringData = new List<TestDataDifferntType> { new TestDataDifferntType() { a = new string[4], b = new string[4] } };
var sizeData = new List<TestDataSize> { new TestDataSize() { a = new float[2], b = new float[2] } };
var pipe = new TensorFlowEstimator(Env, modelFile, new[] { "a", "b" }, new[] { "c" });

var invalidDataWrongNames = ComponentCreation.CreateDataView(Env, xyData);
var invalidDataWrongTypes = ComponentCreation.CreateDataView(Env, stringData);
var invalidDataWrongVectorSize = ComponentCreation.CreateDataView(Env, sizeData);
TestEstimatorCore(pipe, dataView, invalidInput: invalidDataWrongNames);
TestEstimatorCore(pipe, dataView, invalidInput: invalidDataWrongTypes);

pipe.GetOutputSchema(SchemaShape.Create(invalidDataWrongVectorSize.Schema));
try
{
pipe.Fit(invalidDataWrongVectorSize);
Assert.False(true);
}
catch (ArgumentOutOfRangeException) { }
catch (InvalidOperationException) { }
}

[Fact]
void TestOldSavingAndLoading()
{
var modelFile = "model_matmul/frozen_saved_model.pb";

var dataView = ComponentCreation.CreateDataView(Env,
new List<TestData>(new TestData[] {
new TestData()
{
a = new[] { 1.0f, 2.0f, 3.0f, 4.0f },
b = new[] { 1.0f, 2.0f, 3.0f, 4.0f }
},
new TestData()
{
a = new[] { 2.0f, 2.0f, 2.0f, 2.0f },
b = new[] { 3.0f, 3.0f, 3.0f, 3.0f }
},
new TestData()
{
a = new[] { 5.0f, 6.0f, 10.0f, 12.0f },
b = new[] { 10.0f, 8.0f, 6.0f, 6.0f }
}
}));
var est = new TensorFlowEstimator(Env, modelFile, new[] { "a", "b" }, new[] { "c" });
var transformer = est.Fit(dataView);
var result = transformer.Transform(dataView);
var resultRoles = new RoleMappedData(result);
using (var ms = new MemoryStream())
{
TrainUtils.SaveModel(Env, Env.Start("saving"), ms, null, resultRoles);
ms.Position = 0;
var loadedView = ModelFileUtils.LoadTransforms(Env, dataView, ms);
ValidateTensorFlowTransformer(loadedView);
}
}

[Fact]
void TestCommandLine()
{
using (var env = new TlcEnvironment())
{
Assert.Equal(Maml.Main(new[] { @"showschema loader=Text{col=a:R4:0-3 col=b:R4:0-3} xf=TFTransform{inputs=a inputs=b outputs=c model={model_matmul/frozen_saved_model.pb}} in=f:\2.txt" }), (int)0);
}
}

private void ValidateTensorFlowTransformer(IDataView result)
{
result.Schema.TryGetColumnIndex("a", out int ColA);
result.Schema.TryGetColumnIndex("b", out int ColB);
result.Schema.TryGetColumnIndex("c", out int ColC);
using (var cursor = result.GetRowCursor(x => true))
{
VBuffer<float> avalue = default;
VBuffer<float> bvalue = default;
VBuffer<float> cvalue = default;

var aGetter = cursor.GetGetter<VBuffer<float>>(ColA);
var bGetter = cursor.GetGetter<VBuffer<float>>(ColB);
var cGetter = cursor.GetGetter<VBuffer<float>>(ColC);
while (cursor.MoveNext())
{
aGetter(ref avalue);
bGetter(ref bvalue);
cGetter(ref cvalue);
Assert.Equal(avalue.Values[0] * bvalue.Values[0] + avalue.Values[1] * bvalue.Values[2], cvalue.Values[0]);
Assert.Equal(avalue.Values[0] * bvalue.Values[1] + avalue.Values[1] * bvalue.Values[3], cvalue.Values[1]);
Assert.Equal(avalue.Values[2] * bvalue.Values[0] + avalue.Values[3] * bvalue.Values[2], cvalue.Values[2]);
Assert.Equal(avalue.Values[2] * bvalue.Values[1] + avalue.Values[3] * bvalue.Values[3], cvalue.Values[3]);
}
}
}
}
}
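
Review note: the four assertions in `ValidateTensorFlowTransformer` read as a 2x2 matrix product, with `a`, `b`, and `c` each holding a matrix in row-major order. The sketch below reproduces that arithmetic for the first test row, assuming the `model_matmul` graph does exactly this multiplication. `MatMulSketch` and `Multiply2x2` are hypothetical names used only for illustration.

```csharp
using System;

public static class MatMulSketch
{
    // Multiplies two 2x2 matrices stored row-major in arrays of length 4.
    // The four terms mirror the four Assert.Equal lines in the test.
    public static float[] Multiply2x2(float[] a, float[] b)
    {
        return new[]
        {
            a[0] * b[0] + a[1] * b[2],
            a[0] * b[1] + a[1] * b[3],
            a[2] * b[0] + a[3] * b[2],
            a[2] * b[1] + a[3] * b[3],
        };
    }

    public static void Main()
    {
        // First row of the test data: a = b = [1, 2, 3, 4].
        var c = Multiply2x2(new[] { 1f, 2f, 3f, 4f }, new[] { 1f, 2f, 3f, 4f });
        Console.WriteLine(string.Join(", ", c));   // prints: 7, 10, 15, 22
    }
}
```

Under the same assumption, the second test row (`a` all 2, `b` all 3) gives 12 in every position, and the third row gives 86, 76, 172, 152.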