Skip to content

Add a project for functional tests without visibility into internals of ML.NET #2470

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Feb 9, 2019

Conversation

rogancarr
Copy link
Contributor

This PR adds a new project, Microsoft.ML.Functional.Tests for adding end-to-end scenario tests for ML.NET. This project does not have visibility into the main library, and is not strongly named, so may not be added.

Two tests were also moved from Microsoft.ML.Tests into this project:

Fixes #2306

@codecov
Copy link

codecov bot commented Feb 7, 2019

Codecov Report

Merging #2470 into master will increase coverage by <.01%.
The diff coverage is 90.32%.

@@            Coverage Diff             @@
##           master    #2470      +/-   ##
==========================================
+ Coverage   71.21%   71.22%   +<.01%     
==========================================
  Files         786      787       +1     
  Lines      140960   141038      +78     
  Branches    16110    16116       +6     
==========================================
+ Hits       100385   100450      +65     
- Misses      36109    36120      +11     
- Partials     4466     4468       +2
Flag Coverage Δ
#Debug 71.22% <90.32%> (ø) ⬆️
#production 67.56% <0%> (-0.01%) ⬇️
#test 85.31% <100%> (+0.03%) ⬆️

<NativeAssemblyReference Condition="'$(OS)' != 'Windows_NT'" Include="tensorflow_framework" />
</ItemGroup>
<ItemGroup>
<PackageReference Include="Microsoft.ML.TensorFlow.TestModels" Version="0.0.7-test" />
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Zeeshan is in the process of publishing version 0.0.8-test. We should extract this value into a common place here:

<!-- Test-only Dependencies -->
<PropertyGroup>
<BenchmarkDotNetVersion>0.11.3</BenchmarkDotNetVersion>
<MicrosoftMLTestModelsPackageVersion>0.0.3-test</MicrosoftMLTestModelsPackageVersion>
</PropertyGroup>

Same for the Onnx TestModels below. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created global build variables for those and set them to the latest versions across the projects (they were out of sync). Will resolve once I see that the builds & tests pass.


In reply to: 255165705 [](ancestors = 255165705)


<PropertyGroup>
<AssemblyName>Microsoft.ML.Functional.Tests</AssemblyName>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need unsafe blocks (at least not right now). Let's leave this off for now. #Resolved

<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<AssemblyName>Microsoft.ML.Functional.Tests</AssemblyName>
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to set AssemblyName in new .csproj files. This value gets defaulted to the file's name. You can remove this line. #Resolved

var model = pipeline.Fit(train);

var scoredTest = model.Transform(test);
var metrics = mlContext.Regression.Evaluate(scoredTest);
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be asserting the metrics are in a certain range? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call on checking valid ranges. I added a Common library to add those sorts of checks to.


In reply to: 255167478 [](ancestors = 255167478)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you missed checking that new function in.


In reply to: 255228907 [](ancestors = 255228907,255167478)

/// and configures the scorer (or more precisely instantiates a new scorer over the same predictor)
/// with some threshold derived from that.
/// </summary>
[Fact(Skip = "Blocked by issue #2465")]
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test can still be run, right? Since you are commenting out the code that is being blocked. Maybe remove the Skip here, and add a TODO to the commented out line below. #Resolved

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start @rogancarr. Thanks for doing this. I'm glad we've already found an API issue.

Just a few items to clean up, then let's get this in.

<AssemblyName>Microsoft.ML.Functional.Tests</AssemblyName>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
<!-- We are turning off strong naming to ensure we never add `InternalsVisibleTo` for these tests -->
<SignAssembly>false</SignAssembly>
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably need to set PublicSign to false as well, to unblock the Mac and Linux builds. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip! I couldn't figure out why those builds broke!


In reply to: 255168732 [](ancestors = 255168732)

@@ -17,7 +17,12 @@ public static class DatasetUtils
/// Downloads the housing dataset from the ML.NET repo.
/// </summary>
public static string DownloadHousingRegressionDataset()
=> Download("https://github.com/raw/dotnet/machinelearning/024bd4452e1d3660214c757237a19d6123f951ca/test/data/housing.txt", "housing.txt");
{
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change necessary in this PR?

I don't think our tests should be calling this method - BTW #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is necessary if we want to use LoadHousingRegressionDataset in our tests because there is a race condition on the file lock, so tests will sometimes fail.

Can you explain a bit more why you don't want to use this in tests? Is it that we don't want to use the SamplesUtils project in Tests, or that we shouldn't be downloading data for tests?

If it's the former, check out issue #2420 . We're going to make this a standalone Datasets/ (or some such name) outside of the NuGet project to use in Samples and Tests.

If it's the latter, we are already downloading datasets for tests. But now that I mention it, we can actually add an optional input to LoadHousingRegressionDataset and friends that can load the file from the tests/data/ directory. I'll add this capability now.


In reply to: 255236393 [](ancestors = 255236393)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.


In reply to: 255256550 [](ancestors = 255256550,255236393)

Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 2 datasets our tests should use.

  1. Datasets checked into test\data.
  2. Datasets that are downloaded into test\data\external through the DownloadExternalTestFiles build step.

We shouldn't have the test code be downloading random things. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. This has been updated to use the local dataset in the test\data folder. I'll chase down any other tests using these Download commands as I migrate API-Scenario tests to Functional.Tests/


In reply to: 255264819 [](ancestors = 255264819)

<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<AllowUnsafeBlocks>false</AllowUnsafeBlocks>
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need this line at all. It can be removed. #Resolved

Microsoft.ML.sln Outdated
@@ -928,6 +930,18 @@ Global
{5E920CAC-5A28-42FB-936E-49C472130953}.Release-Intrinsics|Any CPU.Build.0 = Release-Intrinsics|Any CPU
{5E920CAC-5A28-42FB-936E-49C472130953}.Release-netfx|Any CPU.ActiveCfg = Release-netfx|Any CPU
{5E920CAC-5A28-42FB-936E-49C472130953}.Release-netfx|Any CPU.Build.0 = Release-netfx|Any CPU
{CFED9F0C-FF81-4C96-8D5E-0436264CA7B5}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{CFED9F0C-FF81-4C96-8D5E-0436264CA7B5}.Debug|Any CPU.Build.0 = Debug|Any CPU
{CFED9F0C-FF81-4C96-8D5E-0436264CA7B5}.Debug-Intrinsics|Any CPU.ActiveCfg = Debug|Any CPU
Copy link
Member

@eerhardt eerhardt Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you manually add these? They look wrong to me.

For example, this line should be:

  {CFED9F0C-FF81-4C96-8D5E-0436264CA7B5}.Debug-Intrinsics|Any CPU.ActiveCfg = Debug-Intrinsics|Any CPU

and the -netfx ones should have -netfx on the right side too.

This is probably why CI is failing.
#Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Autogenerated. I'll fix these up.


In reply to: 255266128 [](ancestors = 255266128)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied the lines from Microsoft.ML.Tests/ and swapped in the correct GUID.


In reply to: 255266430 [](ancestors = 255266430,255266128)

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks Rogan.

<NativeAssemblyReference Include="LdaNative" />
<NativeAssemblyReference Include="SymSgdNative" />
<NativeAssemblyReference Include="MklProxyNative" />
<NativeAssemblyReference Include="MklImports" />
Copy link
Member

@sfilipi sfilipi Feb 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are those all needed a this point in time? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they will be when we're all done.


In reply to: 255278694 [](ancestors = 255278694)

/// <summary>
/// Reconfigurable predictions: The following should be possible: A user trains a binary classifier,
/// and through the test evaluator gets a PR curve, the based on the PR curve picks a new threshold
/// and configures the scorer (or more precisely instantiates a new scorer over the same predictor)
Copy link
Member

@sfilipi sfilipi Feb 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

predictor [](start = 97, length = 9)

model Parameters #Resolved

[Fact]
public void ReconfigurablePrediction()
{
var mlContext = new MLContext(seed: 789);
Copy link
Member

@sfilipi sfilipi Feb 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seed: 78 [](start = 42, length = 8)

this intentional? #Resolved

var mlContext = new MLContext(seed: 789);

// Get the dataset, create a train and test
var dataset = DatasetUtils.LoadHousingRegressionDataset(mlContext, BaseTestClass.GetDataPath("housing.txt"));
Copy link
Member

@sfilipi sfilipi Feb 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LoadHousingRegressionDataset [](start = 39, length = 28)

just reference the housing dataset like the other tests are doing. Let's leave DatasetUtils for the samples. #Resolved

Copy link
Member

@sfilipi sfilipi Feb 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see, you are defining the TextLoader etc in there. I believe there is a separate test file for those, in TestDatasets.


In reply to: 255278973 [](ancestors = 255278973)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I went ahead and added this just like the other tests do. It looks like we'll need to do some refactoring so that we can push most of the loader and options into the dataset object as well, but that can wait for later.


In reply to: 255279066 [](ancestors = 255279066,255278973)

@sfilipi
Copy link
Member

sfilipi commented Feb 9, 2019

please take look at the comments about not using DatasetUtils in tests. Thanks.


In reply to: 461637125 [](ancestors = 461637125)

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@rogancarr rogancarr merged commit 7dfadca into dotnet:master Feb 9, 2019
@rogancarr rogancarr deleted the 2306_FunctionalTests branch February 9, 2019 06:44
@ghost ghost locked as resolved and limited conversation to collaborators Mar 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants