-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add a project for functional tests without visibility into internals of ML.NET #2470
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2470 +/- ##
==========================================
+ Coverage 71.21% 71.22% +<.01%
==========================================
Files 786 787 +1
Lines 140960 141038 +78
Branches 16110 16116 +6
==========================================
+ Hits 100385 100450 +65
- Misses 36109 36120 +11
- Partials 4466 4468 +2
|
<NativeAssemblyReference Condition="'$(OS)' != 'Windows_NT'" Include="tensorflow_framework" /> | ||
</ItemGroup> | ||
<ItemGroup> | ||
<PackageReference Include="Microsoft.ML.TensorFlow.TestModels" Version="0.0.7-test" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Zeeshan is in the process of publishing version 0.0.8-test
. We should extract this value into a common place here:
machinelearning/build/Dependencies.props
Lines 42 to 46 in 834e471
<!-- Test-only Dependencies --> | |
<PropertyGroup> | |
<BenchmarkDotNetVersion>0.11.3</BenchmarkDotNetVersion> | |
<MicrosoftMLTestModelsPackageVersion>0.0.3-test</MicrosoftMLTestModelsPackageVersion> | |
</PropertyGroup> |
Same for the Onnx TestModels below. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've created global build variables for those and set them to the latest versions across the projects (they were out of sync). Will resolve once I see that the builds & tests pass.
In reply to: 255165705 [](ancestors = 255165705)
|
||
<PropertyGroup> | ||
<AssemblyName>Microsoft.ML.Functional.Tests</AssemblyName> | ||
<AllowUnsafeBlocks>true</AllowUnsafeBlocks> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need unsafe
blocks (at least not right now). Let's leave this off for now. #Resolved
<Project Sdk="Microsoft.NET.Sdk"> | ||
|
||
<PropertyGroup> | ||
<AssemblyName>Microsoft.ML.Functional.Tests</AssemblyName> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to set AssemblyName
in new .csproj files. This value gets defaulted to the file's name. You can remove this line. #Resolved
var model = pipeline.Fit(train); | ||
|
||
var scoredTest = model.Transform(test); | ||
var metrics = mlContext.Regression.Evaluate(scoredTest); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be asserting the metrics are in a certain range? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on checking valid ranges. I added a Common library to add those sorts of checks to.
In reply to: 255167478 [](ancestors = 255167478)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you missed checking that new function in.
In reply to: 255228907 [](ancestors = 255228907,255167478)
/// and configures the scorer (or more precisely instantiates a new scorer over the same predictor) | ||
/// with some threshold derived from that. | ||
/// </summary> | ||
[Fact(Skip = "Blocked by issue #2465")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test can still be run, right? Since you are commenting out the code that is being blocked. Maybe remove the Skip here, and add a TODO to the commented out line below. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start @rogancarr. Thanks for doing this. I'm glad we've already found an API issue.
Just a few items to clean up, then let's get this in.
<AssemblyName>Microsoft.ML.Functional.Tests</AssemblyName> | ||
<AllowUnsafeBlocks>true</AllowUnsafeBlocks> | ||
<!-- We are turning off strong naming to ensure we never add `InternalsVisibleTo` for these tests --> | ||
<SignAssembly>false</SignAssembly> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably need to set PublicSign
to false as well, to unblock the Mac and Linux builds. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip! I couldn't figure out why those builds broke!
In reply to: 255168732 [](ancestors = 255168732)
@@ -17,7 +17,12 @@ public static class DatasetUtils | |||
/// Downloads the housing dataset from the ML.NET repo. | |||
/// </summary> | |||
public static string DownloadHousingRegressionDataset() | |||
=> Download("https://github.com/raw/dotnet/machinelearning/024bd4452e1d3660214c757237a19d6123f951ca/test/data/housing.txt", "housing.txt"); | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change necessary in this PR?
I don't think our tests should be calling this method - BTW #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is necessary if we want to use LoadHousingRegressionDataset
in our tests because there is a race condition on the file lock, so tests will sometimes fail.
Can you explain a bit more why you don't want to use this in tests? Is it that we don't want to use the SamplesUtils project in Tests, or that we shouldn't be downloading data for tests?
If it's the former, check out issue #2420 . We're going to make this a standalone Datasets/ (or some such name) outside of the NuGet project to use in Samples and Tests.
If it's the latter, we are already downloading datasets for tests. But now that I mention it, we can actually add an optional input to LoadHousingRegressionDataset and friends that can load the file from the tests/data/ directory. I'll add this capability now.
In reply to: 255236393 [](ancestors = 255236393)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 2 datasets our tests should use.
- Datasets checked into
test\data
. - Datasets that are downloaded into
test\data\external
through theDownloadExternalTestFiles
build step.
We shouldn't have the test code be downloading random things. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. This has been updated to use the local dataset in the test\data
folder. I'll chase down any other tests using these Download
commands as I migrate API-Scenario tests to Functional.Tests/
In reply to: 255264819 [](ancestors = 255264819)
<Project Sdk="Microsoft.NET.Sdk"> | ||
|
||
<PropertyGroup> | ||
<AllowUnsafeBlocks>false</AllowUnsafeBlocks> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't need this line at all. It can be removed. #Resolved
Microsoft.ML.sln
Outdated
@@ -928,6 +930,18 @@ Global | |||
{5E920CAC-5A28-42FB-936E-49C472130953}.Release-Intrinsics|Any CPU.Build.0 = Release-Intrinsics|Any CPU | |||
{5E920CAC-5A28-42FB-936E-49C472130953}.Release-netfx|Any CPU.ActiveCfg = Release-netfx|Any CPU | |||
{5E920CAC-5A28-42FB-936E-49C472130953}.Release-netfx|Any CPU.Build.0 = Release-netfx|Any CPU | |||
{CFED9F0C-FF81-4C96-8D5E-0436264CA7B5}.Debug|Any CPU.ActiveCfg = Debug|Any CPU | |||
{CFED9F0C-FF81-4C96-8D5E-0436264CA7B5}.Debug|Any CPU.Build.0 = Debug|Any CPU | |||
{CFED9F0C-FF81-4C96-8D5E-0436264CA7B5}.Debug-Intrinsics|Any CPU.ActiveCfg = Debug|Any CPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you manually add these? They look wrong to me.
For example, this line should be:
{CFED9F0C-FF81-4C96-8D5E-0436264CA7B5}.Debug-Intrinsics|Any CPU.ActiveCfg = Debug-Intrinsics|Any CPU
and the -netfx
ones should have -netfx
on the right side too.
This is probably why CI is failing.
#Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied the lines from Microsoft.ML.Tests/ and swapped in the correct GUID.
In reply to: 255266430 [](ancestors = 255266430,255266128)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks Rogan.
<NativeAssemblyReference Include="LdaNative" /> | ||
<NativeAssemblyReference Include="SymSgdNative" /> | ||
<NativeAssemblyReference Include="MklProxyNative" /> | ||
<NativeAssemblyReference Include="MklImports" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are those all needed a this point in time? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// <summary> | ||
/// Reconfigurable predictions: The following should be possible: A user trains a binary classifier, | ||
/// and through the test evaluator gets a PR curve, the based on the PR curve picks a new threshold | ||
/// and configures the scorer (or more precisely instantiates a new scorer over the same predictor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
predictor [](start = 97, length = 9)
model Parameters #Resolved
[Fact] | ||
public void ReconfigurablePrediction() | ||
{ | ||
var mlContext = new MLContext(seed: 789); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seed: 78 [](start = 42, length = 8)
this intentional? #Resolved
var mlContext = new MLContext(seed: 789); | ||
|
||
// Get the dataset, create a train and test | ||
var dataset = DatasetUtils.LoadHousingRegressionDataset(mlContext, BaseTestClass.GetDataPath("housing.txt")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LoadHousingRegressionDataset [](start = 39, length = 28)
just reference the housing dataset like the other tests are doing. Let's leave DatasetUtils for the samples. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see, you are defining the TextLoader etc in there. I believe there is a separate test file for those, in TestDatasets.
In reply to: 255278973 [](ancestors = 255278973)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I went ahead and added this just like the other tests do. It looks like we'll need to do some refactoring so that we can push most of the loader and options into the dataset object as well, but that can wait for later.
In reply to: 255279066 [](ancestors = 255279066,255278973)
please take look at the comments about not using DatasetUtils in tests. Thanks. In reply to: 461637125 [](ancestors = 461637125) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR adds a new project,
Microsoft.ML.Functional.Tests
for adding end-to-end scenario tests for ML.NET. This project does not have visibility into the main library, and is not strongly named, so may not be added.Two tests were also moved from
Microsoft.ML.Tests
into this project:CrossValidation
: Migrated with additional testsReconfigurablePrediction
: Migrated but marked asSkip
because Issue Cannot set the threshold on a binary predictor #2465 prevents us from completing the scenario.Fixes #2306