-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Create model file V1 scenario tests #2899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create model file V1 scenario tests #2899
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2899 +/- ##
==========================================
- Coverage 72.53% 72.52% -0.01%
==========================================
Files 805 806 +1
Lines 144243 144500 +257
Branches 16175 16191 +16
==========================================
+ Hits 104620 104800 +180
- Misses 35222 35292 +70
- Partials 4401 4408 +7
|
Waiting for #2858 to be checked in; will incorporate those tests. |
internal sealed class ScoreColumn | ||
{ | ||
public float Score { get; set; } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this go somewhere, not on a separate file? Where the other data models are? There might even be one already. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I've consolidated all the single-column classes into one file. I had been following the Apache REEF style by habit.
In reply to: 265712606 [](ancestors = 265712606)
public void DetermineNugetVersionFromModel() | ||
{ | ||
var modelFile = GetDataPath(@"backcompat" + Path.DirectorySeparatorChar + @"keep-model.zip"); | ||
var versionFileName = @"TrainingInfo\Version.txt"; // Can't find this cross plat. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\ [](start = 48, length = 1)
Path.DirectorySeparatorChar? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strangely enough, since the zip archive was made on Windows, we need to specify a '\' here or the tests will fail on Linux and Mac. I've updated this cryptic comment to explain it better.
In reply to: 265712860 [](ancestors = 265712860)
[Fact] | ||
public void DetermineNugetVersionFromModel() | ||
{ | ||
var modelFile = GetDataPath(@"backcompat" + Path.DirectorySeparatorChar + @"keep-model.zip"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ [](start = 40, length = 1)
don't think you need the @
you could also go with interpoaltion:
$"backcompat{Path.DirectorySeparatorChar}keep-model.zip" #Resolved
|
||
var modelPath = DeleteOutputPath("fitPipelineSaveModelAndPredict.zip"); | ||
// Save model to a file. | ||
using (var file = File.Create(modelPath)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create [](start = 35, length = 6)
There are overloads that take the file path, would this be an appropriate place to use these? #Resolved
/// </summary> | ||
internal sealed class FeatureContributionOutput | ||
{ | ||
public float[] FeatureContributions { get; set; } | ||
} | ||
|
||
/// <summary> | ||
/// A class to hold the Score column. | ||
/// A class to hold a feature column. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feature [](start = 26, length = 7)
Is it a feature column or a score column? #Resolved
// Load model from a file. | ||
ITransformer serializedModel; | ||
using (var file = File.OpenRead(modelPath)) | ||
serializedModel = mlContext.Model.Load(file, out var serializedSchema); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serializedSchema [](start = 69, length = 16)
You can verify that this is the same as data.Schema
. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -30,6 +32,84 @@ private class InputData | |||
public float[] Features { get; set; } | |||
} | |||
|
|||
/// <summary> | |||
/// Model Files: The (minimum) nuget version can be found in the model file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
model file [](start = 73, length = 10)
Is this an old model file or a current one? If it's current, shouldn't we create the model on the fly (similar to the other scenario below) instead of reading a static keep-model.zip file?
/// 1. I can train a model and save it to a file, including transforms. | ||
/// 2. Training and prediction happen in different processes (or even different machines). | ||
/// The actual test will not run in different processes, but will simulate the idea that the | ||
/// "communication pipe" is just a serialized model of some form. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not necessary for tests files, but as FYI lists need to be in xml style as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As laid out in #2498 , we need scenarios to cover the Model Files functionality we want fully supported in V1.
This PR adds tests for the following scenarios:
Fixes #2896