Simple API to go from a trainer to something that can make predictions #560
Comments
Let's try to break down what's good about the [...]
In this example we see the caching, but there are other similar 'smarts' that happen behind the scenes: auto-normalization and auto-calibration. In the absence of a 'smart pipeline' component, users must know to do this themselves. Also, you are right about [...]
The [...] But maybe we can get rid of the [...] For example, make predictors more 'rigid', less 'flexible', by essentially auto-generating [...] There is still a question of what we do with the 'smarts'. Maybe some form of [...]
Of course, we could keep a simpler extension method [...] Anyway, to recap, I see two separate issues: [...]
We talked some more about it; we have agreed on many things and still disagree on some. The Train method will produce an object capable of making predictions. A simple way to do this is to compose together the [...] It will be capable of predictions, but it will also allow the user to inspect the individual pieces and, as necessary, manufacture a new 'prediction model' with some tweaks. The still-unresolved questions are: [...]
What obviously can NOT happen is, we cannot make [...]
After the changes to the API, the example now looks akin to:

    var trainer = new LinearClassificationTrainer(env, new LinearClassificationTrainer.Arguments { }, "Features", "Label");
    var model = trainer.Fit(trainData);
    var predictor = model.MakePredictionFunction<ExampleClass, PredictionClass>();
    PredictionClass prediction = predictor.Predict(new ExampleClass(...));

This eliminates the intermediate concepts of [...] We still have the distinction between [...] I think at this point we should be closing this issue. @eerhardt, what are your thoughts?
(Somehow this slipped through my radar.) Yes, I believe the current API sufficiently solves this issue. Closing.
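For readers skimming the thread, the shape agreed on above (fit a trainer, get back a model that can make predictions but still exposes its pieces) can be illustrated with a small self-contained sketch. Everything here is a toy stand-in and an assumption for illustration: `ToyTrainer`, `ToyModel`, `Example`, `Prediction`, and `WithBias` are hypothetical names, not ML.NET types, and the nearest-centroid training rule is chosen only because it is short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-ins only: none of these types are ML.NET classes.
public sealed class Example
{
    public float[] Features { get; }
    public bool Label { get; }
    public Example(float[] features, bool label = false)
    {
        Features = features;
        Label = label;
    }
}

public sealed class Prediction
{
    public float Score { get; }
    public bool PredictedLabel => Score > 0;
    public Prediction(float score) { Score = score; }
}

// The trained model keeps its pieces (weights and bias) visible for inspection.
public sealed class ToyModel
{
    public float[] Weights { get; }
    public float Bias { get; }
    public ToyModel(float[] weights, float bias) { Weights = weights; Bias = bias; }

    // "Manufacture a new prediction model with some tweaks": returns a copy with a different bias.
    public ToyModel WithBias(float bias) => new ToyModel(Weights, bias);

    // Returns a function that scores one example with a linear rule.
    public Func<Example, Prediction> MakePredictionFunction()
    {
        return example =>
        {
            float score = Bias;
            for (int i = 0; i < Weights.Length; i++)
                score += Weights[i] * example.Features[i];
            return new Prediction(score);
        };
    }
}

public sealed class ToyTrainer
{
    // Nearest-centroid rule: weights point from the negative mean to the positive mean.
    // Assumes the data contains at least one example of each class.
    public ToyModel Fit(IReadOnlyList<Example> data)
    {
        int dim = data[0].Features.Length;
        float[] posMean = Mean(data.Where(e => e.Label), dim);
        float[] negMean = Mean(data.Where(e => !e.Label), dim);
        var weights = new float[dim];
        float bias = 0;
        for (int i = 0; i < dim; i++)
        {
            weights[i] = posMean[i] - negMean[i];
            bias -= weights[i] * (posMean[i] + negMean[i]) / 2;
        }
        return new ToyModel(weights, bias);
    }

    private static float[] Mean(IEnumerable<Example> rows, int dim)
    {
        var sum = new float[dim];
        int count = 0;
        foreach (var row in rows)
        {
            for (int i = 0; i < dim; i++) sum[i] += row.Features[i];
            count++;
        }
        for (int i = 0; i < dim; i++) sum[i] /= count;
        return sum;
    }
}

public static class Demo
{
    public static bool Run()
    {
        var trainData = new List<Example>
        {
            new Example(new[] { 2f, 2f }, true),
            new Example(new[] { 3f, 3f }, true),
            new Example(new[] { -2f, -2f }, false),
            new Example(new[] { -3f, -3f }, false),
        };
        ToyModel model = new ToyTrainer().Fit(trainData);
        Func<Example, Prediction> predict = model.MakePredictionFunction();
        return predict(new Example(new[] { 1f, 1f })).PredictedLabel;
    }

    public static void Main() => Console.WriteLine(Run());
}
```

The point of the sketch is the calling pattern rather than the learner: the user touches three concepts (trainer, model, prediction function), and the model remains an ordinary object whose parts can be read or swapped.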
With the API proposal change in #371, the current proposed API looks something like: [...]
Compare and contrast the similar code with what we have in the LearningPipeline API: [...]
You can see the proposed API has what feels like boilerplate code (create a cache data view, create examples, call train, get a scorer, create an engine), whereas the LearningPipeline API simplifies this into roughly one call: call train, get something that can make predictions.
I don't think our simplest API example should have so many concepts in it. In my mind, the main concepts a new user needs to know about are: [...]
However, in the current proposed API, they also need to think/learn about:
[...] RoleMappedData, but the method is named CreateExamples.
[...] GetScorer, which returns an IDataView that we call scoredData.
[...] scoring as implied by the method name: GetScorer?
In my opinion, this API is too complex and non-intuitive for first-time users. We should investigate ways to make it simpler and see if we can come up with a design with fewer concepts to learn when first interacting with ML.NET.
/cc @ericstj @TomFinley @Zruty0 @terrajobst