Cannot set the threshold on a binary predictor #2465
One suggestion is to make the threshold settable on the predictor. Something like this?

```csharp
var model = mlContext.BinaryClassification.Trainers.LogisticRegression().Fit(train);
var predictor = model.LastTransformer;
predictor.Threshold = 0.01;
// or
predictor.SetThreshold(0.01);
predictor.SetThresholdColumn("Bananas");
```
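Whatever the final API shape, the effect of the threshold is simply where a calibrated probability gets cut into a boolean decision. A minimal plain-C# sketch of that idea (no ML.NET types; all names here are illustrative, not the library's API):

```csharp
using System;

class ThresholdDemo
{
    // Map a calibrated probability to a boolean prediction.
    // Lowering the threshold makes the positive class easier to predict,
    // which is the point of the skewed-data scenario discussed in this thread.
    public static bool Predict(double probability, double threshold) =>
        probability >= threshold;

    static void Main()
    {
        double p = 0.07; // model says "7% chance of positive" (made-up value)
        Console.WriteLine(Predict(p, 0.5));  // default threshold: negative
        Console.WriteLine(Predict(p, 0.01)); // lowered threshold: positive
    }
}
```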
@Ivanidzo4ka I am pulling these out of Tests/ and into a new Functional.Tests project that does not have internal visibility, since this test should rely only on the public surface.
YAY! The first value of these tests is now realized. 😄
For anyone looking for this functionality, there is a workaround discussed in #2645.
Note that we are doing this exact scenario in our samples:

```csharp
// The dataset we have is skewed, as there are many more non-spam messages than spam messages.
// While our model is relatively good at detecting the difference, this skewness leads it to always
// say the message is not spam. We deal with this by lowering the threshold of the predictor. In reality,
// it is useful to look at the precision-recall curve to identify the best possible threshold.
var inPipe = new TransformerChain<ITransformer>(model.Take(model.Count() - 1).ToArray());
var lastTransformer = new BinaryPredictionTransformer<IPredictorProducing<float>>(
    mlContext,
    model.LastTransformer.Model,
    inPipe.GetOutputSchema(data.Schema),
    model.LastTransformer.FeatureColumn,
    threshold: 0.15f,
    thresholdColumn: DefaultColumnNames.Probability);
```
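To illustrate the precision-recall trade-off that comment mentions, here is a self-contained sketch in plain C# (made-up scored data, no ML.NET dependency) that sweeps a few candidate thresholds:

```csharp
using System;
using System.Linq;

class PrSweep
{
    // (probability, actual label) pairs — hypothetical scored output of a spam model.
    static readonly (double Prob, bool IsSpam)[] Scored =
    {
        (0.02, false), (0.10, true), (0.04, false), (0.60, true), (0.08, true),
        (0.01, false), (0.20, false), (0.30, true), (0.03, false),
    };

    // Precision and recall of the positive (spam) class at a given threshold.
    public static (double Precision, double Recall) At(double threshold)
    {
        int tp = Scored.Count(s => s.Prob >= threshold && s.IsSpam);
        int fp = Scored.Count(s => s.Prob >= threshold && !s.IsSpam);
        int fn = Scored.Count(s => s.Prob < threshold && s.IsSpam);
        double precision = tp + fp == 0 ? 1.0 : (double)tp / (tp + fp);
        double recall = tp + fn == 0 ? 1.0 : (double)tp / (tp + fn);
        return (precision, recall);
    }

    static void Main()
    {
        // Lowering the threshold trades precision for recall on skewed data.
        foreach (var t in new[] { 0.5, 0.15, 0.05 })
        {
            var (p, r) = At(t);
            Console.WriteLine($"threshold={t}: precision={p:F2} recall={r:F2}");
        }
    }
}
```

On this toy data, the default 0.5 threshold catches only one of four spam messages; 0.15 (as in the sample above) doubles recall at some precision cost.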
@eerhardt This is broken as of 0.11.
I think we need to fix it. This is a “take back” in functionality that we said we needed for v1.
I've added these back to Project 13. Let's discuss these in scrum today and update the issues once we have made a group decision.
Discussion from scrum: These will be stretch goals for V1 and will be taken up after the rest of the Project 13 issues are exhausted. The justification is that these are rather advanced options, and they are technically possible to implement without helper APIs (see the workaround proposed here).
Since I'm working on this: I'm making changes that allow the user to change the threshold, but only at prediction time. During metric calculation the threshold remains the same, because we have a Scorer and an Evaluator, and they don't respect each other. Also, the Evaluator lets the user specify things like:

and we expose nothing in our Evaluate method.
I think it makes sense to keep them different. Setting the threshold on the predictor actually changes the pipeline, whereas changing the evaluator lets you ask "what if" questions. Plus, the Evaluator needs to threshold for AUC and AUPRC anyway, so we (probably) won't be able to toss that code. I don't hold this opinion very strongly, though.
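To make the Scorer/Evaluator mismatch concrete, here is a toy sketch in plain C# (made-up probabilities and labels, no ML.NET types): a predictor scoring at a lowered threshold and an evaluator that silently re-thresholds at the default 0.5 report different accuracies for the same model output:

```csharp
using System;
using System.Linq;

class ScorerVsEvaluator
{
    // Accuracy of thresholded predictions against the true labels.
    public static double Accuracy(double[] probs, bool[] labels, double threshold) =>
        probs.Zip(labels, (p, l) => (p >= threshold) == l ? 1.0 : 0.0).Average();

    static void Main()
    {
        double[] probs = { 0.12, 0.60, 0.04 }; // calibrated probabilities (made up)
        bool[] labels = { true, true, false };

        // The predictor's lowered threshold classifies the 0.12 message as positive...
        Console.WriteLine($"scorer    @ 0.10: accuracy = {Accuracy(probs, labels, 0.10):F2}");
        // ...but an evaluator that re-thresholds at 0.5 disagrees with the pipeline.
        Console.WriteLine($"evaluator @ 0.50: accuracy = {Accuracy(probs, labels, 0.50):F2}");
    }
}
```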
Technically, this issue started by stating that we should be able to set the Threshold. Note that in practice, we have a workaround (discussed above).
It is no longer possible to set a custom `Threshold` and `ThresholdColumn` on a binary classifier. Previously, we had been using `BinaryPredictionTransformer`. Recently, `BinaryPredictionTransformer` was marked as `internal` and is no longer available for usage outside of the library.

Related to question #403