Exception when trying to Evaluate AveragedPerceptronTrainer, LinearSvm #1579

Closed
abgoswam opened this issue Nov 8, 2018 · 5 comments

@abgoswam
Member

abgoswam commented Nov 8, 2018

For a couple of learners, we get an exception during Evaluate:

  • AveragedPerceptronTrainer
  • LinearSvm

Exception:

Message: System.ArgumentOutOfRangeException : Probability column 'Probability' not found
Parameter name: name

Sample:

    [Fact]
    public void OVA_BC_AP()
    {
        string dataPath = GetDataPath("breast-cancer.txt");

        // Create a new context for ML.NET operations. It can be used for exception tracking and logging, 
        // as a catalog of available operations and as the source of randomness.
        var mlContext = new MLContext(seed: 1);
        var reader = new TextLoader(mlContext, new TextLoader.Arguments()
        {
            Column = new[]
                    {
                        new TextLoader.Column("Label", DataKind.R4, 0),
                        new TextLoader.Column("Features", DataKind.R4, new [] { new TextLoader.Range(1, 9) }),
                    }
        });

        // Data
        var data = reader.Read(dataPath);

        // Pipeline
        var pipeline = new AveragedPerceptronTrainer(mlContext, "Label", "Features");

        var model = pipeline.Fit(data);
        var predictions = model.Transform(data);

        // Metrics
        var metrics = mlContext.BinaryClassification.Evaluate(predictions);
    }
@yaeldekel

Not sure this is a bug - AveragedPerceptron does not produce calibrated models. If we don't currently expose calibration APIs, we should probably add them. Also, we may want to consider warning when the probability column isn't there instead of throwing.

@Zruty0
Contributor

Zruty0 commented Nov 8, 2018

...or call mlContext.BinaryClassification.EvaluateNonCalibrated.

Exposing the calibration API is fine. I would just make a calibration estimator for that, though, one that trains towards one parameter. Embedding a calibrator into the learner seems somewhat unnecessary.

@abgoswam
Member Author

abgoswam commented Nov 8, 2018

Thanks for the comments. Using mlContext.BinaryClassification.EvaluateNonCalibrated got me the metrics I was looking for.
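
For reference, the only change to the repro above was the evaluation call (a sketch against the same pipeline and column names as in the original test):

    // Same pipeline as in the repro above; only the evaluation call changes.
    var model = pipeline.Fit(data);
    var predictions = model.Transform(data);

    // EvaluateNonCalibrated does not require a 'Probability' column, so it
    // works for uncalibrated learners such as AveragedPerceptron and LinearSvm.
    var metrics = mlContext.BinaryClassification.EvaluateNonCalibrated(predictions);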

I have a few follow-up questions, based on the comments above:

  • What do we mean by "make a calibration estimator that trains towards one parameter"?

  • My understanding is that currently some learners have a calibrator embedded (e.g. FastTree) while other learners do not (e.g. AveragedPerceptron). Is that by design?

  • From a user perspective, is there a way to know whether I should use Evaluate or EvaluateNonCalibrated? I feel this may confuse users of ML.NET.

@Zruty0 @yaeldekel

@Zruty0
Contributor

Zruty0 commented Nov 8, 2018

What do we mean by "make a calibration estimator that trains towards one parameter"?

I mean that a calibrator is just one peculiar form of trainer: it learns a monotonic function that transforms 'scores' into 'probabilities', with the goal of minimizing the log-loss against the 'target label'. So it is actually a univariate classification trainer. We should create a PlattCalibrationEstimator to train Platt calibrators and a PavCalibrationEstimator to train PAV calibrators.
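
To make that concrete, a Platt calibrator has roughly the shape below (illustration only, not the ML.NET API; 'slope' and 'offset' stand in for the learned parameters):

    // Illustration only: a Platt calibrator is a monotonic sigmoid over the raw
    // score, whose two parameters are learned by minimizing log-loss against the
    // label (typically slope < 0, so larger scores map to larger probabilities).
    static double PlattProbability(float score, double slope, double offset)
        => 1.0 / (1.0 + Math.Exp(slope * score + offset));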

My understanding is that currently some learners have a calibrator embedded (e.g. FastTree) while other learners do not (e.g. AveragedPerceptron). Is that by design?

Some learners, under some conditions, are essentially learning a calibrated model. For example, the FastTree classifier already minimizes the log-loss of Sigmoid(Score) against the target label. So, if we just take a sigmoid of the score, we already have a calibrated output (even though it's calibrated against the training set). For such learners, we produce models that are 'self-calibrated'. For other learners that don't have this property, we don't.

From a user perspective, is there a way to know whether I should use Evaluate or EvaluateNonCalibrated? I feel this may confuse users of ML.NET.

You can inspect the schema and see if there is a Probability column. If there is, you can use Evaluate; if there isn't, you can only use EvaluateNonCalibrated.
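
A rough sketch of that check (assuming the IDataView schema API of that release, where TryGetColumnIndex looks up a column by name):

    // Sketch only: pick the evaluator based on whether the scored data
    // actually carries a 'Probability' column.
    if (predictions.Schema.TryGetColumnIndex("Probability", out int probCol))
    {
        var calibratedMetrics = mlContext.BinaryClassification.Evaluate(predictions);
    }
    else
    {
        var nonCalibratedMetrics = mlContext.BinaryClassification.EvaluateNonCalibrated(predictions);
    }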

@abgoswam
Member Author

Closing this issue, since we verified that mlContext.BinaryClassification.EvaluateNonCalibrated gave us the desired metrics for AveragedPerceptron.

Created a separate issue #1622 for adding calibration estimators to ML.NET.

@ghost locked as resolved and limited conversation to collaborators Mar 26, 2022