
State of CalibratedPredictorBase in v1 #2378

@TomFinley

Description


There have been some issues concerning calibrator estimators (#1871 and #1622) but not calibrators themselves.

So, calibrated models are basically wrappers for models: they pair an underlying predictor with a calibrator that maps its raw score into a probability.
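To make the "wrapper" idea concrete, here is a minimal, hedged sketch of the shape involved. The names (IScoringModel, SigmoidCalibrator, CalibratedModel) are illustrative stand-ins, not ML.NET's actual types; the calibrator shown is a Platt-style sigmoid.

using System;

// Conceptual sketch only: a calibrated model pairs an underlying scoring
// model with a calibrator that maps the raw score into a probability.
public interface IScoringModel { float Score(float[] features); }

public sealed class SigmoidCalibrator
{
    private readonly double _slope, _offset;
    public SigmoidCalibrator(double slope, double offset) { _slope = slope; _offset = offset; }
    public float PredictProbability(float rawScore)
        => (float)(1.0 / (1.0 + Math.Exp(_slope * rawScore + _offset)));
}

public sealed class CalibratedModel
{
    public IScoringModel SubModel { get; }
    public SigmoidCalibrator Calibrator { get; }
    public CalibratedModel(IScoringModel subModel, SigmoidCalibrator calibrator)
    {
        SubModel = subModel;
        Calibrator = calibrator;
    }
    public float PredictProbability(float[] features)
        => Calibrator.PredictProbability(SubModel.Score(features));
}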

They are ultimately something akin to CalibratedPredictorBase. The trouble with CalibratedPredictorBase is its SubPredictor property, which we will get to in a moment.

So, consider this code.

var pipeline = mlContext.Transforms.Concatenate("Features", featureNames)
    .Append(mlContext.Transforms.Normalize("Features"))
    .Append(mlContext.Regression.Trainers.OrdinaryLeastSquares(
        labelColumn: labelName, featureColumn: "Features"));
var model = pipeline.Fit(data);
// Extract the model from the pipeline
var linearPredictor = model.LastTransformer;
var weights = PfiHelper.GetLinearModelWeights(linearPredictor.Model);

Focus on the last part, where we're able to get the feature weights. That works cleanly here because, for this regression trainer, LastTransformer.Model is the linear model itself. With a calibrated model, such as the one a binary classifier produces, LastTransformer.Model only exposes the underlying model through its SubPredictor property.

What is this SubPredictor? It is this:

public IPredictorProducing<float> SubPredictor { get; }

Great news: it has a definite type! Bad news: that is just a marker interface. As a mechanism for the API, it is as useless as if it were just, say, of type object (which, incidentally, I will have to do anyway as part of #2251). For that reason, we see lots of code that looks like this:

var linearModel = transformerChain.LastTransformer.Model.SubPredictor as LinearBinaryModelParameters;

var predictor = calibratedPredictor.SubPredictor as ICanSaveInIniFormat;

The reason is that the object by itself is not useful: to get the actual model parameters, you have to do a "magical cast" to somehow get it into the right type. This sort of worked in command-line land or entry-point land, since everything was more or less dynamically typed anyway.
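For concreteness, here is roughly what the binary-classification analogue of the earlier snippet ends up looking like today. The trainer name and the reuse of the PfiHelper sample helper are assumptions modeled on the OLS example above; the point is the .Model.SubPredictor access followed by the "magical cast."

var pipeline = mlContext.Transforms.Concatenate("Features", featureNames)
    .Append(mlContext.Transforms.Normalize("Features"))
    .Append(mlContext.BinaryClassification.Trainers.LogisticRegression(
        labelColumn: labelName, featureColumn: "Features"));
var model = pipeline.Fit(data);
// LastTransformer.Model is the calibrated predictor, so the linear model is
// only reachable via SubPredictor, and only by casting it to the right type.
var linearModel = model.LastTransformer.Model.SubPredictor as LinearBinaryModelParameters;
var weights = PfiHelper.GetLinearModelWeights(linearModel);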

It seems desirable that, when I train a logistic regression binary classification model, I should be able to inspect the model weights in a type-safe fashion, without having to perform any "magical casts."
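Concretely, the goal is for a call site like the following to compile without any casts. SubModel is a hypothetical strongly typed property standing in for today's SubPredictor:

// Hypothetical desired call site: SubModel is statically typed, no casts.
var calibratedModel = model.LastTransformer.Model;
LinearBinaryModelParameters linearModel = calibratedModel.SubModel;
var weights = PfiHelper.GetLinearModelWeights(linearModel);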

The most obvious solution to me is the following: calibrated models become some sort of class that is generic over both the "sub-predictor" model parameters and the calibrator. Then things like logistic regression would return instances of that generic class (a rough sketch follows), or else some class derived from it, if we decide the generic class must be abstract for some reason.
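A hedged sketch of what that generic shape could look like; the class and member names are placeholders, not a committed design, and base interfaces such as IPredictorProducing<float> are elided.

// Sketch only: a calibrated model generic over both the sub-model parameters
// and the calibrator, so callers get static types for both.
public class CalibratedModelParameters<TSubModel, TCalibrator>
    where TSubModel : class
    where TCalibrator : class
{
    public TSubModel SubModel { get; }
    public TCalibrator Calibrator { get; }

    public CalibratedModelParameters(TSubModel subModel, TCalibrator calibrator)
    {
        SubModel = subModel;
        Calibrator = calibrator;
    }
}

// A calibrated logistic regression model would then surface as something like
// CalibratedModelParameters<LinearBinaryModelParameters, PlattCalibrator>,
// and its weights would be reachable without any cast.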

The alternative is that we accept "magical casts" as the way things work, which I would not like; that seems a little silly, since I view the "desirable" state described above as perfectly reasonable. But some people really hate generics.

I believe @yaeldekel had some thoughts on this, perhaps others do as well.
