
Use PFI with Binary Prediction Transformer and CalibratedModelParametersBase loaded from disk #4292

Closed
@antoniovs1029

Description


In my last accepted pull request (#4262) I addressed issue #3976 and was able to provide working samples and tests for using PFI with models loaded from disk, except for the case of the binary prediction transformer. I'm opening this issue about that specific problem.

Problem

In the sample using PFI with binary classification, the last transformer of the model (i.e. the linearPredictor) is of type BinaryPredictionTransformer<CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator>>.

The problem is that when saving and then loading that model from disk, a null reference is returned when trying to access the last transformer by casting it to the original type.

```csharp
// linearPredictor is null:
var linearPredictor = (loadedmodel as TransformerChain<ITransformer>).LastTransformer as BinaryPredictionTransformer<CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator>>;
```

A null linearPredictor makes the model unusable with PFI.
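For context, PFI consumes the typed prediction transformer directly, so a null cast result blocks the call entirely. A minimal sketch of the call that fails (assuming mlContext and a transformedData view prepared as in the original sample):

```csharp
// PermutationFeatureImportance requires a non-null prediction transformer;
// with linearPredictor == null this call cannot succeed.
var permutationMetrics = mlContext.BinaryClassification
    .PermutationFeatureImportance(linearPredictor, transformedData, permutationCount: 30);
```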

In version 1.3 of ML.NET, the last transformer of the loaded model would actually be of type BinaryPredictionTransformer<IPredictorProducing<float>>.

With the changes I made in my last PR (which will be available in version 1.4.0 preview 2), the loaded model's last transformer is of type BinaryPredictionTransformer<ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator>>, which is a step toward solving the problem but not yet enough.

As stated, in both cases a cast to the original type returns null. In general, the user would be expected to attempt that cast in order to use PFI, and would fail.

This problem would be solved if the loaded model's LastTransformer actually had the original type, or a type castable to it.

Workaround

Based on this comment made by @yaeldekel, I've made a working sample of using PFI with a binary prediction transformer loaded from disk. It is pretty much the same as the original sample, except that it works with a model loaded from disk.

The key to the workaround is that the user should cast the LastTransformer not to a binary prediction transformer, but rather to an ISingleFeaturePredictionTransformer<object>, and then perform a series of casts to get whatever other objects they may want from inside the LastTransformer.

In the sample I've just provided, it works as follows:

```csharp
var linearPredictor = (loadedmodel as TransformerChain<ITransformer>).LastTransformer as ISingleFeaturePredictionTransformer<object>;
var predictorModel = linearPredictor.Model as CalibratedModelParametersBase;
var predictorSubModel = predictorModel.SubModel as LinearBinaryModelParameters;
```

Notice that this workaround worked even in ML.NET 1.3, and it also works with the changes I introduced in 1.4.0 preview 2.
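Putting the pieces together, here is a hedged end-to-end sketch of the workaround; modelPath and the transformedData preparation are placeholders, and the casts mirror the snippet above:

```csharp
// Load the model back from disk (modelPath is a placeholder).
ITransformer loadedModel = mlContext.Model.Load(modelPath, out var inputSchema);

// Cast to ISingleFeaturePredictionTransformer<object> instead of the original
// BinaryPredictionTransformer<...> type, which would yield null.
var linearPredictor = (loadedModel as TransformerChain<ITransformer>)
    .LastTransformer as ISingleFeaturePredictionTransformer<object>;

// PFI accepts the weakly typed transformer directly.
var permutationMetrics = mlContext.BinaryClassification
    .PermutationFeatureImportance(linearPredictor, transformedData, permutationCount: 30);

// Further casts recover the calibrated model and its submodel if needed.
var predictorModel = linearPredictor.Model as CalibratedModelParametersBase;
var predictorSubModel = predictorModel.SubModel as LinearBinaryModelParameters;
```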

Notice that a similar workaround might help a user who tries to use PFI with any kind of prediction transformer loaded from disk. This would be useful if the user, for whatever reason, cannot extract the linearPredictor by casting to the same type used in the original model.

Cause of the Problem

There are 3 main points related to the cause of this problem, all of which pertain to the Calibrator.cs file and aren't related to the binary prediction transformer itself:

  1. Unexpectedly, when loading a ParameterMixingCalibratedModelParameters<>, its Create method isn't called. I discovered this while debugging: during loading, inside the CreateInstanceCore method, a constructor is looked for first, so the constructor of ParameterMixingCalibratedModelParameters<> is called instead of the Create method.
  2. Currently, when loading a ParameterMixingCalibratedModelParameters<> model, a ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator> is always loaded, no matter what the actual submodel and calibrator are. This doesn't change by fixing point 1. This point is similar to the original problem found in the prediction transformers, which I fixed in my last pull request; a similar approach would fix this point: load the submodel and calibrator first, then create a generic type at runtime with the correct parameter types.
  3. When fitting the model (i.e. before even saving or loading it), the SdcaLogisticRegressionBinaryTrainer creates a predictor of type ParameterMixingCalibratedModelParameters<LinearBinaryModelParameters, PlattCalibrator> (which I will refer to as "PMCMP") but returns it as a CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator> ("CMPB"). This is what makes the last transformer of the model a BinaryPredictionTransformer<CMPB>, whereas the internal model of the last transformer is actually a PMCMP. When saving to disk, it's saved as a PMCMP (i.e. with a LoaderSignature of "PMixCaliPredExec"), so loading calls the constructor of PMCMP but doesn't cast it to a CMPB. This is different from the problem fixed in my last pull request: there, if a regression prediction transformer was saved, we expected to load a regression prediction transformer; here, if a BPT is saved, we actually want to load a PMCMP with the correct type parameters, but create a BPT where the CMPB also has the correct type parameters.
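To illustrate the fix for point 2, the idea (following the approach of my last PR) is to load the submodel and calibrator first and then close the generic type over their runtime types. This is a simplified, hypothetical sketch; LoadSubModel, LoadCalibrator, and the constructor arguments are illustrative, not the real Calibrator.cs internals:

```csharp
using System;
using System.Reflection;

// Hypothetical helpers: read the submodel and calibrator from the model context first.
object subModel = LoadSubModel(ctx);
object calibrator = LoadCalibrator(ctx);

// Close the open generic over the actual runtime types instead of
// hard-coding <IPredictorProducing<float>, ICalibrator>.
var closedType = typeof(ParameterMixingCalibratedModelParameters<,>)
    .MakeGenericType(subModel.GetType(), calibrator.GetType());

// Create a correctly typed instance via its non-public constructor.
var calibratedModel = Activator.CreateInstance(
    closedType,
    BindingFlags.NonPublic | BindingFlags.Instance,
    binder: null,
    args: new object[] { env, subModel, calibrator },
    culture: null);
```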

Trying to solve the problem

So far I've been able to solve points 1 and 2 described above, but after trying out different approaches I haven't been able to solve point 3. To solve those problems I've changed several things in the Calibrator.cs file. My attempt to solve this problem can be found in PR #4306.

With those changes (along with the ones of my last PR), the loaded model's last transformer becomes a BinaryPredictionTransformer<ParameterMixingCalibratedModelParameters<LinearBinaryModelParameters, PlattCalibrator>>. Notice that even here a cast to BPT<CMPB> would return null, so it doesn't solve the problem. Also notice that since PMCMP is an internal class, the user wouldn't be able to cast the last transformer to BPT<PMCMP> either, since they wouldn't have access to that class.

Further problems

Here I've explained the specific case of loading a BPT<CMPB>, with the specific problems that arise in the CMPB and PMCMP classes, because that is what is used in the sample and tests of PFI with BPT. The problems described here might also be present in other classes (for example, the other classes in Calibrator.cs), but they might not surface unless the user tries to access the last transformer of a model loaded from disk. In such a case, the described workaround might help.
