Closed
Description
I am trying to use PermutationFeatureImportance (PFI) with F# but the F# type system is not resolving ITransformer to ISingleFeaturePredictionTransformer - which is required by PFI.
I believe it is due to IPredictorProducing (and related interfaces) being marked as "internal".
F# supports explicit interfaces and maybe that is the reason for this issue.
Here is a snippet of code that shows what I am trying to do
(I am using the latest bits - v 1.2.0 at the time of this post)
let mutable schema = null
let mdl = ctx.Model.Load(@"F:\fwaris\data\t\analysis\model_cv_LightGbmBinary.bin", &schema)
let mdlt = mdl :?> TransformerChain<ITransformer>
let m1 = mdlt.LastTransformer //debugger shows it is Microsoft.ML.Data.BinaryPredictionTransformer<Microsoft.ML.IPredictorProducing<float>>
let scored = mdl.Transform(trainView)
scored.Preview()
ctx.BinaryClassification.PermutationFeatureImportance(m1 :?> _,scored)
Activity
fwaris commented on Jul 12, 2019
The workaround for now is to use a C# helper (given below), but really, if an interface (IPredictorProducing) is going to be exposed via another public interface, it should not be marked internal.
eerhardt commented on Aug 2, 2019
@fwaris - I just ran into this issue as well. I don't understand how your workaround works. What T is getting passed into MLHelper<T>?
@codemzs - this is the same issue as we were discussing today. I don't think it is possible to use PermutationFeatureImportance once a model is saved to disk.
This is an issue because if you use AutoML, it always saves the model to disk in order to save on memory.
The problem is this code:
machinelearning/src/Microsoft.ML.Data/Scorers/PredictionTransformer.cs
Lines 595 to 601 in bb00e07
Whenever you load a prediction transformer from a model stream, it always creates an instance of a new BinaryPredictionTransformer<IPredictorProducing<float>>. This object cannot be cast to an ISingleFeaturePredictionTransformer<TModel> that is necessary for calling PermutationFeatureImportance, because the T in this case (IPredictorProducing<float>) is internal.
We need to change the above code to save off the right type into the model, and create an instance of BinaryPredictionTransformer<TModel>, where TModel is the type that was originally used when training the pipeline before saving to disk - for example, BinaryPredictionTransformer<CalibratedModelParametersBase<LightGbmBinaryModelParameters, PlattCalibrator>> when using LightGbm.
/cc @Dmitry-A @justinormont
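The cast failure described above can be illustrated with a minimal F# sketch that uses only standard .NET collection types as stand-ins. This is an analogy, not ML.NET code: List<obj> plays the role of a transformer constructed with a base type argument (as the loader does with IPredictorProducing<float>), and List<string> plays the role of one constructed with the concrete model type. How closely ML.NET's own interfaces follow this behavior is an assumption here.

```fsharp
open System.Collections.Generic

// Stand-in for a loaded transformer: built with a *base* type argument.
let loaded : obj = box (List<obj>())

// Stand-in for a freshly trained transformer: built with the concrete type argument.
let trained : obj = box (List<string>())

// An object built with the base argument cannot be viewed through an
// interface that names the more specific argument.
let loadedAsString = loaded :? IEnumerable<string>

// An object built with the specific argument can be viewed through the
// interface at either the specific or the base argument.
let trainedAsString = trained :? IEnumerable<string>
let trainedAsObj = trained :? IEnumerable<obj>

printfn "loaded  :? IEnumerable<string> = %b" loadedAsString
printfn "trained :? IEnumerable<string> = %b" trainedAsString
printfn "trained :? IEnumerable<obj>    = %b" trainedAsObj
```

Only the first test evaluates to false. That mirrors the proposal above: if the loader constructs the transformer with the original TModel rather than a base interface, the cast that PermutationFeatureImportance needs has a chance to succeed.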
[-]IPredictorProducing 'internal' is causing issues with F# type resolution[/-][+]It is not possible to use PermutationFeatureImportance from a model loaded from disk[/+]
fwaris commented on Aug 3, 2019
@eerhardt, it seems you can punt on the type resolution in F# by using an underscore; i.e. the following trick seems to work (I tested again just to make sure):
The 'model' variable is of the concrete type (from debugger):
Microsoft.ML.Data.BinaryPredictionTransformer<Microsoft.ML.IPredictorProducing<float>>
However I agree with you that this area requires re-work to make it easier to use.
Addresses #3976 about using PFI with a model loaded from disk (#4262)
antoniovs1029 commented on Dec 20, 2019
Hi. So PRs #4262 and #4306 fixed the problem Eric pointed out in his comment in this thread.
So please, let us know if this has been fixed for you. In particular, those PRs were only tested for ML.NET on C#, so I would appreciate feedback from the F# side. I will rename and tag this issue as F# specific then, since that was your original problem.
[-]It is not possible to use PermutationFeatureImportance from a model loaded from disk[/-][+]It is not possible to use PermutationFeatureImportance from a model loaded from disk in F#[/+]
artemiusgreat commented on Dec 26, 2019
This is not fixed yet.
There are 2 ways to save the model.
1. As a pipeline + estimator
https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/save-load-machine-learning-models-ml-net
2. As an estimator, without pipeline
https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/explain-machine-learning-model-permutation-feature-importance-ml-net#train-the-model
Then, when loading from disk:
1. As a pipeline + estimator - the model contains only the pipeline transformers, including MapValueToKey and Concatenate. There is no way to get the actual trainer / estimator and use it for PFI. The LastTransformer property returns the Concatenate transformer, but PFI requires an estimator, e.g. LightGbm or Regression.
2. As an estimator without pipeline - now I see the LightGbm trainer in the list of TransformationChain, but CreatePredictionEngine raises an exception saying the "Features" column is not defined, because in this case the model was saved as a pure estimator, without the pipeline.
artemiusgreat commented on Dec 27, 2019
The only thing I needed was to run PFI using a model loaded from a file. As long as it works, I'm happy.
antoniovs1029 commented on Dec 31, 2019
Hi, @artemiusgreat. So I am not sure: is your problem solved or not?
I believe it should be possible to access the lastTransformer directly from the model you saved to disk on the "1. As a pipeline + estimator" point by simply using:
I am not sure why you would need to use the .SelectMany(...) method you mentioned.
PFI doesn't require an estimator, but a Prediction Transformer. So, in your example, the LightGbm trainer is also an estimator, and once it is trained (with .Fit()) it returns a Prediction Transformer of type MulticlassPredictionTransformer<OneVersusAllModelParameters>. You should pass this last transformer to PFI, and not the trainer or estimator:
pfi = ML.MulticlassClassification.PermutationFeatureImportance(predictor, data);
If you are still facing problems, please share with us the complete code and dataset you're using, so that I can take a closer look. Thanks.
artemiusgreat commented on Mar 13, 2020
@antoniovs1029 Sorry, missed your comment. Yes, it was fixed. Thanks.
antoniovs1029 commented on Jun 4, 2020
So I've just tested the original scenario of this issue in F#, and now it works, so it was indeed fixed by PRs #4262 and #4306.
fwaris commented on Jun 21, 2020
Also, confirming that it works.
See this issue comment for some tricks that help when working with AutoML outputs
dotnet/docs#19006 (comment)
Note: The fix works in a compiled F# project but not in F# interactive (fsi) because the current fsi is bound to older libraries. I expect that it will work in the new preview version of fsi but I have not tested that yet.