IRISClassification sample - MultiLabel classification: Getting exception while referring to slot names #2810

Closed
@prathyusha12345

@Ivanidzo4ka
I am trying to do multilabel classification with the IRISClassification sample, following this test: https://github.com/dotnet/machinelearning/blob/master/test/Microsoft.ML.Tests/Scenarios/Api/Estimators/PredictAndMetadata.cs#L41

When I run the code, I get the exception 'Invalid call to 'GetGetter'' while accessing the slot names.

Ivanidzo4ka (Contributor) commented on Mar 1, 2019

Thank you for reporting this.
I'm working on this issue right now.
The problem is that we have an internal convention to treat all SlotNames as text (they are called Names for a reason), but deep inside, if the key type originates from something other than a string, we don't do the proper cast.
@TomFinley Am I right that, at this point, the only way to get the original label values is to access the KeyValue annotations on PredictedLabel?

self-assigned this on Mar 1, 2019
Ivanidzo4ka (Contributor) commented on Mar 1, 2019

To unblock yourself, you can change the type of slotNames from
VBuffer<ReadOnlyMemory<char>> to
VBuffer<float>, and I would assume that gives you the original keys. I will change that behavior in the next release.

prathyusha12345 (Author) commented on Mar 1, 2019

@Ivanidzo4ka After switching to VBuffer<float>, I get a compile-time error.

[screenshot of the compile-time error]

Ivanidzo4ka (Contributor) commented on Mar 1, 2019

Oh, right, this is one of our assumptions: that slot names should be strings.
This one should do the trick:
predEngine.OutputSchema[""].Annotations.GetValue(AnnotationUtils.Kinds.SlotNames, ref slotNames);

prathyusha12345 (Author) commented on Mar 1, 2019

@Ivanidzo4ka It reports that AnnotationUtils is inaccessible due to its protection level.

Ivanidzo4ka (Contributor) commented on Mar 1, 2019

We did a good job of hiding our internals.

VBuffer<float> keys = default;
engine.OutputSchema[nameof(IrisPrediction.PredictedLabel)].GetKeyValues(ref keys);

this one?

prathyusha12345 (Author) commented on Mar 1, 2019

As discussed, I am now getting a new exception.

[screenshot of the exception]

The IrisPrediction class is as follows:

public class IrisPrediction
{
    [ColumnName("label")]
    public float Label;

    public float[] Score;
}

And I apply the MapKeyToValue transformation in the training pipeline as follows:

var trainer = mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent(labelColumnName: DefaultColumnNames.Label, featureColumnName: DefaultColumnNames.Features);
var trainingPipeline = dataProcessPipeline.Append(trainer)
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("label", "PredictedLabel"));

Ivanidzo4ka (Contributor) commented on Mar 1, 2019

using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

public class Iris
{
    [LoadColumn(0)]
    public float Label;

    [LoadColumn(1)]
    public float SepalLength;

    [LoadColumn(2)]
    public float SepalWidth;

    [LoadColumn(3)]
    public float PetalLength;

    [LoadColumn(4)]
    public float PetalWidth;
}

public class IrisPredictions
{
    [ColumnName("label")]
    public float Label;

    public float[] Score;
}

void Prediction()
{
    // GetDataPath and TestDatasets come from the ML.NET test infrastructure.
    var dataPath = GetDataPath(TestDatasets.irisLoader.trainFilename);
    var ml = new MLContext();

    var data = ml.Data.LoadFromTextFile<Iris>(dataPath);

    // Concatenate the features, map the float label to a key, train, and map the predicted key back to a value.
    var pipeline = ml.Transforms.Concatenate("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth")
        .Append(ml.Transforms.Conversion.MapValueToKey(nameof(Iris.Label)))
        .Append(ml.MulticlassClassification.Trainers.StochasticDualCoordinateAscent(
            new SdcaMultiClassTrainer.Options { MaxIterations = 100, Shuffle = true, NumThreads = 1, })
            .Append(ml.Transforms.Conversion.MapKeyToValue("label", "PredictedLabel")));

    var model = pipeline.Fit(data);
    var engine = model.CreatePredictionEngine<Iris, IrisPredictions>(ml);

    var testLoader = ml.Data.LoadFromTextFile<Iris>(dataPath);
    var testData = ml.Data.CreateEnumerable<Iris>(testLoader, false);

    // During prediction we will get a Score column with 3 float values.

    // Let's look at how we can convert the key value for PredictedLabel to the original labels.
    // We need to read the KeyValues for the "PredictedLabel" column.
    VBuffer<float> keys = default;
    engine.OutputSchema["PredictedLabel"].GetKeyValues(ref keys);

    var scoreValues = keys.DenseValues().ToArray();

    foreach (var input in testData.Take(20))
    {
        var prediction = engine.Predict(input);
        // scoreValues[i] is the original label that belongs to prediction.Score[i].
        for (int i = 0; i < scoreValues.Length; i++)
            Console.WriteLine($"{scoreValues[i]}: {prediction.Score[i]}");
    }
}

prathyusha12345 (Author) commented on Mar 1, 2019

@Ivanidzo4ka This code works fine. But how do we map each score to its label? For example, the GitHub labeler sample uses 'area' as the label, so the scoreValues array has more than 3 entries. How do we map all of them? Do we need to write that code manually, as I did in the GitHubLabeler sample here, or does ML.NET provide something built in?

After all, the purpose of this classification is to find the scores and map them to the corresponding labels.

Ivanidzo4ka (Contributor) commented on Mar 1, 2019

Console.WriteLine($"Predicted label: {scoreValues[i]}: {prediction.Score[i]}");
I probably should have named scoreValues originalLabels, or something like that.
You have two arrays, one with the score values and one with the original labels. They have the same number of elements and can be mapped to each other by index.
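
The index-based pairing described above can be sketched in plain C# without any ML.NET calls. This is only an illustration: LabelScores and Rank are hypothetical names, not part of ML.NET, and the two arrays in Main are made-up stand-ins for keys.DenseValues().ToArray() and prediction.Score.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LabelScores
{
    // Hypothetical helper (not an ML.NET API): pairs each original label with
    // the score at the same index and returns the pairs sorted by descending score.
    public static List<KeyValuePair<float, float>> Rank(float[] originalLabels, float[] scores)
    {
        if (originalLabels.Length != scores.Length)
            throw new ArgumentException("Labels and scores must have the same length.");

        return originalLabels
            .Zip(scores, (label, score) => new KeyValuePair<float, float>(label, score))
            .OrderByDescending(pair => pair.Value)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up stand-ins for keys.DenseValues().ToArray() and prediction.Score.
        float[] originalLabels = { 0f, 1f, 2f };
        float[] scores = { 0.1f, 0.7f, 0.2f };

        // Key is the original label, Value is its score.
        foreach (var pair in LabelScores.Rank(originalLabels, scores))
            Console.WriteLine($"Label {pair.Key}: {pair.Value}");
    }
}
```

Returning a sorted list of pairs rather than a dictionary keeps the ranking order explicit. If lookup by label matters more, calling ToDictionary(pair => pair.Key, pair => pair.Value) on the result is one option.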

Ivanidzo4ka (Contributor) commented on Mar 1, 2019

Yes, if you can access the slot names. They are broken right now if you have a non-string label.

prathyusha12345 (Author) commented on Mar 1, 2019

@Ivanidzo4ka I get your point: once we have the scores and labels, we need to zip them to map each label to its score. But from the perspective of a machine learning beginner, it is difficult to understand terms like slot names and keys, and to zip labels and scores manually. It is confusing for learners and users.

A better approach would be to return a dictionary of labels and scores, which the user can sort if needed.
