R4 label works with some but not all binary classification trainers #2750

@daholste

This code:

using System;
using Microsoft.ML.Data;

namespace Microsoft.ML.Samples
{
    internal static class Program
    {
        static void Main(string[] args)
        {
            var context = new MLContext();
            var options = new TextLoader.Options()
            {
                Columns = new TextLoader.Column[]
                {
                    // The label is loaded as Single (R4) rather than Boolean.
                    new TextLoader.Column("Label", DataKind.Single, 0),
                    new TextLoader.Column("Sentiment", DataKind.String, 1)
                },
                HasHeader = true
            };
            var loader = context.Data.CreateTextLoader(options);
            var data = loader.Read(@"C:\AutoMLDotNet\src\Samples\Data\wikipedia-detox-250-line-data.tsv");
            var estimator = context.Transforms.Text.FeaturizeText("Features", "Sentiment")
                .Append(context.BinaryClassification.Trainers.AveragedPerceptron());
            var transformer = estimator.Fit(data);
            var scoredData = transformer.Transform(data);
            var metrics = context.BinaryClassification.EvaluateNonCalibrated(scoredData);
            Console.WriteLine(metrics.Accuracy);
            Console.WriteLine("Press any key...");
            Console.ReadLine();
        }
    }
}

runs without errors and prints the accuracy.

If you replace AveragedPerceptron with LogisticRegression, it throws the exception:

 'Schema mismatch for label column: expected Bool, got R4'

This suggests that label schema validation is inconsistent across binary classification trainers: some accept an R4 label, while others require Bool.

Labels: bug