Failed in using MultiClassClassification trainers other than StochasticDualCoordinateAscent with error "System.ArgumentOutOfRangeException: 'Schema mismatch for label column '': expected Key<U4>, got R4"  #2656

Closed
@darren-zdc

Issue

I'm trying to use MulticlassClassification trainers other than StochasticDualCoordinateAscent, but none of them work. The only one that succeeds is StochasticDualCoordinateAscent. If I switch to LogisticRegression or NaiveBayes, I always get the error "System.ArgumentOutOfRangeException: 'Schema mismatch for label column '': expected Key<U4>, got R4".

MultiData.cs

public class MultiData
{
    [LoadColumn(0)]
    public string DataValue { get; set; }

    [LoadColumn(1)]
    public float Label { get; set; }
}

MultiDataPrediction.cs

public class MultiDataPrediction
{
    public float[] Score { get; set; }
}

BuildTrainEvaluateAndSaveModel() function

            // STEP 1: Common data loading configuration
            IDataView trainingDataView = mlContext.Data.ReadFromTextFile<MultiData>(TrainMultiDataPath1, hasHeader: false);
            IDataView testDataView = mlContext.Data.ReadFromTextFile<MultiData>(TestMultiDataPath, hasHeader: false);

            // STEP 2: Common data process configuration with pipeline data transformations          
            var dataProcessPipeline = mlContext.Transforms.Text.FeaturizeText(outputColumnName: DefaultColumnNames.Features, inputColumnName: nameof(MultiData.DataValue))
                .Append(mlContext.Transforms.Text.NormalizeText("NormalizedData", nameof(MultiData.DataValue)))
                .Append(mlContext.Transforms.Text.TokenizeCharacters("DataChars", "NormalizedData"))
                .Append(new NgramExtractingEstimator(mlContext, "BagOfTrichar", "DataChars",
                            ngramLength: 3, weighting: NgramExtractingEstimator.WeightingCriteria.TfIdf));

            // (OPTIONAL) Peek data (such as 2 records) in training DataView after applying the ProcessPipeline's transformations into "Features" 
            //ConsoleHelper.PeekDataViewInConsole<MultiData>(mlContext, trainingDataView, dataProcessPipeline, 2);
            //ConsoleHelper.PeekVectorColumnDataInConsole(mlContext, DefaultColumnNames.Features, trainingDataView, dataProcessPipeline, 1);

            // STEP 3: Set the training algorithm, then create and config the modelBuilder          
            var trainer = mlContext.MulticlassClassification.Trainers.NaiveBayes(labelColumn: nameof(MultiData.Label), featureColumn: DefaultColumnNames.Features);
            var trainingPipeline = dataProcessPipeline.Append(trainer);

            // STEP 4: Train the model fitting to the DataSet
            Console.WriteLine("=============== Training the model ===============");
            ITransformer trainedModel = trainingPipeline.Fit(trainingDataView);

Remark:
Changing the type of MultiData.Label to UInt32 does not work either.
It fails with the error "System.ArgumentOutOfRangeException: 'Schema mismatch for label column '': expected Key<U4>, got U4".

Activity

Ivanidzo4ka (Contributor) commented on Feb 20, 2019

Related to #2628.

darren-zdc (Author) commented on Feb 20, 2019

Thanks for your reply!
I solved it by adding this to the data processing pipeline:
.Append(mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: DefaultColumnNames.Label, inputColumnName: nameof(MultiData.Label)));

Maybe this line should be added to all the multiclass classification samples. They all use SDCA, and SDCA actually does the key mapping automatically, so the samples never show that the other trainers need it. That would be a great help for new learners.
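
For anyone landing here who wonders what the key mapping changes about the label column, here is a minimal plain-C# sketch of the idea. It is not ML.NET's implementation and uses no ML.NET types. It assumes that the mapping gives each distinct label value a contiguous index starting at 1, in order of first appearance, and reserves 0 for values not seen during training. `KeyMappingSketch`, `BuildKeyMap`, and `ToKey` are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class KeyMappingSketch
{
    // Hypothetical helper (not an ML.NET API): assigns every distinct label
    // a 1-based index in order of first occurrence.
    public static Dictionary<float, uint> BuildKeyMap(IEnumerable<float> labels)
    {
        var map = new Dictionary<float, uint>();
        foreach (var label in labels)
        {
            if (!map.ContainsKey(label))
            {
                map[label] = (uint)(map.Count + 1);
            }
        }
        return map;
    }

    // Looks up a label; returns 0 for a value that was not in the training labels
    // (assumption: 0 plays the role of "missing").
    public static uint ToKey(Dictionary<float, uint> map, float label)
    {
        return map.TryGetValue(label, out var key) ? key : 0u;
    }

    public static void Main()
    {
        // Raw float labels, like the R4 Label column in MultiData.
        var trainingLabels = new float[] { 3f, 7f, 3f, 10f, 7f };
        var map = BuildKeyMap(trainingLabels);

        foreach (var label in trainingLabels)
        {
            Console.WriteLine($"{label} -> key {ToKey(map, label)}");
        }
        Console.WriteLine($"number of classes: {map.Count}");
    }
}
```

The point of the sketch is that a key column carries a known, finite set of classes, whereas a raw float or UInt32 column does not. That would explain why changing the label type to UInt32 alone still fails with "got U4".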

ghost locked as resolved and limited conversation to collaborators on Mar 24, 2022
Issue #2656 · dotnet/machinelearning