Make Multiclass Linear Trainers Typed Based on Output Model Types. #2976

Merged
merged 13 commits into from
Mar 20, 2019

Member

@wschin wschin commented Mar 15, 2019

Multiclass SDCA can train a multi-class SVM, but it always outputs a multi-class logistic regression model. This is incorrect because softmax should not be applied to the scores of an SVM model.
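
To make the problem concrete, here is a minimal sketch in plain C# (no ML.NET types; the class name `SoftmaxDemo`, its helpers, and the score values are made up for illustration). It applies softmax to hypothetical SVM margins: the predicted class stays the same, but the outputs now look like probabilities even though hinge-loss training never fit them as such.

```csharp
using System;
using System.Linq;

public static class SoftmaxDemo
{
    // Numerically stable softmax: subtract the max score before exponentiating.
    public static double[] Softmax(double[] scores)
    {
        double max = scores.Max();
        double[] exps = scores.Select(s => Math.Exp(s - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    // Index of the largest value, i.e. the predicted class.
    public static int ArgMax(double[] values)
    {
        int best = 0;
        for (int i = 1; i < values.Length; i++)
            if (values[i] > values[best])
                best = i;
        return best;
    }

    public static void Main()
    {
        // Hypothetical per-class margins from a multi-class SVM (assumed values).
        double[] svmScores = { 1.2, -0.3, 0.4 };
        double[] squashed = Softmax(svmScores);

        Console.WriteLine("raw scores: " + string.Join(", ", svmScores));
        Console.WriteLine("after softmax: " + string.Join(", ", squashed.Select(p => p.ToString("F3"))));
        Console.WriteLine("same predicted class: " + (ArgMax(svmScores) == ArgMax(squashed)));
    }
}
```

Because softmax is monotonic, ranking-based metrics are unaffected; what changes is the meaning a caller attaches to the output vector, which is why an uncalibrated model type is the safer result for the SVM case.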

To fix #1100, we have the following work items.

  • Clean code.
  • Create two multi-class linear models.
  • Make multiclass SDCA trainers typed.
  • Give the two new model classes better names.

Framework changes:

  • LogisticRegressionMulticlassClassificationTrainer (renamed to) ---> LbfgsMaximumEntropyTrainer
  • MulticlassLogisticRegressionModelParameters (refactored into) ---> MaximumEntropyModelParameters (for multi-class LR) and LinearMulticlassModelParameters (for uncalibrated cases). Both new classes derive from LinearMulticlassModelParametersBase.
  • SdcaMulticlassClassificationTrainer (refactored into) ---> SdcaMulticlassClassificationTrainer and SdcaNonCalibratedMulticlassClassificationTrainer. Both new classes derive from SdcaMulticlassClassificationTrainerBase.
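
The idea of a trainer that is typed by the model it produces can be sketched with a generic base class. This is only an illustration: `LinearModelBase`, `MaxEntModel`, `UncalibratedModel`, `SdcaTrainerBase<TModel>`, and the two derived trainers are hypothetical stand-ins, not the actual ML.NET classes, and training itself is elided.

```csharp
using System;

// Hypothetical stand-in for a shared linear model base class.
public abstract class LinearModelBase
{
    public float[] Biases { get; }
    protected LinearModelBase(float[] biases) => Biases = biases;
}

// Hypothetical calibrated model (softmax applied at scoring time).
public sealed class MaxEntModel : LinearModelBase
{
    public MaxEntModel(float[] biases) : base(biases) { }
}

// Hypothetical uncalibrated model (raw scores only).
public sealed class UncalibratedModel : LinearModelBase
{
    public UncalibratedModel(float[] biases) : base(biases) { }
}

// The base trainer is generic in the model type it produces.
public abstract class SdcaTrainerBase<TModel> where TModel : LinearModelBase
{
    public TModel Fit()
    {
        // Training is elided; pretend these biases were learned.
        float[] biases = { 0.1f, -0.2f, 0.3f };
        return CreateModel(biases);
    }

    protected abstract TModel CreateModel(float[] biases);
}

public sealed class CalibratedSdcaTrainer : SdcaTrainerBase<MaxEntModel>
{
    protected override MaxEntModel CreateModel(float[] biases) => new MaxEntModel(biases);
}

public sealed class NonCalibratedSdcaTrainer : SdcaTrainerBase<UncalibratedModel>
{
    protected override UncalibratedModel CreateModel(float[] biases) => new UncalibratedModel(biases);
}

public static class TypedTrainerDemo
{
    public static void Main()
    {
        // The static type of each result already says which model was trained; no cast is needed.
        MaxEntModel calibrated = new CalibratedSdcaTrainer().Fit();
        UncalibratedModel raw = new NonCalibratedSdcaTrainer().Fit();
        Console.WriteLine(calibrated.GetType().Name);
        Console.WriteLine(raw.GetType().Name);
    }
}
```

With this shape, a caller who trains the uncalibrated variant can never receive a model that applies softmax, because the compiler enforces the pairing of trainer and model type.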

API changes:

  • (rename, static API) MulticlassLogisticRegression ---> LbfgsMaximumEntropy
  • (rename, dynamic API) LogisticRegression ---> LbfgsMaximumEntropy
  • (add, static API) SdcaNonCalibrated for multi-class linear models without calibration.
  • (add, dynamic API) SdcaNonCalibrated for multi-class linear models without calibration.

@wschin wschin self-assigned this Mar 15, 2019

codecov bot commented Mar 15, 2019

Codecov Report

Merging #2976 into master will increase coverage by 0.02%.
The diff coverage is 77.5%.

@@            Coverage Diff             @@
##           master    #2976      +/-   ##
==========================================
+ Coverage   72.38%    72.4%   +0.02%     
==========================================
  Files         803      803              
  Lines      143569   143851     +282     
  Branches    16162    16173      +11     
==========================================
+ Hits       103924   104160     +236     
- Misses      35227    35267      +40     
- Partials     4418     4424       +6
Flag          Coverage Δ
#Debug        72.4% <77.5%> (+0.02%) ⬆️
#production   68.08% <69.56%> (-0.01%) ⬇️
#test         88.61% <99.09%> (+0.06%) ⬆️

Impacted Files                                           Coverage Δ
test/Microsoft.ML.Functional.Tests/Training.cs           100% <100%> (ø) ⬆️
...est/Microsoft.ML.Predictor.Tests/TestPredictors.cs    63.8% <100%> (ø) ⬆️
...lticlass/MulticlassDataPartitionEnsembleTrainer.cs    97.72% <100%> (ø) ⬆️
test/Microsoft.ML.Tests/OnnxConversionTest.cs            97.22% <100%> (ø) ⬆️
...soft.ML.Tests/PermutationFeatureImportanceTests.cs    100% <100%> (ø) ⬆️
...ML.Tests/Scenarios/IrisPlantClassificationTests.cs    100% <100%> (ø) ⬆️
...ios/IrisPlantClassificationWithStringLabelTests.cs    98.63% <100%> (ø) ⬆️
.../Microsoft.ML.Tests/TrainerEstimators/SdcaTests.cs    100% <100%> (ø) ⬆️
...s/Api/CookbookSamples/CookbookSamplesDynamicApi.cs    93.53% <100%> (ø) ⬆️
...Microsoft.ML.Tests/TrainerEstimators/LbfgsTests.cs    98.02% <100%> (ø) ⬆️
... and 29 more

@wschin wschin changed the title [WIP] Make SDCA Trainers Typed Based on Output Model Types. Make SDCA Trainers Typed Based on Output Model Types. Mar 15, 2019
@wschin wschin changed the title Make SDCA Trainers Typed Based on Output Model Types. Make Multiclass Linear Trainers Typed Based on Output Model Types. Mar 15, 2019
@shauheen shauheen added this to the 0319 milestone Mar 18, 2019
// verWrittenCur: 0x00010001, // Initial
// verWrittenCur: 0x00010002, // Added class names
verWrittenCur: 0x00010003, // Added model stats
verReadableCur: 0x00010001,
Contributor

@TomFinley TomFinley Mar 18, 2019

As far as I know this is a new model type. Why is this not the initial version? #Resolved

Member Author

Thanks for pointing this out. Just fixed it.

@@ -171,7 +171,7 @@ private ITransformer TrainOnIris(string irisDataPath)
var trainedModel = pipeline.Fit(trainData);

// Inspect the model parameters.
- var modelParameters = trainedModel.LastTransformer.Model as MulticlassLogisticRegressionModelParameters;
+ var modelParameters = trainedModel.LastTransformer.Model as MaximumEntropyModelParameters;
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 18, 2019

MaximumEntropyModelParameters

obligatory "Update md file" comment #Closed

Member Author

Will do now! Thanks.

/// <example>
/// <format type="text/markdown">
/// <![CDATA[
/// [!code-csharp[SDCA](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/StochasticDualCoordinateAscent.cs)]
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 18, 2019

StochasticDualCoordinateAscent

Should you add a new sample file for the non-calibrated version? #Pending

Member Author

I don't want to touch too many of those APIs, as they are being renamed and moved.

Contributor

Then don't add the example? I don't know how our documentation people track their work, but this makes it look like everything needed is already there when in reality it is not.

Member Author

@wschin wschin Mar 19, 2019

This PR fixes something we have been doing wrong for years (as you can see, it takes many changes to fix). I personally consider the example much less important. In addition, we are currently postponing all documentation tasks.

{
internal const string Summary = "Logistic Regression is a method in statistics used to predict the probability of occurrence of an event and can be used as a classification algorithm.The algorithm predicts the probability of occurrence of an event by fitting data to a logistical function.";
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 18, 2019

space #Closed

Member Author

Ok.

/// </summary>
public IEnumerable<float> GetBiases()
{
return Biases;
}

internal IEnumerable<float> DenseWeightsEnumerable()
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 18, 2019

Not exactly related to your PR, but I find it strange that we expose weights and bias(es) differently for binary and multiclass models.
#ByDesign

Member Author

The reason is that those parameters have one extra dimension. In binary, the bias is a scalar, but in multiclass, the bias is a vector with one element per class.
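
The extra dimension can be illustrated with a small sketch in plain C#. The class `LinearScoringDemo` and its method names are hypothetical and do not mirror the ML.NET model classes; the point is only that a binary linear model carries one weight vector and a scalar bias, while a multiclass linear model carries one weight vector and one bias per class.

```csharp
using System;

public static class LinearScoringDemo
{
    // Binary: one weight vector, a scalar bias, and a scalar score.
    public static float ScoreBinary(float[] weights, float bias, float[] features)
    {
        float score = bias;
        for (int j = 0; j < features.Length; j++)
            score += weights[j] * features[j];
        return score;
    }

    // Multiclass: one weight vector and one bias per class, giving one score per class.
    public static float[] ScoreMulticlass(float[][] weights, float[] biases, float[] features)
    {
        if (weights.Length != biases.Length)
            throw new ArgumentException("Need exactly one bias per class.");

        var scores = new float[biases.Length];
        for (int c = 0; c < biases.Length; c++)
            scores[c] = ScoreBinary(weights[c], biases[c], features);
        return scores;
    }

    public static void Main()
    {
        float[] x = { 1f, 2f };

        // Binary case: scalar bias.
        Console.WriteLine(ScoreBinary(new[] { 0.5f, -1f }, 0.25f, x));

        // Three-class case: the bias is a vector with three elements.
        float[][] w = { new[] { 0.5f, -1f }, new[] { 1f, 1f }, new[] { 0f, 2f } };
        float[] b = { 0.25f, 0f, -1f };
        Console.WriteLine(string.Join(", ", ScoreMulticlass(w, b, x)));
    }
}
```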



Contributor

Which is weird.
NumClasses should be a property of the model, bias and weights should be IReadonly etc., but all of this is out of scope for this PR :)

private protected override void Calibrate(Span<float> dst)
{
Host.Assert(dst.Length >= NumberOfClasses);

Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 18, 2019

Why is dst.Length > NumberOfClasses OK? #Closed

Member Author

Fixed.

wschin added 6 commits March 18, 2019 16:16
Step 2: Make SDCA trainers typed

Finish version 0.1

Delete commented lines
More document
Fix two tests and address a comment

Add missing piece

namespace Microsoft.ML.Trainers
{
/// <include file = 'doc.xml' path='doc/members/member[@name="LBFGS"]/*' />
/// <include file = 'doc.xml' path='docs/members/example[@name="LogisticRegressionClassifier"]/*' />
- public sealed class LogisticRegressionMulticlassClassificationTrainer : LbfgsTrainerBase<LogisticRegressionMulticlassClassificationTrainer.Options,
- MulticlassPredictionTransformer<MulticlassLogisticRegressionModelParameters>, MulticlassLogisticRegressionModelParameters>
+ public sealed class LbfgsMaximumEntropyTrainer : LbfgsTrainerBase<LbfgsMaximumEntropyTrainer.Options,
Member

@abgoswam abgoswam Mar 18, 2019

Can we have a prefix match between the Trainer and ModelParameter classes? Perhaps call the trainer MaximumEntropyTrainer? #Closed

Member Author

This is not the only trainer that will produce MaximumEntropyModelParameters. Please consider this trainer and the associated model separately. Otherwise, every trainer that produces a linear model would have to be called LinearModelTrainer.

/// <param name="optimizationTolerance">Threshold for optimizer convergence.</param>
- public static LogisticRegressionMulticlassClassificationTrainer LogisticRegression(this MulticlassClassificationCatalog.MulticlassClassificationTrainers catalog,
+ public static LbfgsMaximumEntropyTrainer LbfgsMaximumEntropy(this MulticlassClassificationCatalog.MulticlassClassificationTrainers catalog,
Member

@abgoswam abgoswam Mar 18, 2019

It's great that we are following the convention of a prefix match with the name of the Trainer. #Resolved

/// <include file='doc.xml' path='doc/members/member[@name="SDCA_remarks"]/*' />
public sealed class SdcaNonCalibratedMulticlassClassificationTrainer : SdcaMulticlassClassificationTrainerBase<LinearMulticlassModelParameters>
{
public class Options : CommonOptions
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 19, 2019

public class Options

sealed? #Closed

Member Author

Sure. Thanks.

/// <include file='doc.xml' path='doc/members/member[@name="SDCA_remarks"]/*' />
public sealed class SdcaMulticlassClassificationTrainer : SdcaMulticlassClassificationTrainerBase<MaximumEntropyModelParameters>
{
public class Options : CommonOptions
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 19, 2019

public class Options

sealed? #Closed

Member Author

Sure. Thanks.

() => TrainerEntryPointsUtils.FindColumn(host, input.TrainingData.Schema, input.LabelColumnName),
() => TrainerEntryPointsUtils.FindColumn(host, input.TrainingData.Schema, input.ExampleWeightColumnName));
/// <summary>
/// Output the text model to a given writer
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 19, 2019

Output the text model to a given writer

nit: add a period at the end of the sentence for all of these comments in the file. #Resolved

Member Author

Thanks. I just checked both this file and SdcaMulticlass.cs.


// Step 4: Make prediction and evaluate its quality (on training set).
var prediction = model.Transform(data);
var metrics = mlContext.MulticlassClassification.Evaluate(prediction, label: "LabelIndex", topK: 1);
Contributor

@Ivanidzo4ka Ivanidzo4ka Mar 19, 2019

label

I think @artidoro renamed them to labelColumnName in his latest PR. #Closed

Member Author

Merged with master. Thanks.

/// [!code-csharp[SDCA](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/StochasticDualCoordinateAscentWithOptions.cs)]
/// ]]></format>
/// </example>
public static SdcaCalibratedMulticlassTrainer SdcaCalibrated(this MulticlassClassificationCatalog.MulticlassClassificationTrainers catalog,
Contributor

@artidoro artidoro Mar 19, 2019

SdcaCalibratedMulticlassTrainer

nit: if you are doing another iteration, it would be great to keep the two extensions for SdcaCalibratedMulticlassTrainer one after the other. #Resolved

Member Author

Will fix it in the next iteration.

Contributor

artidoro commented Mar 19, 2019

    public static LogisticRegressionBinaryTrainer LogisticRegression(this BinaryClassificationCatalog.BinaryClassificationTrainers catalog,

Can we drop the Binary from LogisticRegressionBinaryTrainer? #WontFix


Refers to: src/Microsoft.ML.StandardTrainers/StandardTrainersCatalog.cs:519 in 63b46cb.

@@ -537,7 +587,7 @@ public static PoissonRegressionTrainer PoissonRegression(this RegressionCatalog.
}

/// <summary>
- /// Predict a target using a linear multiclass classification model trained with the <see cref="LogisticRegressionMulticlassClassificationTrainer"/> trainer.
+ /// Predict a target using a linear multiclass classification model trained with the <see cref="LbfgsMaximumEntropyTrainer"/> trainer.
/// </summary>
Contributor

@artidoro artidoro Mar 19, 2019

Do you think it's worth adding a few words saying that this is the multiclass extension of logistic regression? (here and below) #Resolved

Member Author

It will become

/// Predict a target using a maximum entropy classification model trained with the L-BFGS method implemented in <see cref="LbfgsMaximumEntropyTrainer"/>.


writer.WriteLine("output[{0}] = Math.Exp(scores[{0}] - softmax);", c);
}

private protected override bool SaveAsOnnxCore(OnnxContext ctx, string[] outputs, string featureColumn)
Contributor

@artidoro artidoro Mar 19, 2019

SaveAsOnnxCore

It looks like the following three methods are very similar, if not the same, for LinearMulticlassModelParameters and MaximumEntropyModelParameters.
Can they be refactored into the base class? #Resolved

Member Author

Done. Thanks.

Member Author

wschin commented Mar 19, 2019

    public static LogisticRegressionBinaryTrainer LogisticRegression(this BinaryClassificationCatalog.BinaryClassificationTrainers catalog,

I would need to do that in another PR if you really want it (#3016). It would cause another batch of conflicts and is not related to this issue.


Refers to: src/Microsoft.ML.StandardTrainers/StandardTrainersCatalog.cs:519 in 63b46cb.

@artidoro
Contributor

    public static LogisticRegressionBinaryTrainer LogisticRegression(this BinaryClassificationCatalog.BinaryClassificationTrainers catalog,

Ok, sounds good!


Refers to: src/Microsoft.ML.StandardTrainers/StandardTrainersCatalog.cs:519 in 63b46cb.

Contributor

@artidoro artidoro left a comment

:shipit:

Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

:shipit:

@wschin wschin merged commit 3af9a5d into dotnet:master Mar 20, 2019
@wschin wschin deleted the typed-lr branch March 20, 2019 00:19
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Successfully merging this pull request may close these issues.

Getting SDCA multiclass weights requires using a MulticlassLogisticRegressionPredictor, which is strange
6 participants