Fixing names of trainer estimators #2903

abgoswam · 2019-03-11T05:16:36Z

codecov · 2019-03-11T06:06:45Z

Codecov Report

Merging #2903 into master will decrease coverage by <.01%.
The diff coverage is 86.71%.

@@            Coverage Diff             @@
##           master    #2903      +/-   ##
==========================================
- Coverage   71.82%   71.82%   -0.01%     
==========================================
  Files         812      812              
  Lines      142719   142719              
  Branches    16092    16092              
==========================================
- Hits       102513   102510       -3     
- Misses      35827    35830       +3     
  Partials     4379     4379

Flag	Coverage Δ
#Debug	`71.82% <86.71%> (-0.01%)`	⬇️
#production	`67.97% <77.77%> (ø)`	⬆️
#test	`86.21% <92.63%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
...ers/Standard/MultiClass/PairwiseCouplingTrainer.cs	`90.07% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/GamClassification.cs	`89% <ø> (ø)`	⬆️
...L.Mkl.Components/ComputeLRTrainingStdThroughHal.cs	`92.85% <ø> (ø)`	⬆️
...rainers/Standard/MultiClass/OneVersusAllTrainer.cs	`74.63% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/FastTreeRegression.cs	`54.5% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/GamRegression.cs	`89.09% <ø> (ø)`	⬆️
...crosoft.ML.StandardTrainers/Standard/SdcaBinary.cs	`72.68% <ø> (ø)`	⬆️
src/Microsoft.ML.LightGBM/LightGbmArguments.cs	`89.63% <ø> (ø)`	⬆️
...Microsoft.ML.Mkl.Components/OlsLinearRegression.cs	`66.3% <0%> (ø)`	⬆️
....ML.Benchmarks/KMeansAndLogisticRegressionBench.cs	`0% <0%> (ø)`	⬆️
... and 77 more

sfilipi · 2019-03-11T06:19:04Z

...les/Microsoft.ML.Samples/Dynamic/Trainers/AnomalyDetection/RandomizedPcaSampleWithOptions.cs

@@ -28,15 +28,15 @@ public static void Example()
            // Convert the List<DataPoint> to IDataView, a consumble format to ML.NET functions.
            var data = mlContext.Data.LoadFromEnumerable(samples);

-            var options = new ML.Trainers.RandomizedPrincipalComponentAnalyzer.Options()
+            var options = new ML.Trainers.RandomizedPcaAnomalyDetectionTrainer.Options()


RandomizedPcaAnomalyDetectionTrainer [](start = 42, length = 36)

Can we leave "Trainer" out of the name? #Resolved

Used the acronym for Pca. Kept the Trainer suffix in the class name.

In reply to: 264101278 [](ancestors = 264101278)

sfilipi · 2019-03-11T06:19:27Z

.../Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/AveragedPerceptronWithOptions.cs

@@ -21,7 +21,7 @@ public static void Example()
            var trainTestData = mlContext.BinaryClassification.TrainTestSplit(data, testFraction: 0.1);

            // Define the trainer options.
-            var options = new AveragedPerceptronTrainer.Options()
+            var options = new AveragedPerceptronBinaryClassificationTrainer.Options()


AveragedPerceptronBinaryClassificationTrainer [](start = 30, length = 45)

I vote for AveragePerceptron, since in this case there isn't one for each task. #Resolved

sfilipi · 2019-03-11T06:19:43Z

...Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/StochasticDualCoordinateAscent.cs

@@ -61,7 +61,7 @@ public static void Example()
            // we could do so by tweaking the 'advancedSetting'.
            var advancedPipeline = mlContext.Transforms.Text.FeaturizeText("SentimentText", "Features")
                                  .Append(mlContext.BinaryClassification.Trainers.StochasticDualCoordinateAscent(
-                                      new SdcaBinaryTrainer.Options { 
+                                      new StochasticDualCoordinateAscentBinaryClassificationTrainer.Options { 


StochasticDualCoordinateAscentBinaryClassificationTrainer [](start = 42, length = 57)

StochasticDualCoordinateAscentBinaryClassification #Resolved

sfilipi · 2019-03-11T06:20:40Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Clustering/KMeansWithOptions.cs

-                .Append(ml.Clustering.Trainers.KMeans(
-                    new KMeansPlusPlusTrainer.Options
+                .Append(ml.Clustering.Trainers.KMeansPlusPlus(
+                    new KMeansPlusPlusClusteringTrainer.Options


KMeansPlusPlusClusteringTrainer [](start = 24, length = 31)

Maybe just KMeansPlusPlus? #Resolved

sfilipi · 2019-03-11T06:24:50Z

src/Microsoft.ML.FastTree/FastTreeRegression.cs

@@ -391,7 +391,7 @@ internal sealed class ObjectiveImpl : ObjectiveFunctionBase, IStepSearch
        {
            private readonly float[] _labels;

-            public ObjectiveImpl(Dataset trainData, RegressionGamTrainer.Options options) :
+            public ObjectiveImpl(Dataset trainData, GeneralizedAdditiveModelRegressionTrainer.Options options) :


GeneralizedAdditiveModelRegressionTrainer [](start = 52, length = 41)

GeneralizedAdditiveModelRegression #Resolved

wschin · 2019-03-12T00:48:36Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Clustering/KMeans.cs

@@ -27,7 +27,7 @@ public static void Example()
            // A pipeline for concatenating the age, parity and induced columns together in the Features column and training a KMeans model on them.
            string outputColumnName = "Features";
            var pipeline = ml.Transforms.Concatenate(outputColumnName, new[] { "Age", "Parity", "Induced" })
-                .Append(ml.Clustering.Trainers.KMeans(outputColumnName, numberOfClusters: 2));
+                .Append(ml.Clustering.Trainers.KMeansPlusPlus(outputColumnName, numberOfClusters: 2));


Suggested change

-                .Append(ml.Clustering.Trainers.KMeansPlusPlus(outputColumnName, numberOfClusters: 2));
+                .Append(ml.Clustering.Trainers.KMeans(outputColumnName, numberOfClusters: 2));

#Resolved

The algorithm that is actually implemented is KMeansPlusPlus. The underlying class is also called KMeansPlusPlus.

This is also discussed here: #2762 (comment) #Resolved

In reply to: 264484798 [](ancestors = 264484798)

artidoro · 2019-03-12T01:15:44Z

There is also MetaMulticlassTrainer that needs to be renamed. #Resolved

artidoro · 2019-03-12T01:36:11Z

This will solve part of #2623.

abgoswam · 2019-03-12T03:49:07Z

Thanks for pointing this out. This should be called MetaTrainer, as per the summary in #2762 (comment).

In reply to: 471806679 [](ancestors = 471806679)

sfilipi · 2019-03-12T04:16:32Z

src/Microsoft.ML.Ensemble/Trainer/Multiclass/MulticlassDataPartitionEnsembleTrainer.cs

@@ -64,8 +64,8 @@ public Arguments()
                            // non-default column names. Unfortuantely no method of resolving this temporary strikes me as being any
                            // less laborious than the proper fix, which is that this "meta" component should itself be a trainer
                            // estimator, as opposed to a regular trainer.
-                            var trainerEstimator = new MulticlassLogisticRegression(env, LabelColumnName, FeatureColumnName);
-                            return TrainerUtils.MapTrainerEstimatorToTrainer<MulticlassLogisticRegression,
+                            var trainerEstimator = new LogisticRegressionMulticlassClassificationTrainer(env, LabelColumnName, FeatureColumnName);


LogisticRegressionMulticlassClassificationTrainer [](start = 55, length = 49)

This will confuse people because it has both Regression and Multiclass in the name, but I can't think of a good way to deal with it. Would it be OK to just call it Logit? @wschin @TomFinley

I would especially hesitate to call it Logit. "Logits" has a different interpretation, related to unnormalized log-probabilities.

In reply to: 264516270 [](ancestors = 264516270)

sfilipi · 2019-03-12T04:34:04Z

test/BaselineOutput/Common/Command/CommandTrainMlrWithStats-summary.txt

@@ -1,4 +1,4 @@
-MulticlassLogisticRegression bias and non-zero weights
+LogisticRegressionMulticlassClassificationTrainer bias and non-zero weights


LogisticRegressionMulticlassClassificationTrainer [](start = 0, length = 49)

How can this be shorter? Is anyone in favor of dropping 'Classification'?

sfilipi · 2019-03-12T04:35:43Z

src/Microsoft.ML.StaticPipe/SdcaStaticExtensions.cs

@@ -154,7 +154,7 @@ public static class SdcaStaticExtensions
            var rec = new TrainerEstimatorReconciler.BinaryClassifier(
                (env, labelName, featuresName, weightsName) =>
                {
-                    var trainer = new SdcaBinaryTrainer(env, labelName, featuresName, weightsName, l2Regularization, l1Threshold, numberOfIterations);
+                    var trainer = new SdcaCalibratedBinaryClassificationTrainer(env, labelName, featuresName, weightsName, l2Regularization, l1Threshold, numberOfIterations);


SdcaCalibratedBinaryClassificationTrainer [](start = 38, length = 41)

I left a note below, but I'd be OK with dropping Classification from all BinaryClassification and MulticlassClassification names.

"Note below"? I'm not following. The changes here follow what we summarized in #2762 (comment).

In reply to: 264518672 [](ancestors = 264518672)

eerhardt · 2019-03-12T14:24:34Z

src/Microsoft.ML.FastTree/TreeTrainersCatalog.cs

@@ -136,7 +136,7 @@ public static class TreeExtensions
        }

        /// <summary>
-        /// Predict a target using generalized additive models trained with the <see cref="BinaryClassificationGamTrainer"/>.
+        /// Predict a target using generalized additive models trained with the <see cref="GamBinaryClassificationTrainer"/>.


I wonder if we should connect the acronyms in the doc:

Predict a target using generalized additive models (GAM) trained with the <see cref="GamBinaryClassificationTrainer"/>. #Resolved

Thanks for the suggestion. Will fix. #Resolved

eerhardt · 2019-03-12T14:28:02Z

src/Microsoft.ML.KMeansClustering/KMeansCatalog.cs

@@ -26,7 +26,7 @@ public static class KMeansClusteringExtensions
        ///  [!code-csharp[KMeans](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Clustering/KMeans.cs)]
        /// ]]></format>
        /// </example>
-        public static KMeansPlusPlusTrainer KMeans(this ClusteringCatalog.ClusteringTrainers catalog,
+        public static KMeansPlusPlusTrainer KMeansPlusPlus(this ClusteringCatalog.ClusteringTrainers catalog,


I think we should be consistent everywhere about this name. If we really intend for this to be KMeansPlusPlus, then we should update all the places that just use KMeans:

The name of the above class:
public static class KMeansClusteringExtensions

The name of the assembly:
Microsoft.ML.KMeansClustering

Is there any confusion about just using the name KMeans? Are there other KMeans algorithms besides KMeans++? It feels like we should be fine just using KMeans, but I'll leave it up to the experts. #Resolved

Kmeans++ is a slightly modified version of Kmeans that chooses the initial cluster centers more carefully. The trainer estimator we currently have implements Kmeans++.

For the specific trainer implementation we currently have, we are using KMeansPlusPlus as the MLContext name and KMeansPlusPlusTrainer as the class name.

I can envision that in the future someone might want to implement the vanilla Kmeans algorithm itself. As such, it makes sense to me to keep just KMeans as the prefix in the names of the static class and the assembly.

In reply to: 264707009 [](ancestors = 264707009)

My feeling is that no one will add Kmeans as long as Kmeans++ exists. As you mentioned, Kmeans++ is a member of the Kmeans family. Why can't we call it Kmeans? If someone wants to implement the original Kmeans, it can be called NaiveKmeans. #Resolved

Sure. That makes sense. Will rename it to KMeans to keep things uniform #Resolved

Thanks. I also noticed there is a property InitializationAlgorithm to specify the initialization mechanism

So yeah. It should be just KMeans

In reply to: 264770186 [](ancestors = 264770186)

fixed. We should indeed call it KMeans

In reply to: 264757754 [](ancestors = 264757754,264707009)
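For readers unfamiliar with the distinction discussed in this thread: what separates Kmeans++ from plain Kmeans is only the seeding step. The following is a minimal standalone sketch of that step, not the ML.NET implementation. The class name `KMeansPlusPlusSeeding` and the method `ChooseInitialCenters` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of k-means++ seeding; not taken from ML.NET.
public static class KMeansPlusPlusSeeding
{
    // Squared Euclidean distance between two points of equal dimension.
    private static double SquaredDistance(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Picks k initial centers. The first is drawn uniformly at random; each
    // later one is drawn with probability proportional to its squared
    // distance to the nearest center chosen so far.
    public static List<double[]> ChooseInitialCenters(
        IReadOnlyList<double[]> points, int k, Random rng)
    {
        if (k < 1 || k > points.Count)
            throw new ArgumentOutOfRangeException(nameof(k));

        var centers = new List<double[]> { points[rng.Next(points.Count)] };

        // nearest[i] holds the squared distance from points[i] to its closest center.
        var nearest = new double[points.Count];
        for (int i = 0; i < points.Count; i++)
            nearest[i] = SquaredDistance(points[i], centers[0]);

        while (centers.Count < k)
        {
            double total = 0;
            foreach (double d in nearest)
                total += d;

            int chosen;
            if (total == 0)
            {
                // Every point coincides with a center; fall back to a uniform draw.
                chosen = rng.Next(points.Count);
            }
            else
            {
                // Weighted draw: walk the cumulative sum until it passes the target.
                double target = rng.NextDouble() * total;
                double cumulative = 0;
                chosen = points.Count - 1; // guard against floating-point rounding
                for (int i = 0; i < points.Count; i++)
                {
                    cumulative += nearest[i];
                    if (cumulative > target)
                    {
                        chosen = i;
                        break;
                    }
                }
            }

            double[] newCenter = points[chosen];
            centers.Add(newCenter);
            for (int i = 0; i < points.Count; i++)
                nearest[i] = Math.Min(nearest[i], SquaredDistance(points[i], newCenter));
        }

        return centers;
    }
}
```

After seeding, the usual Kmeans iterations (assign points to the nearest center, recompute centers) run unchanged, which is why the thread treats Kmeans++ as a member of the Kmeans family rather than a separate algorithm.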

eerhardt · 2019-03-12T14:31:33Z

src/Microsoft.ML.StandardTrainers/Standard/MultiClass/MetaMulticlassTrainer.cs

@@ -15,7 +15,7 @@ namespace Microsoft.ML.Trainers
 {
    using TScalarTrainer = ITrainerEstimator<ISingleFeaturePredictionTransformer<IPredictorProducing<float>>, IPredictorProducing<float>>;

-    public abstract class MetaMulticlassTrainer<TTransformer, TModel> : ITrainerEstimator<TTransformer, TModel>, ITrainer<IPredictor>
+    public abstract class MetaTrainer<TTransformer, TModel> : ITrainerEstimator<TTransformer, TModel>, ITrainer<IPredictor>


Here's a special case of our rule:

{TypeOfTask} is added only when the algorithm supports multiple kinds of tasks

I don't believe MetaTrainer is a good name here since it is too general. And technically, Meta isn't an algorithm. That name makes it sound like it can train anything. I think we should keep "multiclass" in the name. So probably MetaMulticlassClassificationTrainer. #Closed

Yeah. It makes sense to treat this as a special case. #Resolved
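The rule quoted in this thread can be sketched as a small helper. This is a hypothetical illustration only: `TrainerNaming.ClassName` does not exist in ML.NET, and the `{Algorithm}{TypeOfTask}Trainer` pattern is an assumption inferred from the names that appear in this review.

```csharp
using System;

// Hypothetical helper illustrating the naming rule; not part of ML.NET.
public static class TrainerNaming
{
    // Composes {Algorithm}[{TypeOfTask}]Trainer. The task type is included
    // only when the algorithm supports more than one kind of task.
    public static string ClassName(string algorithm, string typeOfTask, bool supportsMultipleTasks)
    {
        string task = supportsMultipleTasks ? typeOfTask : string.Empty;
        return algorithm + task + "Trainer";
    }
}
```

Under these assumptions, `ClassName("Gam", "BinaryClassification", true)` gives `GamBinaryClassificationTrainer`, while `ClassName("AveragedPerceptron", "BinaryClassification", false)` gives `AveragedPerceptronTrainer`. The special case agreed on above is not modeled: the meta trainer keeps "Multiclass" in its name even though "Meta" is not an algorithm with per-task variants.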

eerhardt · 2019-03-12T14:33:33Z

src/Microsoft.ML.StandardTrainers/StandardTrainersCatalog.cs


    /// <summary>
    /// TrainerEstimator extension methods.
    /// </summary>
    public static class StandardTrainersCatalog
    {
        /// <summary>
-        /// Predict a target using a linear classification model trained with <see cref="SgdBinaryTrainer"/>.
+        /// Predict a target using a linear classification model trained with <see cref="SgdCalibratedTrainer"/>.


Here's a place where the acronym Sgd isn't expanded out in the summary comments. #Resolved

eerhardt · 2019-03-12T20:52:22Z

test/Microsoft.ML.Tests/Scenarios/Api/CookbookSamples/CookbookSamplesDynamicApi.cs

@@ -105,7 +105,7 @@ private void TrainRegression(string trainDataPath, string testDataPath, string m
                // once so adding a caching step before it is not helpful.
                .AppendCacheCheckpoint(mlContext)
                // Add the SDCA regression trainer.
-                .Append(mlContext.Regression.Trainers.StochasticDualCoordinateAscent(labelColumnName: "Target", featureColumnName: "FeatureVector"));
+                .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Target", featureColumnName: "FeatureVector"));


When we update the cookbook sample files, it usually means the cookbook needs to be updated as well. #Resolved

thanks for pointing this out!

In reply to: 264879257 [](ancestors = 264879257)

abgoswam · 2019-03-12T22:19:57Z

Thanks folks for the review comments!

abgoswam added 2 commits March 11, 2019 04:10

renaming several trainers

4bfaab7

updating some of the trainers with acronyms

a843f9b

abgoswam requested review from shmoradims, Ivanidzo4ka, TomFinley and eerhardt March 11, 2019 05:16

sfilipi reviewed Mar 11, 2019

abgoswam added 2 commits March 11, 2019 21:44

updated names based on the latest pattern

6def207

fix merge conflicts

cfae81d

wschin reviewed Mar 12, 2019

artidoro mentioned this pull request Mar 12, 2019

One name for MulticlassClassification #2919

Merged

fix name of MetaTrainer

b749064

sfilipi reviewed Mar 12, 2019

update to latest master

31e7e00

sfilipi approved these changes Mar 12, 2019

eerhardt reviewed Mar 12, 2019

fix review comments

cbd9eaa

abgoswam added 2 commits March 12, 2019 17:55

connect acronym for SGD

3d577e9

fix merge conflict

81c684a

abgoswam requested a review from eerhardt March 12, 2019 20:08

eerhardt reviewed Mar 12, 2019

eerhardt approved these changes Mar 12, 2019

updates to cookbook markdown file

0ebd4a6

abgoswam merged commit 7f0c1ad into dotnet:master Mar 12, 2019

abgoswam mentioned this pull request Mar 12, 2019

The trainer name types should follow the names used in the contexts #2172

Closed

abgoswam deleted the abgoswam/trainerestimator_names branch March 20, 2019 20:13

ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
