Skip to content

Fixing names of trainer estimators #2903

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Mar 12, 2019

Conversation

abgoswam
Copy link
Member

Fixes #2762 and #2172

@codecov
Copy link

codecov bot commented Mar 11, 2019

Codecov Report

Merging #2903 into master will decrease coverage by <.01%.
The diff coverage is 86.71%.

@@            Coverage Diff             @@
##           master    #2903      +/-   ##
==========================================
- Coverage   71.82%   71.82%   -0.01%     
==========================================
  Files         812      812              
  Lines      142719   142719              
  Branches    16092    16092              
==========================================
- Hits       102513   102510       -3     
- Misses      35827    35830       +3     
  Partials     4379     4379
Flag Coverage Δ
#Debug 71.82% <86.71%> (-0.01%) ⬇️
#production 67.97% <77.77%> (ø) ⬆️
#test 86.21% <92.63%> (-0.01%) ⬇️
Impacted Files Coverage Δ
...ers/Standard/MultiClass/PairwiseCouplingTrainer.cs 90.07% <ø> (ø) ⬆️
src/Microsoft.ML.FastTree/GamClassification.cs 89% <ø> (ø) ⬆️
...L.Mkl.Components/ComputeLRTrainingStdThroughHal.cs 92.85% <ø> (ø) ⬆️
...rainers/Standard/MultiClass/OneVersusAllTrainer.cs 74.63% <ø> (ø) ⬆️
src/Microsoft.ML.FastTree/FastTreeRegression.cs 54.5% <ø> (ø) ⬆️
src/Microsoft.ML.FastTree/GamRegression.cs 89.09% <ø> (ø) ⬆️
...crosoft.ML.StandardTrainers/Standard/SdcaBinary.cs 72.68% <ø> (ø) ⬆️
src/Microsoft.ML.LightGBM/LightGbmArguments.cs 89.63% <ø> (ø) ⬆️
...Microsoft.ML.Mkl.Components/OlsLinearRegression.cs 66.3% <0%> (ø) ⬆️
....ML.Benchmarks/KMeansAndLogisticRegressionBench.cs 0% <0%> (ø) ⬆️
... and 77 more

@@ -28,15 +28,15 @@ public static void Example()
// Convert the List<DataPoint> to IDataView, a consumble format to ML.NET functions.
var data = mlContext.Data.LoadFromEnumerable(samples);

var options = new ML.Trainers.RandomizedPrincipalComponentAnalyzer.Options()
var options = new ML.Trainers.RandomizedPcaAnomalyDetectionTrainer.Options()
Copy link
Member

@sfilipi sfilipi Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RandomizedPcaAnomalyDetectionTrainer [](start = 42, length = 36)

can we leave "trainer" out of the name? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used acronym for Pca. Kept suffix Trainer in the name of the class


In reply to: 264101278 [](ancestors = 264101278)

@@ -21,7 +21,7 @@ public static void Example()
var trainTestData = mlContext.BinaryClassification.TrainTestSplit(data, testFraction: 0.1);

// Define the trainer options.
var options = new AveragedPerceptronTrainer.Options()
var options = new AveragedPerceptronBinaryClassificationTrainer.Options()
Copy link
Member

@sfilipi sfilipi Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AveragedPerceptronBinaryClassificationTrainer [](start = 30, length = 45)

I vote for AveragePerceptron, since in this case, there aren't one for each task. #Resolved

@@ -61,7 +61,7 @@ public static void Example()
// we could do so by tweaking the 'advancedSetting'.
var advancedPipeline = mlContext.Transforms.Text.FeaturizeText("SentimentText", "Features")
.Append(mlContext.BinaryClassification.Trainers.StochasticDualCoordinateAscent(
new SdcaBinaryTrainer.Options {
new StochasticDualCoordinateAscentBinaryClassificationTrainer.Options {
Copy link
Member

@sfilipi sfilipi Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StochasticDualCoordinateAscentBinaryClassificationTrainer [](start = 42, length = 57)

StochasticDualCoordinateAscentBinaryClassification #Resolved

.Append(ml.Clustering.Trainers.KMeans(
new KMeansPlusPlusTrainer.Options
.Append(ml.Clustering.Trainers.KMeansPlusPlus(
new KMeansPlusPlusClusteringTrainer.Options
Copy link
Member

@sfilipi sfilipi Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KMeansPlusPlusClusteringTrainer [](start = 24, length = 31)

just KMeansPlusPlus maybe #Resolved

@@ -391,7 +391,7 @@ internal sealed class ObjectiveImpl : ObjectiveFunctionBase, IStepSearch
{
private readonly float[] _labels;

public ObjectiveImpl(Dataset trainData, RegressionGamTrainer.Options options) :
public ObjectiveImpl(Dataset trainData, GeneralizedAdditiveModelRegressionTrainer.Options options) :
Copy link
Member

@sfilipi sfilipi Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GeneralizedAdditiveModelRegressionTrainer [](start = 52, length = 41)

GeneralizedAdditiveModelRegression #Resolved

@@ -27,7 +27,7 @@ public static void Example()
// A pipeline for concatenating the age, parity and induced columns together in the Features column and training a KMeans model on them.
string outputColumnName = "Features";
var pipeline = ml.Transforms.Concatenate(outputColumnName, new[] { "Age", "Parity", "Induced" })
.Append(ml.Clustering.Trainers.KMeans(outputColumnName, numberOfClusters: 2));
.Append(ml.Clustering.Trainers.KMeansPlusPlus(outputColumnName, numberOfClusters: 2));
Copy link
Member

@wschin wschin Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.Append(ml.Clustering.Trainers.KMeansPlusPlus(outputColumnName, numberOfClusters: 2));
.Append(ml.Clustering.Trainers.KMeans(outputColumnName, numberOfClusters: 2));
``` #Resolved

Copy link
Member Author

@abgoswam abgoswam Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm that is actually implemented is KmeansPlusPlus . The underlying class is also called KMeansPlusPlus

This is also discussed here : #2762 (comment) #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm that is actually implemented is KmeansPlusPlus . The underlying class is also called KMeansPlusPlus

This is also discussed here : #2762 (comment)


In reply to: 264484798 [](ancestors = 264484798)

@artidoro
Copy link
Contributor

artidoro commented Mar 12, 2019

There is also MetaMulticlassTrainer that needs to be renamed. #Resolved

@artidoro
Copy link
Contributor

This is will solve part of #2623.

@abgoswam
Copy link
Member Author

abgoswam commented Mar 12, 2019

Tanks for pointing this out. This should be called MetaTrainer as per the summary #2762 (comment)


In reply to: 471806679 [](ancestors = 471806679)

@@ -64,8 +64,8 @@ public Arguments()
// non-default column names. Unfortuantely no method of resolving this temporary strikes me as being any
// less laborious than the proper fix, which is that this "meta" component should itself be a trainer
// estimator, as opposed to a regular trainer.
var trainerEstimator = new MulticlassLogisticRegression(env, LabelColumnName, FeatureColumnName);
return TrainerUtils.MapTrainerEstimatorToTrainer<MulticlassLogisticRegression,
var trainerEstimator = new LogisticRegressionMulticlassClassificationTrainer(env, LabelColumnName, FeatureColumnName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LogisticRegressionMulticlassClassificationTrainer [](start = 55, length = 49)

this will get ppl confused because it has both Regression and Multiclass on the name, but can't think of a good way to deal with it. would it be ok to just call it Logit @wschin @TomFinley

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would especially hesitate to call it Logit . Logits has a different interpretation related to the unnormalized log-probabilities


In reply to: 264516270 [](ancestors = 264516270)

@@ -1,4 +1,4 @@
MulticlassLogisticRegression bias and non-zero weights
LogisticRegressionMulticlassClassificationTrainer bias and non-zero weights
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LogisticRegressionMulticlassClassificationTrainer [](start = 0, length = 49)

how can this be shorter... anyone in favor of dropping 'Classification'

@@ -154,7 +154,7 @@ public static class SdcaStaticExtensions
var rec = new TrainerEstimatorReconciler.BinaryClassifier(
(env, labelName, featuresName, weightsName) =>
{
var trainer = new SdcaBinaryTrainer(env, labelName, featuresName, weightsName, l2Regularization, l1Threshold, numberOfIterations);
var trainer = new SdcaCalibratedBinaryClassificationTrainer(env, labelName, featuresName, weightsName, l2Regularization, l1Threshold, numberOfIterations);
Copy link
Member

@sfilipi sfilipi Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SdcaCalibratedBinaryClassificationTrainer [](start = 38, length = 41)

Did leave a note below, but i'd be ok dropping Classification from all BinaryClassification and MulticlassClassification.

Copy link
Member Author

@abgoswam abgoswam Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note below .. am not getting you ? The changes here are as per what we summarized in #2762 (comment)


In reply to: 264518672 [](ancestors = 264518672)

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@@ -136,7 +136,7 @@ public static class TreeExtensions
}

/// <summary>
/// Predict a target using generalized additive models trained with the <see cref="BinaryClassificationGamTrainer"/>.
/// Predict a target using generalized additive models trained with the <see cref="GamBinaryClassificationTrainer"/>.
Copy link
Member

@eerhardt eerhardt Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should connect the acronyms in the doc:

Predict a target using generalized additive models (GAM) trained with the <see cref="GamBinaryClassificationTrainer"/>. #Resolved

Copy link
Member Author

@abgoswam abgoswam Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. Will fix. #Resolved

@@ -26,7 +26,7 @@ public static class KMeansClusteringExtensions
/// [!code-csharp[KMeans](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Clustering/KMeans.cs)]
/// ]]></format>
/// </example>
public static KMeansPlusPlusTrainer KMeans(this ClusteringCatalog.ClusteringTrainers catalog,
public static KMeansPlusPlusTrainer KMeansPlusPlus(this ClusteringCatalog.ClusteringTrainers catalog,
Copy link
Member

@eerhardt eerhardt Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be consistent everywhere about this name. If we really intend for this to be KMeansPlusPlus, then we should update all the places that just use KMeans:

The name of the above class:
public static class KMeansClusteringExtensions

The name of the assembly:
Microsoft.ML.KMeansClustering

Is there any confusion about just using the name KMeans? Are there other KMeans algorithms besides KMeans++? It feels like we should be fine just using KMeans, but I'll leave it up to the experts. #Resolved

Copy link
Member Author

@abgoswam abgoswam Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kmeans++ is a slightly modified version of Kmeans that has some smarts for choosing the initial cluster centers. The trainer estimator we have currently implements Kmeans++

For the specific trainer implementation we have currently, we are using KMeansPlusPlus for the MLContext name and KMeansPlusPlusTrainer the Class name

I can envision that in the future one might even want to implement the vanilla Kmeans algorithm itself. As such to me it makes sense to keep the name of the static class and assembly with just KMeans in the prefix.


In reply to: 264707009 [](ancestors = 264707009)

Copy link
Member

@wschin wschin Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My feeling is that no one will add Kmeans as long as Kmeans++ exists. As you mentioned, Kmeans++ is a member of Kmeans family. Why can't we call it Kmeans? If someone want to implement the original Kmeans, it can be called NaiveKmeans. #Resolved

Copy link
Member Author

@abgoswam abgoswam Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. That makes sense. Will rename it to KMeans to keep things uniform #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I also noticed there is a property InitializationAlgorithm to specify the initialization mechanism

So yeah. It should be just KMeans


In reply to: 264770186 [](ancestors = 264770186)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. We should indeed call it KMeans


In reply to: 264757754 [](ancestors = 264757754,264707009)

@@ -15,7 +15,7 @@ namespace Microsoft.ML.Trainers
{
using TScalarTrainer = ITrainerEstimator<ISingleFeaturePredictionTransformer<IPredictorProducing<float>>, IPredictorProducing<float>>;

public abstract class MetaMulticlassTrainer<TTransformer, TModel> : ITrainerEstimator<TTransformer, TModel>, ITrainer<IPredictor>
public abstract class MetaTrainer<TTransformer, TModel> : ITrainerEstimator<TTransformer, TModel>, ITrainer<IPredictor>
Copy link
Member

@eerhardt eerhardt Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a special case of our rule:

{TypeOfTask} is added only only when the algorithm supports multiple kinds of tasks

I don't believe MetaTrainer is a good name here since it too general. And technically, Meta isn't an algorithm. That name makes it sound like it can train anything. I think we should keep "multiclass" in the name. So probably MetaMulticlassClassificationTrainer. #Closed

Copy link
Member Author

@abgoswam abgoswam Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. It makes sense to treat this as a special case. #Resolved


/// <summary>
/// TrainerEstimator extension methods.
/// </summary>
public static class StandardTrainersCatalog
{
/// <summary>
/// Predict a target using a linear classification model trained with <see cref="SgdBinaryTrainer"/>.
/// Predict a target using a linear classification model trained with <see cref="SgdCalibratedTrainer"/>.
Copy link
Member

@eerhardt eerhardt Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a place where the acronym Sgd isn't expanded out in the summary comments. #Resolved

@abgoswam abgoswam requested a review from eerhardt March 12, 2019 20:08
@@ -105,7 +105,7 @@ private void TrainRegression(string trainDataPath, string testDataPath, string m
// once so adding a caching step before it is not helpful.
.AppendCacheCheckpoint(mlContext)
// Add the SDCA regression trainer.
.Append(mlContext.Regression.Trainers.StochasticDualCoordinateAscent(labelColumnName: "Target", featureColumnName: "FeatureVector"));
.Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Target", featureColumnName: "FeatureVector"));
Copy link
Member

@eerhardt eerhardt Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when we update the cookbook samples files, it usually means the cookbook needs to be updated. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for pointing this out!


In reply to: 264879257 [](ancestors = 264879257)

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@abgoswam abgoswam merged commit 7f0c1ad into dotnet:master Mar 12, 2019
@abgoswam
Copy link
Member Author

Thanks folks for the review comments!

@abgoswam abgoswam deleted the abgoswam/trainerestimator_names branch March 20, 2019 20:13
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants