Added OneVersusAll and PairwiseCoupling samples. #3159

ganik · 2019-04-01T21:42:14Z

Part of #2522.
Adds a sample for OneVersusAll classification.
Adds a sample for PairwiseCoupling classification.

codecov · 2019-04-01T22:48:22Z

Codecov Report

Merging #3159 into master will increase coverage by 0.1%.
The diff coverage is n/a.

@@            Coverage Diff            @@
##           master    #3159     +/-   ##
=========================================
+ Coverage   72.53%   72.64%   +0.1%     
=========================================
  Files         808      807      -1     
  Lines      144740   145080    +340     
  Branches    16202    16213     +11     
=========================================
+ Hits       104986   105391    +405     
+ Misses      35343    35271     -72     
- Partials     4411     4418      +7

| Flag | Coverage Δ | |
| --- | --- | --- |
| #Debug | `72.64% <ø> (+0.1%)` | ⬆️ |
| #production | `68.19% <ø> (+0.07%)` | ⬆️ |
| #test | `88.92% <ø> (+0.1%)` | ⬆️ |

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| ...oft.ML.StandardTrainers/StandardTrainersCatalog.cs | `92.34% <ø> (+3.27%)` | ⬆️ |
| src/Microsoft.ML.DataView/KeyDataViewType.cs | `74.57% <0%> (-3.76%)` | ⬇️ |
| ...rosoft.ML.Data/Scorers/PredictedLabelScorerBase.cs | `81.71% <0%> (-0.62%)` | ⬇️ |
| src/Microsoft.ML.Data/Transforms/ValueMapping.cs | `84.26% <0%> (-0.14%)` | ⬇️ |
| test/Microsoft.ML.Tests/ImagesTests.cs | `98.69% <0%> (-0.13%)` | ⬇️ |
| src/Microsoft.ML.Transforms/CategoricalCatalog.cs | `68.42% <0%> (ø)` | ⬆️ |
| ...osoft.ML.Recommender/SafeTrainingAndModelBuffer.cs | `78.87% <0%> (ø)` | ⬆️ |
| ...ML.Tests/TrainerEstimators/MetalinearEstimators.cs | `100% <0%> (ø)` | ⬆️ |
| src/Microsoft.ML.Data/Transforms/Normalizer.cs | `86.03% <0%> (ø)` | ⬆️ |
| ...Microsoft.ML.Transforms/FeatureSelectionCatalog.cs | `60% <0%> (ø)` | ⬆️ |
... and 28 more

Ivanidzo4ka · 2019-04-01T23:51:03Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/OneVersusAll.cs

+{
+    public static class OneVersusAll
+    {
+        public static void Example()


Example

You probably want to link this file to the extension method.

```
/// <format type="text/markdown">
/// <![CDATA[
/// [!code-csharp[SDCA](~/../docs/samples/docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/StochasticDualCoordinateAscentWithOptions.cs)]
/// ]]></format>
/// </example>
```
#Resolved

sfilipi · 2019-04-02T05:16:53Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/OneVersusAll.cs

+                    // Convert the string labels into key types.
+                    mlContext.Transforms.Conversion.MapValueToKey("Label")
+                    // Apply OneVersusAll multiclass trainer on top of SDCA Logistic Regression binary trainer.
+                    .Append(mlContext.MulticlassClassification.Trainers.OneVersusAll(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression()));


Usually multiclass pipelines add a MapKeyToValue at the end. #ByDesign

Maybe, but I don't need it here.

In reply to: 271136620
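For readers unfamiliar with the MapKeyToValue suggestion, here is a minimal sketch of what appending it would look like. This is not the sample as merged. It assumes the Microsoft.ML NuGet package (1.x API); the `DataPoint` and `Prediction` classes, the class name, and the toy data are hypothetical and exist only for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class OneVersusAllWithKeyToValue
{
    // Hypothetical input row: a string label and a fixed-size feature vector.
    private class DataPoint
    {
        public string Label { get; set; }
        [VectorType(2)]
        public float[] Features { get; set; }
    }

    // Hypothetical output row: only the decoded predicted label is read back.
    private class Prediction
    {
        public string PredictedLabel { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Toy data (illustrative only): three classes separated on the first feature.
        var random = new Random(0);
        var names = new[] { "A", "B", "C" };
        var examples = Enumerable.Range(0, 300).Select(i => new DataPoint
        {
            Label = names[i % 3],
            Features = new[]
            {
                (i % 3) + (float)random.NextDouble() * 0.2f,
                (float)random.NextDouble()
            }
        }).ToList();
        var data = mlContext.Data.LoadFromEnumerable(examples);

        var pipeline =
            // Convert the string labels into key types.
            mlContext.Transforms.Conversion.MapValueToKey("Label")
            // Wrap a binary trainer in the OneVersusAll multiclass trainer.
            .Append(mlContext.MulticlassClassification.Trainers.OneVersusAll(
                mlContext.BinaryClassification.Trainers.SdcaLogisticRegression()))
            // Convert the predicted key back into the original label value.
            .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

        var model = pipeline.Fit(data);
        var predictions = mlContext.Data
            .CreateEnumerable<Prediction>(model.Transform(data), reuseRowObject: false);

        foreach (var p in predictions.Take(5))
            Console.WriteLine($"Prediction: {p.PredictedLabel}");
    }
}
```

With numeric labels, as in the sample under review, the key values can be read back directly as `uint`, which is presumably why the final conversion was judged unnecessary here.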

sfilipi · 2019-04-02T05:17:11Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/OneVersusAll.cs

+            // Train the model.
+            var model = pipeline.Fit(split.TrainSet);
+
+            // Do prediction on the test set.


Do

Replace with "Generate". #Resolved

sfilipi · 2019-04-02T05:17:38Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/OneVersusAll.cs

+            //   Micro Accuracy: 0.77
+            //   Macro Accuracy: 0.75
+            //   Log Loss: 0.69
+            //   Log Loss Reduction: 0.49


// Log Loss Reduction: 0.49

Add code that reads the PredictedLabel column. #Resolved

sfilipi · 2019-04-02T05:18:18Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/PairwiseCoupling.cs

+            //   Micro Accuracy: 0.75
+            //   Macro Accuracy: 0.73
+            //   Log Loss: 0.70
+            //   Log Loss Reduction: 0.49


Same here: add code that reads the PredictedLabel column. #Resolved

Done.

In reply to: 271136850

shmoradims · 2019-04-02T19:29:54Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/OneVersusAll.cs

+            var mlContext = new MLContext(seed: 0);
+
+            // Create a list of data examples.
+            var examples = DatasetUtils.GenerateRandomMulticlassClassificationExamples(1000);


GenerateRandomMulticlassClassificationExamples

Is it possible to use something like the GenerateRandomDataPoints method that's used in binary classification?

machinelearning/docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/FastTree.cs

Line 68 in 0ce5618

private static IEnumerable<DataPoint> GenerateRandomDataPoints(int count, int seed=0)

#Resolved

This method generates data points for multiclass classification, with 4 labels. Apart from that, it works much like GenerateRandomDataPoints. Not sure what other similarity you would want?

In reply to: 271461899

This is actually done, thanks.

In reply to: 271497239

shmoradims · 2019-04-02T19:35:08Z

using Microsoft.ML.Data;

Consider using T4 templates if you see a lot of duplicate code across the multiclass samples. Below is the T4 change we used for regression; all *.cs files are autogenerated from the .tt template file.
#3099 #Resolved

Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/OneVersusAll.cs:1 in 01dccde.

ganik · 2019-04-02T22:23:48Z

using Microsoft.ML.Data;

Good suggestions, done.

In reply to: 479163541

Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/OneVersusAll.cs:1 in 01dccde.

Add NaiveBayes

sfilipi · 2019-04-03T19:56:25Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/NaiveBayes.cs

+        private class DataPoint
+        {
+            public uint Label { get; set; }
+            [VectorType(20)]


[VectorType(20)]

Is the annotation necessary? #ByDesign

Yes, it is needed for the schema check.

In reply to: 271907075
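To illustrate the schema point, the sketch below compares the column type ML.NET infers with and without the attribute. It is a sketch only, assuming the Microsoft.ML NuGet package (1.x API); the `Annotated` and `Unannotated` classes and the class name are hypothetical. The annotated column should report a known size of 20, while the unannotated one should come back as a variable-size vector, which trainers expecting a fixed-size feature vector reject at schema-check time.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class VectorTypeSchemaCheck
{
    // Hypothetical row type with the fixed-size annotation.
    private class Annotated
    {
        [VectorType(20)]
        public float[] Features { get; set; }
    }

    // Hypothetical row type without the annotation.
    private class Unannotated
    {
        public float[] Features { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        var annotated = mlContext.Data.LoadFromEnumerable(
            new[] { new Annotated { Features = new float[20] } });
        var unannotated = mlContext.Data.LoadFromEnumerable(
            new[] { new Unannotated { Features = new float[20] } });

        // Both columns are vectors; they differ in whether the size is part of the schema.
        var a = (VectorDataViewType)annotated.Schema["Features"].Type;
        var u = (VectorDataViewType)unannotated.Schema["Features"].Type;

        Console.WriteLine($"Annotated:   known size = {a.IsKnownSize}, size = {a.Size}");
        Console.WriteLine($"Unannotated: known size = {u.IsKnownSize}, size = {u.Size}");
    }
}
```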


shmoradims · 2019-04-03T21:20:12Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/NaiveBayes.cs

+            var metrics = mlContext.MulticlassClassification.Evaluate(transformedTestData);
+            SamplesUtils.ConsoleUtils.PrintMetrics(metrics);
+
+            // Expected output:


// Expected output:

How come this line is repeated? #Resolved

shmoradims · 2019-04-03T21:21:15Z

...soft.ML.Samples/Dynamic/Trainers/MulticlassClassification/MulticlassClassification.ttinclude

+            // Look at 5 predictions
+            foreach (var p in predictions.Take(5))
+                Console.WriteLine($"Label: {p.Label}, Prediction: {p.PredictedLabel}");
+


Please add ExpectedOutputPerInstance after this. #Resolved

shmoradims · 2019-04-03T21:22:30Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/NaiveBayes.tt

+string Comments= "";
+
+string ExpectedOutputPerInstance= @"// Expected output:
+            //   Label: 1, Prediction: 2


Label: 1, Prediction: 2

How come the generated labels are 0, 1, 2, but here I see 1, 2, 3? How did they get changed? #Resolved

No, the generated labels are 1, 2, 3.

In reply to: 271938136

shmoradims · 2019-04-03T21:23:47Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/NaiveBayes.tt

+
+string ExpectedOutput = @"// Expected output:
+            // Expected output:
+            // Micro Accuracy: 0.35


0.35

Can we get something above 60%? This is much worse than the other two. #ByDesign

I tried; we can't. This is a linear model :)

In reply to: 271938623

shmoradims · 2019-04-03T21:28:52Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/MulticlassClassification/NaiveBayes.cs

+            // Micro Accuracy: 0.35
+            // Macro Accuracy: 0.33
+            // Log Loss: 34.54
+            // Log Loss Reduction: -30.47


We usually indent the lines below "Expected output" with an extra space. #Resolved
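As context for the four metrics quoted in these expected outputs, here is a small dependency-free sketch of how they relate. The helper names and toy arrays are hypothetical. The log-loss reduction is computed with the relative form (prior log-loss minus model log-loss, divided by prior log-loss), and the baseline assumes three equally frequent classes, which the thread does not state. Under that assumption a log-loss of 34.54 gives a reduction of roughly -30.4, close to the -30.47 reported above.

```csharp
using System;
using System.Linq;

public static class MulticlassMetricsSketch
{
    // Micro accuracy: fraction of all instances classified correctly.
    public static double MicroAccuracy(int[] labels, int[] predicted) =>
        labels.Zip(predicted, (l, p) => l == p ? 1.0 : 0.0).Average();

    // Macro accuracy: accuracy computed per class, then averaged over classes.
    public static double MacroAccuracy(int[] labels, int[] predicted) =>
        labels.Zip(predicted, (l, p) => (Label: l, Hit: l == p ? 1.0 : 0.0))
              .GroupBy(x => x.Label)
              .Average(g => g.Average(x => x.Hit));

    // Relative log-loss reduction against a baseline that always predicts the prior.
    public static double LogLossReduction(double logLoss, double priorLogLoss) =>
        (priorLogLoss - logLoss) / priorLogLoss;

    public static void Main()
    {
        // Toy case: a model that always predicts class 1 on imbalanced data.
        var labels = new[] { 1, 1, 1, 1, 2, 3 };
        var predicted = new[] { 1, 1, 1, 1, 1, 1 };
        Console.WriteLine($"Micro accuracy: {MicroAccuracy(labels, predicted):F2}");
        Console.WriteLine($"Macro accuracy: {MacroAccuracy(labels, predicted):F2}");

        // Assumed baseline: uniform prior over three classes, log-loss = ln(3).
        double uniformPrior = Math.Log(3);
        Console.WriteLine($"Log-loss reduction: {LogLossReduction(34.54, uniformPrior):F2}");
    }
}
```

A negative reduction means the model's predicted probabilities are worse than always predicting the class prior, which fits the follow-up issue (#3226) opened for NaiveBayes.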

shmoradims · 2019-04-05T21:50:11Z

...soft.ML.Samples/Dynamic/Trainers/MulticlassClassification/MulticlassClassification.ttinclude

+<#=OptionsInclude#>
+<# } #>
+
+namespace Microsoft.ML.Samples.Dynamic.Trainers.MulticlassClassification


Microsoft.ML

Please drop the Microsoft.ML prefix as per #3205. #Resolved

Done.

In reply to: 272754353


OVA sample

5a17e9b

shmoradims mentioned this pull request Apr 1, 2019

Docs and samples for the API reference site (P0 & P1 Trainers) #2522

Closed

Add PairwiseCoupling sample

01dccde

ganik changed the title ~~OneVersusAll sample~~ Add OneVersusAll and PairwiseCoupling samples Apr 1, 2019

ganik changed the title ~~Add OneVersusAll and PairwiseCoupling samples~~ Added OneVersusAll and PairwiseCoupling samples Apr 1, 2019

ganik changed the title ~~Added OneVersusAll and PairwiseCoupling samples~~ Added OneVersusAll and PairwiseCoupling samples. Apr 1, 2019

ganik requested review from shmoradims, sfilipi and Ivanidzo4ka April 1, 2019 23:17

Ivanidzo4ka reviewed Apr 1, 2019


sfilipi reviewed Apr 2, 2019


shmoradims reviewed Apr 2, 2019


Use tt templates

391969f

Add example link to extension methods.

f08e9bb

Add NaiveBayes

sfilipi reviewed Apr 3, 2019


sfilipi approved these changes Apr 3, 2019


shmoradims reviewed Apr 3, 2019


ganik added 3 commits April 5, 2019 10:28

fix comments

209877d

fix comments

9f2bf05

fix comments

f36302a

ganik added 2 commits April 5, 2019 10:48

Add [BestFriend] for GraphRunner

5add22d

rollback BestFriend

d8d7e98

shmoradims reviewed Apr 5, 2019


fix comments

4cf6199

ganik mentioned this pull request Apr 5, 2019

NaiveBayes doesnt produce meaningful result on simple dataset #3226

Closed

remove NB for now

589b0a8

shmoradims approved these changes Apr 5, 2019


ganik merged commit f19b560 into dotnet:master Apr 5, 2019

ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
