Tree estimators #855

sfilipi · 2018-09-07T21:25:17Z

Ongoing work on converting the trainers to estimators. This PR converts the tree-type predictors.

sfilipi · 2018-09-07T21:25:54Z

I will add tests next. We don't seem to have many ranking tests enabled :( #Resolved

sfilipi · 2018-09-07T21:38:24Z

src/Microsoft.ML.FastTree/FastTreeRanking.cs

@@ -405,6 +414,9 @@ protected override string GetTestGraphHeader()
            return headerBuilder.ToString();
        }

+        protected override RankingPredictionTransformer<FastTreeRankingPredictor> MakeTransformer(FastTreeRankingPredictor model, ISchema trainSchema)
+        => new RankingPredictionTransformer<FastTreeRankingPredictor>(Host, model, trainSchema, FeatureColumn.Name);


FeatureColumn.Name); [](start = 96, length = 20)

should add the GroupID to the base constructor #Resolved

GroupID? why?

In reply to: 216093781 [](ancestors = 216093781)

sfilipi · 2018-09-08T17:18:55Z

src/Microsoft.ML.FastTree/FastTree.cs

@@ -133,6 +136,18 @@ protected virtual Float GetMaxLabel()
            return Float.PositiveInfinity;
        }

+        private static SchemaShape.Column MakeWeightColumn(Optional<string> weightColumn)
+        {
+            if (weightColumn == null || !weightColumn.IsExplicit)


|| !weightColumn.IsExplicit [](start = 37, length = 27)

This is not entirely correct either. It won't create the column when the user leaves the weight column unspecified because the data already has a column named weight.
We can't peek at the data at this time.

@[email protected] @Zruty0 can we move from the Optional to just string for the weight, name, group ID and enforce the user typing in the names? Is there another way around it, now that we need to know the information before seeing the data? #Resolved

we cannot do this really, can we?

In reply to: 216135820 [](ancestors = 216135820)
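
For readers following the thread, here is a minimal sketch of the explicit-versus-implicit check being discussed. `OptionalName` and `WeightColumnSketch` are hypothetical stand-ins written for this illustration; they are not the ML.NET `Optional<string>` type or the PR's `MakeWeightColumn`, and only mirror the shape of the condition in the diff.

```csharp
// Hypothetical stand-in for the Optional<string> argument in the diff.
// It is NOT the ML.NET type; it only carries a value and an "explicit" flag.
public sealed class OptionalName
{
    public string Value { get; }
    public bool IsExplicit { get; }

    private OptionalName(string value, bool isExplicit)
    {
        Value = value;
        IsExplicit = isExplicit;
    }

    // The user typed the column name themselves.
    public static OptionalName Explicit(string value)
    {
        return new OptionalName(value, true);
    }

    // The name is only a default that the user never confirmed.
    public static OptionalName Implicit(string value)
    {
        return new OptionalName(value, false);
    }
}

public static class WeightColumnSketch
{
    // Mirrors the condition under review: returns the weight column name
    // to declare, or null when no weight column should be declared.
    public static string ResolveWeightColumn(OptionalName weightColumn)
    {
        if (weightColumn == null || !weightColumn.IsExplicit)
            return null;
        return weightColumn.Value;
    }
}
```

Under this check, `ResolveWeightColumn(OptionalName.Implicit("Weight"))` returns null regardless of what the data contains, which is the case the comment flags: the decision is made before the data can be inspected.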

Zruty0 · 2018-09-13T01:31:34Z

test/Microsoft.ML.Tests/Scenarios/Api/Estimators/SimpleTrainAndPredict.cs

+        /// (e.g., the prediction does not happen over a file as it did during training).
+        /// </summary>
+        [Fact]
+        public void New_SimpleTrainAndPredictWithFT()


New_SimpleTrainAndPredictWithFT [](start = 20, length = 31)

move this test somewhere else #Resolved

Zruty0 · 2018-09-13T01:32:37Z

src/Microsoft.ML.FastTree/FastTreeRanking.cs

@@ -15,6 +15,7 @@
 using Microsoft.ML.Runtime.Internal.Utilities;
 using Microsoft.ML.Runtime.Model;
 using Microsoft.ML.Runtime.Internal.Internallearn;
+using Microsoft.ML.Core.Data;


using [](start = 0, length = 5)

sort #Resolved

sfilipi · 2018-09-13T19:56:02Z

src/Microsoft.ML.FastTree/FastTreeRanking.cs

+            {
+                new SchemaShape.Column(DefaultColumnNames.Score, SchemaShape.Column.VectorKind.Scalar, NumberType.R4, false),
+                new SchemaShape.Column(DefaultColumnNames.Probability, SchemaShape.Column.VectorKind.Scalar, NumberType.R4, false),
+                new SchemaShape.Column(DefaultColumnNames.PredictedLabel, SchemaShape.Column.VectorKind.Scalar, BoolType.Instance, false)


double-check this is correct

sfilipi · 2018-09-14T23:20:38Z

test/Microsoft.ML.Tests/TrainerEstimators/TreeEstimators.cs

+        /// FastTreeBinaryClassification TrainerEstimator test 
+        /// </summary>
+        [Fact]
+        public void FastTreeRankerEstimator()


public void FastTreeRankerEstimator() [](start = 7, length = 38)

this is currently failing. #Resolved

TomFinley · 2018-09-17T20:25:25Z

src/Microsoft.ML.Data/Scorers/PredictionTransformer.cs

@@ -301,6 +301,52 @@ private static VersionInfo GetVersionInfo()
        }
    }

+    public sealed class RankingPredictionTransformer<TModel> : PredictionTransformerBase<TModel>


RankingPredictionTransformer [](start = 24, length = 28)

Is the reason why we have two types that are identical in practically everything but name, so we can identify ranking estimators vs. regression estimators in a statically typed way?

I think this transformer should also expose the group ID column name, at least that would be my belief

In reply to: 218214277 [](ancestors = 218214277)

Actually, I thought about this: like labels, group IDs are only needed for training, right? So for prediction I don't think they should be exposed.

In reply to: 218216192 [](ancestors = 218216192,218214277)

So keep it, or make the Regression one Generic and use it for both?

In reply to: 218216839 [](ancestors = 218216839,218216192,218214277)
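
A rough sketch of the "statically typed" argument raised in this thread: two transformer types that differ only by name still let the compiler choose different code paths. All type names below are hypothetical simplifications made up for this illustration, not the ML.NET classes.

```csharp
// Hypothetical, simplified types; none of these are the ML.NET classes.
public abstract class PredictionTransformerSketch
{
    public string FeatureColumn { get; }

    protected PredictionTransformerSketch(string featureColumn)
    {
        FeatureColumn = featureColumn;
    }
}

// Identical in everything but name to the ranking variant below.
public sealed class RegressionTransformerSketch : PredictionTransformerSketch
{
    public RegressionTransformerSketch(string featureColumn) : base(featureColumn) { }
}

public sealed class RankingTransformerSketch : PredictionTransformerSketch
{
    public RankingTransformerSketch(string featureColumn) : base(featureColumn) { }
}

public static class EvaluatorSketch
{
    // Overload resolution picks the method at compile time from the
    // static type of the argument, so the two tasks cannot be mixed up.
    public static string Describe(RegressionTransformerSketch t)
    {
        return "regression metrics over " + t.FeatureColumn;
    }

    public static string Describe(RankingTransformerSketch t)
    {
        return "ranking metrics over " + t.FeatureColumn;
    }
}
```

With a single shared generic type, both calls would resolve to the same overload, and the task would have to be tracked some other way, for example at run time.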

TomFinley · 2018-09-17T20:28:22Z

src/Microsoft.ML.Ensemble/OutputCombiners/MultiStacking.cs

@@ -57,7 +57,7 @@ public Arguments()
                    env => new Ova(env, new Ova.Arguments()
                    {
                        PredictorType = ComponentFactoryUtils.CreateFromFunction(
-                             e => new AveragedPerceptronTrainer(e, new AveragedPerceptronTrainer.Arguments()))
+                            e => new FastTreeBinaryClassificationTrainer(e, DefaultColumnNames.Label, DefaultColumnNames.Features))


FastTreeBinaryClassificationTrainer [](start = 37, length = 35)

I'd really rather we didn't. This seems to fit into the same bucket as the discussion on #682. That ensembling should have a dependency on FastTree merely because we have a default does not make sense to me. If someone wants to use stacking, that's great, but they need to specify the learners. #Pending

But maybe we can hold off for right now.

In reply to: 218215145 [](ancestors = 218215145)

Yes, let's do that separately, when we shape the ensembles to take in the arguments in the constructor.

In reply to: 218215323 [](ancestors = 218215323,218215145)

TomFinley · 2018-09-17T20:31:16Z

src/Microsoft.ML.FastTree/FastTree.cs

@@ -25,6 +25,8 @@
 using Microsoft.ML.Runtime.Training;
 using Microsoft.ML.Runtime.TreePredictor;
 using Newtonsoft.Json.Linq;
+using Microsoft.ML.Core.Data;
+using Microsoft.ML.Runtime.EntryPoints;


I'm probably just missing something obvious, but why does this now depend on entry-points namespace?

Also sorting. #Resolved

Thank you! Oversight

In reply to: 218216150 [](ancestors = 218216150)

TomFinley · 2018-09-17T20:36:04Z

Is omission of Pigsty extensions deliberate?

sfilipi · 2018-09-17T21:18:02Z

Did I misunderstand that for trainers we should hold off on the Pigsty extensions until we get the ML task, so we could extend on that rather than the label? @[email protected] @Zruty0, let me know if I should actually work on them in the same PR.

In reply to: 422161730 [](ancestors = 422161730)

Zruty0 · 2018-09-17T21:28:04Z

I have the same (mis)understanding. In any case, let's do it after this one.

In reply to: 422174998 [](ancestors = 422174998,422161730)

sfilipi added 2 commits September 7, 2018 11:46

moving FastTree deriving classes to TrainerEstimatorBase

cbd84ab

fixing the RankingScorer

0f68992

sfilipi requested review from TomFinley, Zruty0 and Ivanidzo4ka September 7, 2018 21:25

sfilipi self-assigned this Sep 7, 2018

sfilipi added the API Issues pertaining the friendly API label Sep 7, 2018

sfilipi added this to the 0918 milestone Sep 7, 2018

sfilipi changed the title ~~WIP: Fast tree estimators~~ WIP: Tree estimators Sep 7, 2018

Zruty0 mentioned this pull request Sep 7, 2018

New API for ML.NET #754

Closed

sfilipi commented Sep 7, 2018

Adding one test, defining the output columns.

c76285c

Changing the behavior for the creation of the weight column, based on whether it is explicit, or implicit.

sfilipi commented Sep 8, 2018

sfilipi added 2 commits September 10, 2018 10:17

Merge branch 'master' into fastTreeEstimators

13a187e

Post merge fixes

8e76ef1

Zruty0 reviewed Sep 13, 2018

Merge branch 'master' into fastTreeEstimators

e5f8925

sfilipi commented Sep 13, 2018

sfilipi added 5 commits September 13, 2018 18:29

arguments applied via the delegate

f2410a6

adding test

Updated the test to use TestEstimatorCore, and fixed the null pointer…

574b9d2

… on the MakeGroupId

using the new constructors in the codebase.

4b3da66

Making use of dataset definitions adding Iris.data and the adult.tiny files to TestDatasets adding regression and ranking tests

merging from master

63e63d5

adding the metadata

0cccda5

sfilipi commented Sep 14, 2018

TomFinley reviewed Sep 17, 2018

TomFinley approved these changes Sep 17, 2018

Zruty0 approved these changes Sep 17, 2018

tweaking the test

5073a90

sfilipi changed the title ~~WIP: Tree estimators~~ Tree estimators Sep 18, 2018

sfilipi added 7 commits September 17, 2018 17:59

merging from master

edc39d5

resolving merge conflicts, and disabling the ranker test to check on …

66eaf76

…the other tests.

Fixing the signature on the RankerPredictor

2d8b525

Fixing the other two tests

Fixing regressions and tests

df5449a

merging from master

e770ddc

switching dataset

e8fb048

post merge fixes

7ce1bd3

sfilipi merged commit d13b415 into dotnet:master Sep 19, 2018

sfilipi mentioned this pull request Sep 21, 2018

Test changed incorrectly #983

Closed

sfilipi deleted the fastTreeEstimators branch October 22, 2018 16:57

ghost locked as resolved and limited conversation to collaborators Mar 29, 2022
