-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Adding the extension methods for FastTree classification and regression #1009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Contracts.CheckParam(learningRates > 0, nameof(learningRates), "Must be positive"); | ||
Contracts.CheckValueOrNull(advancedSettings); | ||
Contracts.CheckValueOrNull(onFit); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't duplicate #Resolved
TrainerUtils.CheckArgsAndAdvancedSettingMismatch(numTrees, snapshot.NumTrees, Args.NumTrees, nameof(numTrees)); | ||
TrainerUtils.CheckArgsAndAdvancedSettingMismatch(minDocumentsInLeafs, snapshot.MinDocumentsInLeafs, Args.MinDocumentsInLeafs, nameof(minDocumentsInLeafs)); | ||
TrainerUtils.CheckArgsAndAdvancedSettingMismatch(learningRates, snapshot.LearningRates, Args.LearningRates, nameof(learningRates)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move in FastTreeUtils and re-use. #Resolved
// if, after applying the advancedArgs delegate, the args are different that the default value | ||
// and are also different than the value supplied directly to the xtension method, warn the user. | ||
if (!setting.Equals(defaultVal) && !setting.Equals(methodParam)) | ||
Console.WriteLine($"The value supplied to advanced settings , is different than the value supplied directly. Using value {setting} for {argName}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Console.WriteLine [](start = 15, length = 18)
I don't like this. Can we accept Host or Channel as first object and use it? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for sure; put it to look for a Warn method, or pass the channel, and forgot to replace.
In reply to: 220018209 [](ancestors = 220018209)
public static void CheckArgumesDefaultColNames(string defaultColName, string argValue) | ||
{ | ||
if (argValue != defaultColName) | ||
throw Contracts.Except($"Don't supply a value for the {defaultColName} column in the arguments, as it will be ignored. Specify them in the loader, or constructor instead instead."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everywhere you call this you have host or environment. Why you don't want to use it? #Closed
string labelColumn, | ||
string featureColumn, | ||
string weightColumn = null, | ||
int numLeaves = 20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
20 [](start = 28, length = 2)
Should it be part of static Defaults class? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have Default class, but you don't use it here.
In reply to: 220063796 [](ancestors = 220063796,220018807)
{ | ||
public static Scalar<float> FastTree(this RegressionContext.RegressionTrainers ctx, | ||
Scalar<float> label, Vector<float> features, Scalar<float> weights = null, | ||
int numLeaves = 20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
20 [](start = 28, length = 2)
Defaults? #Closed
Contracts.CheckParam(!(l1Threshold < 0), nameof(l1Threshold), "Must not be negative"); | ||
Contracts.CheckParam(!(maxIterations < 1), nameof(maxIterations), "Must be positive if specified"); | ||
Contracts.CheckParam(l2Const >= 0, nameof(l2Const), "Must not be negative"); | ||
Contracts.CheckParam(l1Threshold >= 0, nameof(l1Threshold), "Must not be negative"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l1Threshold >= 0 [](start = 33, length = 16)
How it works if l1Threshold is null? #Closed
@@ -99,6 +99,13 @@ public abstract class FastTreeTrainerBase<TArgs, TTransformer, TModel> : | |||
|
|||
//apply the advanced args, if the user supplied any | |||
advancedSettings?.Invoke(Args); | |||
|
|||
// check that the users didn't specify different label, group, feature, weights in the args, from what they supplied directly | |||
TrainerUtils.CheckArgumesDefaultColNames(DefaultColumnNames.Label, Args.LabelColumn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argumes [](start = 30, length = 7)
typo #Resolved
int numLeaves = 20, | ||
int numTrees = 100, | ||
int minDocumentsInLeafs = 10, | ||
double learningRates = 0.2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
learningRates [](start = 19, length = 13)
just one learning rate will suffice #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i bet it is enough, but tried keeping them to the same names the args use :)
In reply to: 220029597 [](ancestors = 220029597)
var snapshot = new Arguments(); | ||
|
||
// Check that the user didn't supply different parameters in the args, from what it specified directly. | ||
TrainerUtils.CheckArgsAndAdvancedSettingMismatch(numLeaves, snapshot.NumLeaves, Args.NumLeaves, nameof(numLeaves)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CheckArgsAndAdvancedSettingMismatch [](start = 29, length = 35)
actually this is not necessary. If they are so cool to set the values in 2 places, that's fine.
I wanted it to be checked on columns, since these are ignored. The other args, you set them first, then you invoke delegate and you're good. #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, this is trying to catch the case where they set one value in one place, and another one in the other. No exceptions, just Warn them about which value will be used.
In reply to: 220029873 [](ancestors = 220029873)
int minDocumentsInLeafs = 10, | ||
double learningRates = 0.2, | ||
Action<FastTreeBinaryClassificationTrainer.Arguments> advancedSettings = null, | ||
Action<IPredictorWithFeatureWeights<float>> onFit = null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IPredictorWithFeatureWeights [](start = 19, length = 28)
that's the best we could do here? I would argue that we should expose some form of tree ensemble predictor or something #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it for now and log an issue about it. Just casting didn't work.
In reply to: 220030085 [](ancestors = 220030085)
@@ -59,10 +59,10 @@ public static class FactorizationMachineStatic | |||
var trainer = new FieldAwareFactorizationMachineTrainer(env, labelCol, featureCols, advancedSettings: | |||
args => | |||
{ | |||
advancedSettings?.Invoke(args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
advancedSettings [](start = 24, length = 16)
my order was intentional :) #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assigning defaults to a few SDCA parameters.
int numLeaves = Defaults.NumLeaves, | ||
int numTrees = Defaults.NumTrees, | ||
int minDocumentsInLeafs = Defaults.MinDocumentsInLeafs, | ||
double learningRate = Defaults.LearningRates, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixxxx #Resolved
{ | ||
public static class FastTreeStatic | ||
{ | ||
public static Scalar<float> FastTree(this RegressionContext.RegressionTrainers ctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's my time to comment what you need comments for public method! Oh, the joy of revenge!111 #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did them, before you commented!!!! see the next push :)
In reply to: 220283064 [](ancestors = 220283064)
@@ -7,13 +7,14 @@ | |||
<ProjectReference Include="..\..\src\Microsoft.ML.ImageAnalytics\Microsoft.ML.ImageAnalytics.csproj" /> | |||
<ProjectReference Include="..\..\src\Microsoft.ML.StandardLearners\Microsoft.ML.StandardLearners.csproj" /> | |||
<ProjectReference Include="..\Microsoft.ML.TestFramework\Microsoft.ML.TestFramework.csproj" /> | |||
<NativeAssemblyReference Include="FactorizationMachineNative" /> | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EXTRA SPACES! #Closed
<ProjectReference Include="..\..\src\Microsoft.ML.Analyzer\Microsoft.ML.Analyzer.csproj"> | ||
<ReferenceOutputAssembly>false</ReferenceOutputAssembly> | ||
<OutputItemType>Analyzer</OutputItemType> | ||
</ProjectReference> | ||
|
||
<NativeAssemblyReference Include="CpuMathNative" /> | ||
<NativeAssemblyReference Include="FactorizationMachineNative" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see tests for FastTree added, but nothing regarding FM #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was already there. I just moved the reference a few lines down, to keep it together with the other native references.
In reply to: 220283652 [](ancestors = 220283652)
Removing defaults from arguments of helper methods.
More common code set apart.
<NativeAssemblyReference Include="LdaNative" /> | ||
<NativeAssemblyReference Include="FactorizationMachineNative" /> | ||
<NativeAssemblyReference Include="FastTreeNative" /> | ||
<NativeAssemblyReference Include="LdaNative" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[](start = 0, length = 1)
tab #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tools->Options->Text Editor->All Languages->Insert Spaces
Checked!
In reply to: 220301837 [](ancestors = 220301837)
<ProjectReference Include="..\Microsoft.ML.TestFramework\Microsoft.ML.TestFramework.csproj" /> | ||
<NativeAssemblyReference Include="FactorizationMachineNative" /> | ||
|
||
<ProjectReference Include="..\Microsoft.ML.TestFramework\Microsoft.ML.TestFramework.csproj" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[](start = 98, length = 4)
is...
this...
a...
EXTRA SPACES? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…ion inside the advancedSettings, since they take precendence.
OnGoing work to address #754.
Added the extensions to the Binary and Regression context for the respective FastTree trainers.
Ranking coming on a separate PR, since it needs a bit more work with the context and elevator changes.