-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Replace SubComponent with IComponentFactory in ML.Ensemble #681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
77fff03
84119d3
7c06f93
d41dbe6
6f852cb
2beb2ce
4923971
78a4178
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,6 +7,8 @@ | |
using Microsoft.ML.Runtime.Data; | ||
using Microsoft.ML.Runtime.Ensemble.OutputCombiners; | ||
using Microsoft.ML.Runtime.EntryPoints; | ||
using Microsoft.ML.Runtime.FastTree; | ||
using Microsoft.ML.Runtime.Internal.Internallearn; | ||
using Microsoft.ML.Runtime.Model; | ||
|
||
[assembly: LoadableClass(typeof(RegressionStacking), typeof(RegressionStacking.Arguments), typeof(SignatureCombiner), | ||
|
@@ -19,7 +21,7 @@ namespace Microsoft.ML.Runtime.Ensemble.OutputCombiners | |
{ | ||
using TScalarPredictor = IPredictorProducing<Single>; | ||
|
||
public sealed class RegressionStacking : BaseScalarStacking<SignatureRegressorTrainer>, IRegressionOutputCombiner, ICanSaveModel | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
What are we going to do about this sort of restriction? I had suggested previously using something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Which restriction are you referring to? The Signature Type restriction, where we can't use My hope is that some day we remove signature types all together. I haven't thought of/discovered a good replacement for them yet. Maybe an approach similar to ComponentKind which is just a string "Kind". However, that still wouldn't help in this case, because we need a different signature/kind/etc for each concrete class, but we want to define the argument on the base class. #Resolved There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I meant just restricting the DI and API usage to only consider actual, in this case, regression trainers, which is one of the functions of the signatures. Currently this is done in a fairly half-hearted fashion using this In reply to: 210463209 [](ancestors = 210463209) |
||
public sealed class RegressionStacking : BaseScalarStacking, IRegressionOutputCombiner, ICanSaveModel | ||
{ | ||
public const string LoadName = "RegressionStacking"; | ||
public const string LoaderSignature = "RegressionStacking"; | ||
|
@@ -37,9 +39,17 @@ private static VersionInfo GetVersionInfo() | |
[TlcModule.Component(Name = LoadName, FriendlyName = Stacking.UserName)] | ||
public sealed class Arguments : ArgumentsBase, ISupportRegressionOutputCombinerFactory | ||
{ | ||
[Argument(ArgumentType.Multiple, HelpText = "Base predictor for meta learning", ShortName = "bp", SortOrder = 50, | ||
Visibility = ArgumentAttribute.VisibilityType.CmdLineOnly, SignatureType = typeof(SignatureRegressorTrainer))] | ||
[TGUI(Label = "Base predictor")] | ||
public IComponentFactory<ITrainer<TScalarPredictor>> BasePredictorType; | ||
|
||
internal override IComponentFactory<ITrainer<TScalarPredictor>> GetPredictorFactory() => BasePredictorType; | ||
|
||
public Arguments() | ||
{ | ||
BasePredictorType = new SubComponent<ITrainer<TScalarPredictor>, SignatureRegressorTrainer>("FastTreeRegression"); | ||
BasePredictorType = ComponentFactoryUtils.CreateFromFunction( | ||
env => new FastTreeRegressionTrainer(env, new FastTreeRegressionTrainer.Arguments())); | ||
} | ||
|
||
public IRegressionOutputCombiner CreateComponent(IHostEnvironment env) => new RegressionStacking(env, this); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this due to the usage of FastTree as the default traniner for some ensemble classes. This could be fine, but I wonder if using something from
StandardLearners
might make a bit more sense, if we envision that someday FastTree will be factored into a separate nuget package. (Maybe we don't want that though, in which case this dependency is probably fine.) #ResolvedThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One advantage to this change is that it brings to light these dependencies. Before we had a weakly-typed dependency using strings and DI. So if we ever did refactor FastTree to be separate, we could have easily broken this dependency.
I've logged #682 for this issue. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @eerhardt ... yes I expect we'll see some more of this "hidden dependency" becoming exposed as we move away from direct usage of subcomponents and dependency injection. I recall a similar issue in #446 with the normalizers being defined in
Microsoft.ML.Transforms
and used inMicrosoft.ML.Data
, despite the former project depending on the latter. 😄 #Resolved