Direct API discussion: ITrainer proposed changes #509
In the current design, I have an easy way to detect whether an algorithm supports validation or not, just by looking at its class interfaces. Should we preserve that and add …
Hi @Ivanidzo4ka, yes, I'd think such a mechanism would be necessary. Maybe something akin to …
I'd be in favor of the …

Just throwing another possibility out for discussion: …

A separate design question on this hierarchy: why do we have the need for …
Thanks @TomFinley for the proposal. These are reasonable changes that will make the API more convenient for end users. I am fine with both options. However, the following option seems more convenient, as one knows at a glance what this method expects and what is expected from it. This won't be clear in the other option unless one inspects …

public interface ITrainer<TDataset, TPredictor>
{
    TPredictor Train(TDataset train, TDataset validation = null, TDataset testSet = null, IPredictor initPredictor = null);
}

I suggest that we can follow … One thing I see in both options is that some parameters won't make any sense for some trainers; e.g., not all trainers support validation during training. Even if the user passes the …
Keep the virtual contract as simple as possible; in that sense the context solution is better. Extension methods on the interface could add convenience without changing the virtual contract, e.g.:

static class TrainerExtensions
{
    public static TPredictor Train<TDataset, TPredictor>(this ITrainer<TDataset, TPredictor> trainer, TDataset train, ...)
    {
        return trainer.Train(new TrainContext()
        {
            Train = train,
            ...
        });
    }
}

Alternatively, as @eerhardt mentioned, an abstract base class can have a single virtual method, with concrete implementations of the convenience methods. That's how we would typically do this in the framework (assuming "Trainer" is the primary behavior of the object and you don't need the multiple-inheritance characteristics you'd get from an interface).
Hmmm, OK. An extension method, perhaps? (Ah, that was funny: as I finished writing this sentence, I saw that @ericstj had just replied with the idea of using extension methods. 😄)
Indeed. I think though that @Ivanidzo4ka's suggestion of …

@ericstj yes, something like that. In this case, however, …
Oh, I somehow missed that @eerhardt had replied before @zeahmed did. Hi Eric! 👋 The use of an actual abstract base class may be possible. We of course were using multiple inheritance in this very thing (since an …

Edit: That would also resolve the strange thing with …
I considered proposing this as well. It may be a good idea... the only time that generic interface has been useful is when we've changed it; but when we've changed it, we've changed it everywhere, e.g., …
Just to summarize, this is what I've gathered from @eerhardt, @ericstj, @zeahmed, and @Ivanidzo4ka into a more or less basic plan of record (POR).
TIL that you can't have covariant type parameters on classes, only on interfaces and delegates. 😦 Hey @eerhardt, I guess I never actually tried before, otherwise I might have known, but covariant type parameters can't be put on classes... you can do …

So while a … That's really unfortunate. I was rather looking forward to nuking …

Edit: Maybe we can have a covariant interface implemented by …

Edit edit: No, that's a silly idea. There should be exactly one abstraction. If adopting a second abstraction is necessary to overcome the limitations of the first, then we should not have adopted the first abstraction.
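For context, the C# restriction being described can be shown with a minimal sketch; the type names here are illustrative, not the actual ML.NET API:

```csharp
public class TrainContext { }

// 'out' (covariance) is permitted on interfaces and delegates:
public interface ITrainer<out TPredictor>
{
    TPredictor Train(TrainContext context);
}

public delegate TPredictor TrainFunc<out TPredictor>(TrainContext context);

// ...but not on classes. The following fails to compile with
// error CS1960 ("Invalid variance modifier. Only interface and
// delegate type parameters can be specified as variant."):
//
// public abstract class Trainer<out TPredictor> { }
```

This is why an abstract base class cannot simply replace a covariant `ITrainer<out TPredictor>` interface without losing the assignability of, say, a linear trainer to a variable typed over a more general predictor.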
I agree this is unfortunate. Maybe we could re-evaluate whether we need the …

Instead, we could have a design that goes something like this (obviously I'm skipping a bunch of details, just giving the overall approach):

public abstract class Trainer
{
    public Predictor Train(TrainContext context)
    {
        return this.TrainPredictor(context);
    }

    protected abstract Predictor TrainPredictor(TrainContext context);
}

public class LinearTrainer : Trainer
{
    public new LinearPredictor Train(TrainContext context)
    {
        // real logic happens
        return new LinearPredictor();
    }

    protected override Predictor TrainPredictor(TrainContext context)
    {
        return this.Train(context);
    }
}

public class TrainContext {}
public class Predictor {}
public class LinearPredictor : Predictor {}

The idea being: if I have a … The advantage of this approach is that consumers don't need to deal with a generic type at all. If they want to write general logic that can take any Trainer/Predictor pair, they code at the base-class level. If they know the specific type they are working with, they can cast down to that concrete type and get the extra methods/properties/etc. The way we have it today, with …
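A brief usage sketch of the method-hiding pattern above, assuming the types as declared, may make the static-typing behavior concrete:

```csharp
// Through the base-class view, Train returns the base Predictor type:
Trainer general = new LinearTrainer();
Predictor p = general.Train(new TrainContext());

// Through the concrete view, the hiding 'new' overload returns the
// more specific LinearPredictor, with no cast required:
LinearTrainer specific = new LinearTrainer();
LinearPredictor lp = specific.Train(new TrainContext());
```

The choice of overload is resolved statically from the variable's declared type, which is exactly the behavior the proposal relies on.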
Yes, I see. Allow me to share some feedback on that proposal, based on how such trainers are actually used. (Just a warning: I am going to disagree with you, but I hope it is evident that I respect your position; I consider your positions very well thought out, and my disagreement was not easy. I really did genuinely want there to be a solution based on an abstract base class, and it is only after much thought that I have come to consider that arrangement impossible and undesirable.)

You have posited the existence of a hierarchy of trainers... Happily, that multiple-inheritance problem is already covered on the …

This in turn leads me to, perhaps, not fully agree with the point about a user somehow not being "helped" by generics (your point in the paragraph beginning "The advantage of this approach is that consumers don't need to deal with a generic type at all"), since the alternative being envisioned is an inheritance hierarchy on trainers that mirrors, without being in any way syntactically bound to, the inheritance structure found on predictors. I can't say I like that solution very much, not only because of its intrinsic complexity, but also because the problem it invites, semantically parallel but syntactically unrelated inheritance structures, is precisely the principal problem generics were invented to solve. In the .NET Framework we don't have …

This is probably enough. However, there is also dependency injection to consider, which touches upon some further-out plans I have. (As you might be able to tell from my issues and PRs, this issue is just one part of a longer-term plan in the service of #371.) Being able to say "tell me all trainers that produce this type of predictor" is something used a lot. We currently do this via a scheme called "signatures" (e.g., here), but this is IMO a poor solution.

I have a secondary plan to replace this idiom, as part of a broader plan to do away with ML.NET's custom-coded dependency-injection framework. But I'm going to have to replace it with something, and unfortunately all possible candidate somethings require establishing an …

Architectural considerations addressed, let's move on to more casual, user-facing considerations:

I don't object per se, but for these people, happily, given that we are now not going to make the dataset type generic and instead just say …

The reason I don't object, however, is purely on practical grounds, since it is a moot point. The hypothetical population of people you're talking about that would care about this issue are those sophisticated enough to realize that there is such a thing as a top-level …
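The "tell me all trainers that produce this type of predictor" query mentioned above is what a covariant generic interface buys: it becomes an ordinary runtime type test rather than a parallel "signature" mechanism. A hedged sketch, with illustrative type names that are not the actual ML.NET API:

```csharp
using System.Collections.Generic;
using System.Linq;

public class TrainContext { }
public interface IPredictor { }
public interface ILinearPredictor : IPredictor { }

// Covariant in TPredictor, so an ITrainer<ILinearPredictor> is
// assignable to ITrainer<IPredictor>.
public interface ITrainer<out TPredictor> where TPredictor : IPredictor
{
    TPredictor Train(TrainContext context);
}

public static class ComponentCatalog
{
    // Find every registered component that can produce a TPredictor,
    // via an ordinary covariant type test instead of "signatures".
    public static IEnumerable<ITrainer<TPredictor>> TrainersProducing<TPredictor>(
        IEnumerable<object> components) where TPredictor : IPredictor
        => components.OfType<ITrainer<TPredictor>>();
}
```

Because `OfType<T>` respects variance, a trainer declared to produce `ILinearPredictor` would be returned by `TrainersProducing<IPredictor>` as well.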
There are two changes proposed here for `ITrainer`. Due to the nature of the changes, assuming we agree they are good ideas, it really makes sense that they ought to happen in one go, since they involve changes to the same core functionality. I am quite certain the first thing is a good idea, but the second thing I am less certain about. (Of course my confidence could be misplaced. 😄) People that occur to me as potentially good contributors to the discussion: @eerhardt, @zeahmed, @shauheen, @ericstj, @glebuk. (To be clear, this is not exclusionary; anyone can and should feel free to comment. I just want these people to get flagged is all. 😄)

`ITrainer<TData, TPred>.Train` ought to return the predictor

Currently, training a predictor is a two-step process: you call `ITrainer.Train`, then call `ITrainer.GetPredictor`. As near as I can tell, this arrangement was meant as some scheme to support online training, but of course that vision never came to pass, and if we were to actually support online training, I have to imagine it would be through some separate interface anyway.

This arrangement seems unambiguously bad. It necessitates making `ITrainer` objects stateful for no particularly good reason. This complicates both the implementation and usage of the class, since (1) the caller can't do things like call `Train` multiple times, even though a developer seeing this interface might reasonably suppose such a thing were possible, and (2) the author of the `ITrainer` component has to protect against that misuse.

Get rid of `IValidatingTrainer`, `IIncrementalTrainer`, `IValidatingIncrementalTrainer`
The problem
First let's talk about the problem...
Most (all?) trainers implement `ITrainer<RoleMappedData>`, based on the interface here:

machinelearning/src/Microsoft.ML.Core/Prediction/ITrainer.cs
Line 105 in 828dc22

We have historically accommodated additional inputs to the training process by declaring specialized interfaces that represent the Cartesian product of all possible combinations of inputs, as we see:

- There was a discussion about adding a validation set, so we doubled the number of interfaces to two: `ITrainer` and `IValidatingTrainer`.
- Later on there was a discussion about adding continued training based on an initialization, so we again doubled the number of interfaces, introducing `IIncrementalTrainer` and `IValidatingIncrementalTrainer`.
- There have been discussions about adding a test set to allow computing metrics as training progresses. Following the same strategy, we would of course again double the number of interfaces (the full set being represented, perhaps, by `IValidatingIncrementalTestingTrainer`), for a total of eight.
- If hypothetically we were to somehow allow one more input beyond that, we'd have a total of sixteen interfaces.

Etc., etc. That there is this exponential cost makes clear that something is misdesigned. This has cost not only here and in the implementations of `ITrainer`, but in their usage as well: here we see a method that explores the Cartesian product of possible interfaces so it can call the right one. It seems to me something is wrong when just calling "train" requires a fairly non-obvious utility method to make sure we call the "right" `Train`.

This issue, incidentally, is the primary reason why we haven't done anything like add support for test-set metrics during training (despite the many requests). That is, it is not any technical difficulty with the idea itself; it's just that writing such a thing would make the code unmaintainable.
The possible solution(s)
So: instead we might just have one interface, with one required input (the training dataset), and all these other things are optional.
There are two obvious ways I could imagine doing this: first, explicitly, as part of the method signature on `ITrainer<...>`. Or else have some sort of context object. (I'm not married to any of these names, to be clear. 😄)
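Hedged sketches of the two shapes under discussion; the member and parameter names are illustrative, not a settled API:

```csharp
// Option 1: optional inputs directly in the Train signature.
public interface ITrainer<TDataset, TPredictor>
{
    TPredictor Train(
        TDataset train,
        TDataset validation = null,
        TDataset test = null,
        IPredictor initialPredictor = null);
}

// Option 2: a single context object carrying the one required input
// (the training set) plus all optional inputs.
public sealed class TrainContext
{
    public RoleMappedData TrainingSet { get; set; }     // required
    public RoleMappedData ValidationSet { get; set; }   // optional
    public RoleMappedData TestSet { get; set; }         // optional
    public IPredictor InitialPredictor { get; set; }    // optional
}
```

With option 2, adding a new optional input later means adding a property to the context object rather than changing every implementation's method signature.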
All trainers would then implement `ITrainer<TrainContext>` instead of `ITrainer<RoleMappedData>`.

The latter is perhaps a bit more awkward, since it involves the addition of a new abstraction (the hypothetical `TrainContext`), but it is more flexible in a forward-looking sense: if we add more "stuff" to how we initialize trainers, we won't break all existing `ITrainer` implementations. (My expectation is that trainers that can't support something would simply ignore it.)