-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Mismatch in Label expectations for Multiclass Learners #2628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I can be wrong, but it looks like our decision making regarding auto-working-magically workflow for API is quite opposite. We no longer have auto cache, no auto normalization. |
I have list of things I want to do to all learners #2613 if you during your discovery find something which doesn't fit into your list, can you extend it? |
SDCA in fact does not convert from float automatically. It was instead written at a time years ago when the only type of label supported was float. Now that we have a more specific type we should shift to using it. It does not convert at all, it just supports both types. If you prefer it to be consistent, it should be consistent in the other direction, that is, SDCA should stop doing its internal magic. That is, @Ivanidzo4ka's interpretation is correct. We do not do "auto-conversion" for anything in ML.NET because (1) conversion is the job of transforms and (2) adding the smarts "anywhere" requires that we add them "everywhere," which is a maintainability nightmare. (In fact, it has proven to be practically impossible.) We have an internal style guide that treats on this subject at some length. Well intentioned but utterly misguided people often think, "hey I know, I'll be super helpful and just support a bunch of types," but all it leads to is an inconsistent, broken feeling API. It is not precisely the same situation, since that deals with |
I like having consistency. Let's consider this a bug on SDCA for supporting |
Just to make it clear, are we fixing this now or post v1? |
We should fix now, as it breaks a key training scenario that any trainer for a task can substituted in a training pipeline. |
Correct, we must fix now. |
When training a multiclass classifier, the label must be converted to
KeyType
. Some learners, such asSDCA
, do the conversion fromR4
automatically, while some, likeLogisticRegression
do not. This behavior is confusing from an end-user perspective, as one cannot simply interchange learners without modifying the input pipeline.I suggest adding auto-label-conversion for all multiclass learners in ML.NET.
The text was updated successfully, but these errors were encountered: