Number of feature columns #2179
Comments
Currently, the public API of the field-aware factorization machine accepts a string[] as features (lines 27 to 33 in bafd40c).
However, the advanced arguments for FFM (aka […] This leads to an inconsistency in the public API of FieldAwareFactorizationMachine when we split the advanced arguments out into a separate API (related to the work we are doing in #1798).
If one algorithm (say, field-aware factorization machine) can accept multiple feature columns, and other algorithms (say, SDCA) can only accept a single feature column, I don't see a reason why the APIs across the two need to be consistent. Why limit FFM to only one column when it can support many?
I'd expect consistency between the "basic" and "advanced" arguments of a given trainer. Different functionality across trainers requires different types for features. Having string[] featureColumns in both the method and the arguments class seems reasonable.
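The consistency being asked for here can be sketched in a language-neutral way. The following is only an illustration, written in Python for brevity: `FfmOptions`, `ffm_basic`, `ffm_advanced`, and `latent_dim` are hypothetical names, not ML.NET's actual API. The point is just that both entry points expose the feature columns as the same list-of-names type.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FfmOptions:
    """Hypothetical 'advanced' arguments object (not ML.NET's real class)."""
    feature_columns: List[str] = field(default_factory=lambda: ["Features"])
    label_column: str = "Label"
    latent_dim: int = 20  # illustrative hyperparameter, assumed for the sketch


def ffm_advanced(options: FfmOptions) -> FfmOptions:
    """Hypothetical 'advanced' entry point: validates and returns the options."""
    if not options.feature_columns:
        raise ValueError("at least one feature column is required")
    return options


def ffm_basic(feature_columns: List[str], label_column: str = "Label") -> FfmOptions:
    """Hypothetical 'basic' entry point: same list type, forwarded to the advanced one."""
    return ffm_advanced(
        FfmOptions(feature_columns=list(feature_columns), label_column=label_column)
    )
```

Under these assumptions, `ffm_basic(["UserFeatures", "ItemFeatures"])` and `ffm_advanced(FfmOptions(feature_columns=["UserFeatures", "ItemFeatures"]))` describe the same configuration, so a caller can move from one to the other without changing how the feature columns are expressed.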
A single-feature-column assumption doesn't sound great to me. Searching for references on FFM turns up some advances that use multiple feature columns. Just for reference, LIBFFM has more than 1k stars even though it implements only a single algorithm, for training binary classification FFM.
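To make concrete why FFM naturally wants more than one feature column, here is a minimal sketch of the standard pairwise FFM interaction term, assuming each feature column is treated as one field. It is not ML.NET's or LIBFFM's implementation (bias and linear terms, if any, are left out), and the function and parameter names are made up for illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def ffm_interaction_score(
    values: List[float],
    fields: List[int],
    latent: Dict[Tuple[int, int], List[float]],
) -> float:
    """Sum over feature pairs i < j of <v[i, field(j)], v[j, field(i)]> * x_i * x_j.

    values[i]      -- value of feature i
    fields[i]      -- field (assumed here: the feature column) that feature i belongs to
    latent[(i, f)] -- latent vector feature i uses when interacting with field f
    """
    score = 0.0
    for i, j in combinations(range(len(values)), 2):
        v_i = latent[(i, fields[j])]  # feature i, as seen by j's field
        v_j = latent[(j, fields[i])]  # feature j, as seen by i's field
        dot = sum(a * b for a, b in zip(v_i, v_j))
        score += dot * values[i] * values[j]
    return score
```

Because the latent vector is chosen per (feature, field) pair, the field assignment carries information that would be lost if all columns were first concatenated into a single feature vector.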
Closing out this issue. PR #2205 fixed it.
For a while, ML.NET has assumed that only one feature column can exist in a training pipeline. We recently added the field-aware factorization machine, so that assumption is no longer 100% correct. We will have only two public APIs per trainer (please see #2047 for an example). To make our public APIs consistent, we need to determine whether the feature column name should be an array or a scalar. Alternatively, we can introduce another API that accepts multiple feature (or even label) columns. @TomFinley, @eerhardt, any comments please?