Skip to content

Multiple feature columns in FFM #2205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 23, 2019
Merged

Multiple feature columns in FFM #2205

merged 5 commits into from
Jan 23, 2019

Conversation

wschin
Copy link
Member

@wschin wschin commented Jan 22, 2019

This PR somehow provides a solution to #2179 regarding FFM via allowing multiple feature column names in advanced trainer arguments.

Strategy: adding one extra field to Arguments of FFM; that filed is

            /// <summary>
            /// Extra feature column names. The column named <see cref="LearnerInputBase.FeatureColumn"/> stores features from the first field.
            /// The i-th string in <see cref="ExtraFeatureColumns"/> stores the name of the (i+1)-th field's feature column.
            /// </summary>
            [Argument(ArgumentType.Multiple, HelpText = "Extra columns to use for feature vectors. The i-th specified string denotes the column containing features form the (i+1)-th field." +
                " Note that the first field is specified by \"feat\" instead of \"exfeat\".",
                ShortName = "exfeat", SortOrder = 7)]
            public string[] ExtraFeatureColumns;

@wschin wschin self-assigned this Jan 22, 2019
@wschin wschin requested review from abgoswam and sfilipi January 22, 2019 22:48
Field1 = new float[_simpleBinaryClassSampleFeatureLength],
Field2 = new float[_simpleBinaryClassSampleFeatureLength] };
// Fill feature vector according the assigned label.
for (int j = 0; j < 10; ++j)
Copy link
Member

@abgoswam abgoswam Jan 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10 [](start = 36, length = 2)

_simpleBinaryClassSampleFeatureLength ? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. Fixed.


In reply to: 249993941 [](ancestors = 249993941)

{
public bool Label;

[VectorType(_simpleBinaryClassSampleFeatureLength)]
Copy link
Member

@abgoswam abgoswam Jan 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VectorType [](start = 13, length = 10)

am curious - is this attribute required ? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is required.


In reply to: 249994977 [](ancestors = 249994977)

abgoswam
abgoswam previously approved these changes Jan 22, 2019
Copy link
Member

@abgoswam abgoswam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@abgoswam abgoswam dismissed their stale review January 23, 2019 18:01

revoking review


// Customized the field names.
ffmArgs.FeatureColumn = nameof(DatasetUtils.FfmExample.Field0); // First field.
ffmArgs.ExtraFeatureColumns = new[]{ nameof(DatasetUtils.FfmExample.Field1), nameof(DatasetUtils.FfmExample.Field2) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExtraFeatureColumns [](start = 20, length = 19)

this looks slightly odd . isn't it ?

am curious -- why move away from the convention used in iteration #2, where we were re-defining FeatureColumn as a string[]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Framework of generating entry points generates the hidden field (old feature column name) and therefore we have two fields with the same name and an error.


In reply to: 250309618 [](ancestors = 250309618)

@abgoswam
Copy link
Member

    public readonly SchemaShape.Column[] FeatureColumns;

internal.

We do not want this exposed especially since it holds content from Options.FeatureColumn and Options.ExtraFeatureColumns


Refers to: src/Microsoft.ML.StandardLearners/FactorizationMachine/FactorizationMachineTrainer.cs:93 in e454a19. [](commit_id = e454a19, deletion_comment = False)

Copy link
Member

@abgoswam abgoswam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

FeatureColumns[0] = new SchemaShape.Column(args.FeatureColumn, SchemaShape.Column.VectorKind.Vector, NumberType.R4, false);

// Add 2nd, 3rd, and other fields from a FFM-specific argument, args.ExtraFeatureColumns.
for (int i = 0; args.ExtraFeatureColumns != null && i < args.ExtraFeatureColumns.Length; i++)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (int i = 0; args.ExtraFeatureColumns != null && i < args.ExtraFeatureColumns.Length; i++) [](start = 12, length = 93)

could this be a simple foreach loop instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it can't. The array size is pre-defined and I prefer not to create an intermediate list just for calling Add.


In reply to: 250316802 [](ancestors = 250316802)

@wschin
Copy link
Member Author

wschin commented Jan 23, 2019

    public readonly SchemaShape.Column[] FeatureColumns;

It's not related to Options. It's a state of a trainer which can be specified by multiple (I guess two) ways.


In reply to: 456913047 [](ancestors = 456913047)


Refers to: src/Microsoft.ML.StandardLearners/FactorizationMachine/FactorizationMachineTrainer.cs:93 in e454a19. [](commit_id = e454a19, deletion_comment = False)

Copy link

@shmoradims shmoradims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@wschin wschin merged commit 053062b into dotnet:master Jan 23, 2019
@wschin wschin deleted the adv-ffm-args branch January 23, 2019 18:38
@ghost ghost locked as resolved and limited conversation to collaborators Mar 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants