Updated CopyColumns, DropColumns and SelectColumns samples. #3268
// Create a small dataset as an IEnumerable.
var samples = new List<InputData>()
{
    new InputData(){ ImageId = 1, Features = new [] { 1.0f, 1.0f, 1.0f} },
You don't seem to need Features.
{
    public float CustomValue { get; set; }
    public int ImageId { get; set; }
    public float[] Features { get; set; }
Features: Same here, I think you can drop it.
// Parameter name: Schema

// And we can write a few columns out to see that the rest of the data is still available.
var rowEnumerable = mlContext.Data.CreateEnumerable<SampleInfertDataTransformed>(transformedData, reuseRowObject: false);
var rowEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(transformedData, reuseRowObject: false);
Console.WriteLine($"The columns we didn't drop are still available.");
foreach (var row in rowEnumerable)
{
Please remove the braces; they are not needed for a one-line loop. #Resolved
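For readers without the full diff, here is a minimal sketch of what the DropColumns sample under review appears to demonstrate: drop one column, then enumerate the rest with a brace-less one-line loop as requested above. The class names, the sample values, and the choice of "Features" as the dropped column are assumptions made for illustration; they are not taken from the merged sample.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

public static class DropColumnsSketch
{
    // Hypothetical input type, loosely modeled on the snippet under review.
    private class InputData
    {
        public int ImageId { get; set; }
        public float[] Features { get; set; }
    }

    // Hypothetical output type: only the column that survives the drop.
    private class TransformedData
    {
        public int ImageId { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext();

        // Illustrative values only.
        var samples = new List<InputData>
        {
            new InputData { ImageId = 1, Features = new[] { 1.0f, 1.0f, 1.0f } },
            new InputData { ImageId = 2, Features = new[] { 0.0f, 0.0f, 0.0f } },
        };
        var dataview = mlContext.Data.LoadFromEnumerable(samples);

        // Remove the "Features" column; every other column is kept.
        var pipeline = mlContext.Transforms.DropColumns("Features");
        var transformedData = pipeline.Fit(dataview).Transform(dataview);

        var rowEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(
            transformedData, reuseRowObject: false);

        Console.WriteLine("The columns we didn't drop are still available.");
        foreach (var row in rowEnumerable)
            Console.WriteLine($"ImageId: {row.ImageId}");
    }
}
```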
// And finally, we can write out the rows of the dataset, looking at the columns of interest.
Console.WriteLine($"Label, Parity, and CustomValue columns obtained post-transformation.");
Console.WriteLine($"Label and ImageId columns obtained post-transformation.");
foreach (var row in rowEnumerable)
{
Please remove the braces, as we don't need them for a one-line loop. #Resolved
Codecov Report
@@            Coverage Diff             @@
##           master    #3268      +/-   ##
==========================================
- Coverage   72.62%   72.62%   -0.01%
==========================================
  Files         807      807
  Lines      145080   145080
  Branches    16213    16213
==========================================
- Hits       105367   105361       -6
- Misses      35296    35302       +6
  Partials     4417     4417
// CopyColumns is commonly used to rename columns.
// For example, if you want to train towards Age, and your learner expects a "Label" column, you can
// use CopyColumns to rename Age to Label. Technically, the Age columns still exists, but it won't be
// For example, if you want to train towards ImageId, and your learner expects a "Label" column, you can
learner: Should this be "trainer"? #Resolved
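The rename pattern that the sample comment describes can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the class names and values are hypothetical, and the only API calls used are `LoadFromEnumerable`, `CopyColumns(outputColumnName, inputColumnName)`, and `CreateEnumerable`.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

public static class CopyColumnsSketch
{
    // Hypothetical input type with the column we want to train towards.
    private class InputData
    {
        public int ImageId { get; set; }
    }

    // Hypothetical output type: the original column plus its copy.
    private class TransformedData
    {
        public int ImageId { get; set; }
        public int Label { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext();

        // Illustrative values only.
        var samples = new List<InputData>
        {
            new InputData { ImageId = 1 },
            new InputData { ImageId = 2 },
        };
        var dataview = mlContext.Data.LoadFromEnumerable(samples);

        // Copy "ImageId" into a new "Label" column, which a trainer can then consume.
        // The source column is not removed by this transform.
        var pipeline = mlContext.Transforms.CopyColumns("Label", "ImageId");
        var transformedData = pipeline.Fit(dataview).Transform(dataview);

        var rowEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(
            transformedData, reuseRowObject: false);

        Console.WriteLine("Label and ImageId columns obtained post-transformation.");
        foreach (var row in rowEnumerable)
            Console.WriteLine($"Label: {row.Label} ImageId: {row.ImageId}");
    }
}
```

Because the copy leaves the source column in place, a pipeline that needs a true rename would typically follow this step with a transform that removes the original column.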
// Case: 1 Induced: 2 Parity: 6
// Case: 1 Induced: 2 Parity: 4
// Case: 1 Induced: 1 Parity: 3
// Age: 21 Geneder: Male Education: BS
Gender is misspelled. #Resolved
};

// Convert training data to IDataView.
var dataview = mlContext.Data.LoadFromEnumerable(samples);

// Select a subset of columns to keep.
var pipeline = mlContext.Transforms.SelectColumns("Age", "Education");

// Now we can transform the data and look at the output to confirm the behavior of CopyColumns.
This should read "behavior of SelectColumns". #Resolved
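A self-contained sketch of the SelectColumns flow quoted above, with the comment wording the reviewer asked for. The `InputData` and `TransformedData` shapes are assumptions inferred from the "Age / Gender / Education" output line in the diff, and the second sample row is invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

public static class SelectColumnsSketch
{
    // Hypothetical input type inferred from the sample output in the diff.
    private class InputData
    {
        public int Age { get; set; }
        public string Gender { get; set; }
        public string Education { get; set; }
    }

    // Hypothetical output type: only the selected columns.
    private class TransformedData
    {
        public int Age { get; set; }
        public string Education { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext();

        // Illustrative values only.
        var samples = new List<InputData>
        {
            new InputData { Age = 21, Gender = "Male", Education = "BS" },
            new InputData { Age = 23, Gender = "Female", Education = "MBA" },
        };

        // Convert training data to IDataView.
        var dataview = mlContext.Data.LoadFromEnumerable(samples);

        // Select a subset of columns to keep; "Gender" is left out of the output.
        var pipeline = mlContext.Transforms.SelectColumns("Age", "Education");

        // Transform the data and look at the output to confirm the behavior of SelectColumns.
        var transformedData = pipeline.Fit(dataview).Transform(dataview);

        var rowEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(
            transformedData, reuseRowObject: false);

        foreach (var row in rowEnumerable)
            Console.WriteLine($"Age: {row.Age} Education: {row.Education}");
    }
}
```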
Related to #1209.