Updated CopyColumns, DropColumns and SelectColumns samples. #3268

Merged (4 commits) on Apr 10, 2019

Contributor

@zeahmed zeahmed commented Apr 9, 2019

Related to #1209.

// Create a small dataset as an IEnumerable.
var samples = new List<InputData>()
{
new InputData(){ ImageId = 1, Features = new [] { 1.0f, 1.0f, 1.0f} },
Contributor

Features [](start = 46, length = 8)

You don't seem to need Features.

{
public float CustomValue { get; set; }
public int ImageId { get; set; }
public float[] Features { get; set; }
Contributor

Features [](start = 27, length = 8)

Same here, I think you can drop it.

// Parameter name: Schema

// And we can write a few columns out to see that the rest of the data is still available.
- var rowEnumerable = mlContext.Data.CreateEnumerable<SampleInfertDataTransformed>(transformedData, reuseRowObject: false);
+ var rowEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(transformedData, reuseRowObject: false);
Console.WriteLine($"The columns we didn't drop are still available.");
foreach (var row in rowEnumerable)
{
Contributor

@artidoro artidoro Apr 10, 2019

{ [](start = 12, length = 1)

Please remove the brace, not needed for a one line loop. #Resolved


// And finally, we can write out the rows of the dataset, looking at the columns of interest.
- Console.WriteLine($"Label, Parity, and CustomValue columns obtained post-transformation.");
+ Console.WriteLine($"Label and ImageId columns obtained post-transformation.");
foreach (var row in rowEnumerable)
{
Contributor

@artidoro artidoro Apr 10, 2019

{ [](start = 12, length = 1)

Please remove the braces as we don't need them for a one line loop. #Resolved

Contributor

artidoro commented Apr 10, 2019

        {

Please remove the braces as we don't need them for a one line loop. #Resolved


Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Transforms/SelectColumns.cs:49 in 82859ca. [](commit_id = 82859ca, deletion_comment = False)

Contributor

@artidoro artidoro left a comment

:shipit:

codecov bot commented Apr 10, 2019

Codecov Report

Merging #3268 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3268      +/-   ##
==========================================
- Coverage   72.62%   72.62%   -0.01%     
==========================================
  Files         807      807              
  Lines      145080   145080              
  Branches    16213    16213              
==========================================
- Hits       105367   105361       -6     
- Misses      35296    35302       +6     
  Partials     4417     4417
| Flag | Coverage Δ |
| --- | --- |
| #Debug | 72.62% <ø> (-0.01%) ⬇️ |
| #production | 68.17% <ø> (-0.01%) ⬇️ |
| #test | 88.92% <ø> (ø) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| ...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs | 79.48% <0%> (-20.52%) ⬇️ |
| ...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs | 84.7% <0%> (-0.21%) ⬇️ |
| ...rc/Microsoft.ML.Transforms/CustomMappingCatalog.cs | 100% <0%> (ø) ⬆️ |
| src/Microsoft.ML.Transforms/ExtensionsCatalog.cs | 57.14% <0%> (ø) ⬆️ |
| src/Microsoft.ML.Maml/MAML.cs | 26.21% <0%> (+1.45%) ⬆️ |

Member

@sfilipi sfilipi left a comment

:shipit:


// CopyColumns is commonly used to rename columns.
- // For example, if you want to train towards Age, and your learner expects a "Label" column, you can
- // use CopyColumns to rename Age to Label. Technically, the Age columns still exists, but it won't be
+ // For example, if you want to train towards ImageId, and your learner expects a "Label" column, you can
Member

@singlis singlis Apr 10, 2019

learner [](start = 75, length = 7)

Should this be trainer? #Resolved
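
For readers following this thread, the CopyColumns usage described in the comment under review can be sketched roughly as below. This is a minimal sketch, not the merged sample: the class name `CopyColumnsSketch`, the shape of `TransformedData`, and the row values are assumptions made for illustration, and `Features` is left out as suggested earlier in the review. It assumes the Microsoft.ML NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

public static class CopyColumnsSketch
{
    // Input row; Features is omitted, as suggested earlier in this review.
    private class InputData
    {
        public int ImageId { get; set; }
    }

    // Assumed output shape: the original column plus the copied "Label" column.
    private class TransformedData : InputData
    {
        public int Label { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext();

        // Illustrative values, not taken from the PR.
        var samples = new List<InputData>
        {
            new InputData { ImageId = 1 },
            new InputData { ImageId = 2 },
            new InputData { ImageId = 3 },
        };

        // Convert the in-memory collection to an IDataView.
        var dataview = mlContext.Data.LoadFromEnumerable(samples);

        // Copy ImageId into a new column named Label; ImageId itself is kept.
        var pipeline = mlContext.Transforms.CopyColumns("Label", "ImageId");
        var transformedData = pipeline.Fit(dataview).Transform(dataview);

        // Read the rows back as strongly typed objects.
        var rowEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(
            transformedData, reuseRowObject: false);

        foreach (var row in rowEnumerable)
            Console.WriteLine($"Label: {row.Label} ImageId: {row.ImageId}");
    }
}
```

Because the source column is copied rather than moved, both `Label` and `ImageId` can be read from each transformed row, which is why the comment under review describes the rename as leaving the original column in place.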

// Case: 1 Induced: 2 Parity: 6
// Case: 1 Induced: 2 Parity: 4
// Case: 1 Induced: 1 Parity: 3
// Age: 21 Geneder: Male Education: BS
Member

@singlis singlis Apr 10, 2019

Gender is misspelled #Resolved

};

// Convert training data to IDataView.
var dataview = mlContext.Data.LoadFromEnumerable(samples);

// Select a subset of columns to keep.
var pipeline = mlContext.Transforms.SelectColumns("Age", "Education");

// Now we can transform the data and look at the output to confirm the behavior of CopyColumns.
Member

@singlis singlis Apr 10, 2019

behavior of SelectColumns #Resolved
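
As a rough illustration of the SelectColumns fragment quoted above, the following sketch fills in the parts the diff excerpt does not show. The class name `SelectColumnsSketch`, the `InputData` and `TransformedData` definitions, and the second sample row are assumptions; only the `SelectColumns("Age", "Education")` call, the `LoadFromEnumerable` call, and the Age/Gender/Education column names come from the thread. It assumes the Microsoft.ML NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

public static class SelectColumnsSketch
{
    // Assumed input schema, based on the column names quoted in the review.
    private class InputData
    {
        public int Age { get; set; }
        public string Gender { get; set; }
        public string Education { get; set; }
    }

    // Assumed output schema: only the selected columns.
    private class TransformedData
    {
        public int Age { get; set; }
        public string Education { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext();

        var samples = new List<InputData>
        {
            new InputData { Age = 21, Gender = "Male", Education = "BS" },
            // Illustrative row, not taken from the PR.
            new InputData { Age = 23, Gender = "Female", Education = "MBA" },
        };

        // Convert training data to IDataView.
        var dataview = mlContext.Data.LoadFromEnumerable(samples);

        // Select a subset of columns to keep.
        var pipeline = mlContext.Transforms.SelectColumns("Age", "Education");
        var transformedData = pipeline.Fit(dataview).Transform(dataview);

        // The output schema can be inspected to confirm which columns remain.
        Console.WriteLine($"There are {transformedData.Schema.Count} columns in the dataset.");

        var rowEnumerable = mlContext.Data.CreateEnumerable<TransformedData>(
            transformedData, reuseRowObject: false);

        foreach (var row in rowEnumerable)
            Console.WriteLine($"Age: {row.Age} Education: {row.Education}");
    }
}
```

The single-statement loops are written without braces, in line with the style requested elsewhere in this review.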

Member

@singlis singlis left a comment

:shipit:

@zeahmed zeahmed merged commit 947b3f8 into dotnet:master Apr 10, 2019
zeahmed added a commit to zeahmed/machinelearning that referenced this pull request Apr 11, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 22, 2022
4 participants