Skip to content

Updating the CopyColumnsEstimator and Transform to use common code (#706) #1141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 4, 2018

Conversation

singlis
Copy link
Member

@singlis singlis commented Oct 3, 2018

This builds on the Estimator conversion for the CopyColumnsTransform.
This change is mainly refactoring as common code has moved to base level classes.

This change is the following:
- CopyColumnTransform now derives from OneToOneTransformerBase
- CopyColumnEstimator now derives from TrivialEstimator
- CopyColumnTransform::Mapper now derives from MapperBase
- Removed code that was no longer needed due to these changes

  (dotnet#706)
This builds on the Estimator conversion for the CopyColumnsTransform.
This change is mainly refactoring as common code has moved to base level classes.

This change is the following:
       - CopyColumnTransform now derives from OneToOneTransformerBase
       - CopyColumnEstimator now derives from TrivialEstimator
       - CopyColumnTransform::Mapper now derives from MapperBase
       - Removed code that was no longer needed due to these changes
@dnfclas
Copy link

dnfclas commented Oct 3, 2018

CLA assistant check
All CLA requirements met.

@singlis singlis self-assigned this Oct 3, 2018
@singlis singlis requested a review from Ivanidzo4ka October 3, 2018 21:37
[assembly: LoadableClass(CopyColumnsTransform.Summary, typeof(CopyColumnsRowMapper), null, typeof(SignatureLoadRowMapper),
CopyColumnsTransform.UserName, CopyColumnsRowMapper.LoaderSignature)]
[assembly: LoadableClass(typeof(IRowMapper), typeof(CopyColumnsTransform), null, typeof(SignatureLoadRowMapper),
CopyColumnsTransform.UserName, CopyColumnsTransform.LoaderSignature)]

namespace Microsoft.ML.Runtime.Data
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Runtime.Data [](start = 23, length = 12)

can you change it to Microsoft.ML.Transforms? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The namespace also reflects the folder location. If we change the namespace, should this also move into Microsoft.ML.Transforms?


In reply to: 222523600 [](ancestors = 222523600)

Copy link
Member Author

@singlis singlis Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I see some Transforms in Microsoft.ML.Transform that have the namespace Microsoft.ML.Runtime.Data - so I am confused on this.


In reply to: 222767195 [](ancestors = 222767195,222523600)

/// <summary>
/// Factory method to create from arguments
/// </summary>
public static IDataTransform Create(IHostEnvironment env, Arguments args, IDataView input)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public [](start = 8, length = 6)

Why it become public? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This remained public, please see the code in master:

public static IDataTransform Create(IHostEnvironment env, Arguments args, IDataView input)

Its the diff that makes it look like it transitioned from private to public.


In reply to: 222523784 [](ancestors = 222523784)

}

// Factory method for SignatureLoadModel.
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Factory method for SignatureLoadModel. [](start = 7, length = 42)

I would prefer this format, in all other places we have this pattern instead of //summary #Closed

// string: input column name
ctx.Writer.Write(_columns.Length);
foreach (var column in _columns)
// string: output (Name) column name
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// string: output (Name) column name [](start = 12, length = 38)

I think we have SaveBaseColumns or something like that in base class. #Closed

var type = input.Schema.GetColumnType(colIndex);
result[i] = Utils.MarshalInvoke(MakeGetter<int>, type.RawType, input, colIndex);
return Utils.MarshalInvoke(MakeGetter<int>, type.RawType, input, colIndex);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](start = 73, length = 2)

Ctrl +K+D please #Closed

@Ivanidzo4ka Ivanidzo4ka requested review from sfilipi and Zruty0 October 4, 2018 02:51
{
private readonly (string Source, string Name)[] _columns;
private readonly IHost _host;
public const string LoaderSignature = "CopyTransform";
private const string RegistrationName = "CopyColumns";
public const string Summary = "Copy a source column to a new column.";
public const string UserName = "Copy Columns Transform";
public const string ShortName = "Copy";
Copy link
Member

@sfilipi sfilipi Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the public members be made internal? if they are public, they will display in the documentation site. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are now internal


In reply to: 222532040 [](ancestors = 222532040)

var length = ctx.Reader.ReadInt32();
var columns = new (string Source, string Name)[length];
for (int i = 0; i < length; i++)
public Mapper(CopyColumnsTransform parent, ISchema inputSchema, (string Source, string Name)[] columns)
Copy link
Member

@sfilipi sfilipi Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public [](start = 12, length = 6)

would internal suffice? #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this point I'm curious how many wrong comments can I leave here. :)
Technically it's a private class, so no one would see it any way....


In reply to: 222532234 [](ancestors = 222532234)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mapper is a private class and a subclass to CopyColumnsTransform. I went ahead and make the constructor internal. @[email protected] let me know if you have any additional feedback.


In reply to: 222533129 [](ancestors = 222533129,222532234)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any other feedback. So feel free to push your changes into PR, if that's the reason why you haven't done it yet :)


In reply to: 222782703 [](ancestors = 222782703,222533129,222532234)

ctx.CheckAtModel();
ctx.SetVersionInfo(GetVersionInfo());
private Delegate MakeGetter<T>(IRow input, int iinfo)
=> input.GetGetter<T>(iinfo);
Copy link
Member

@sfilipi sfilipi Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

declare the delegate inside the MakeGetter above, if it is the only thing that calls it. #Resolved

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@shauheen shauheen added this to the 1018 milestone Oct 4, 2018
 updated namespace usage and entrypoints due to this change.
 - Save now uses the SaveColumns from the base class
 - Other various changes based upon feedback.
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@singlis singlis merged commit ce7d3f6 into dotnet:master Oct 4, 2018
@singlis singlis deleted the singlis/CopyColumns branch March 7, 2019 00:54
@ghost ghost locked as resolved and limited conversation to collaborators Mar 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants