-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Updating the CopyColumnsEstimator and Transform to use common code (#706) #1141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(dotnet#706) This builds on the Estimator conversion for the CopyColumnsTransform. This change is mainly refactoring as common code has moved to base level classes. This change is the following: - CopyColumnTransform now derives from OneToOneTransformerBase - CopyColumnEstimator now derives from TrivialEstimator - CopyColumnTransform::Mapper now derives from MapperBase - Removed code that was no longer needed due to these changes
[assembly: LoadableClass(CopyColumnsTransform.Summary, typeof(CopyColumnsRowMapper), null, typeof(SignatureLoadRowMapper), | ||
CopyColumnsTransform.UserName, CopyColumnsRowMapper.LoaderSignature)] | ||
[assembly: LoadableClass(typeof(IRowMapper), typeof(CopyColumnsTransform), null, typeof(SignatureLoadRowMapper), | ||
CopyColumnsTransform.UserName, CopyColumnsTransform.LoaderSignature)] | ||
|
||
namespace Microsoft.ML.Runtime.Data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Runtime.Data [](start = 23, length = 12)
can you change it to Microsoft.ML.Transforms? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The namespace also reflects the folder location. If we change the namespace, should this also move into Microsoft.ML.Transforms?
In reply to: 222523600 [](ancestors = 222523600)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I see some Transforms in Microsoft.ML.Transform that have the namespace Microsoft.ML.Runtime.Data - so I am confused on this.
In reply to: 222767195 [](ancestors = 222767195,222523600)
/// <summary> | ||
/// Factory method to create from arguments | ||
/// </summary> | ||
public static IDataTransform Create(IHostEnvironment env, Arguments args, IDataView input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public [](start = 8, length = 6)
Why it become public? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This remained public, please see the code in master:
public static IDataTransform Create(IHostEnvironment env, Arguments args, IDataView input) |
Its the diff that makes it look like it transitioned from private to public.
In reply to: 222523784 [](ancestors = 222523784)
} | ||
|
||
// Factory method for SignatureLoadModel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Factory method for SignatureLoadModel. [](start = 7, length = 42)
I would prefer this format, in all other places we have this pattern instead of //summary #Closed
// string: input column name | ||
ctx.Writer.Write(_columns.Length); | ||
foreach (var column in _columns) | ||
// string: output (Name) column name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// string: output (Name) column name [](start = 12, length = 38)
I think we have SaveBaseColumns or something like that in base class. #Closed
var type = input.Schema.GetColumnType(colIndex); | ||
result[i] = Utils.MarshalInvoke(MakeGetter<int>, type.RawType, input, colIndex); | ||
return Utils.MarshalInvoke(MakeGetter<int>, type.RawType, input, colIndex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[](start = 73, length = 2)
Ctrl +K+D please #Closed
{ | ||
private readonly (string Source, string Name)[] _columns; | ||
private readonly IHost _host; | ||
public const string LoaderSignature = "CopyTransform"; | ||
private const string RegistrationName = "CopyColumns"; | ||
public const string Summary = "Copy a source column to a new column."; | ||
public const string UserName = "Copy Columns Transform"; | ||
public const string ShortName = "Copy"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can the public members be made internal? if they are public, they will display in the documentation site. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var length = ctx.Reader.ReadInt32(); | ||
var columns = new (string Source, string Name)[length]; | ||
for (int i = 0; i < length; i++) | ||
public Mapper(CopyColumnsTransform parent, ISchema inputSchema, (string Source, string Name)[] columns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public [](start = 12, length = 6)
would internal suffice? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at this point I'm curious how many wrong comments can I leave here. :)
Technically it's a private class, so no one would see it any way....
In reply to: 222532234 [](ancestors = 222532234)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mapper is a private class and a subclass to CopyColumnsTransform. I went ahead and make the constructor internal. @[email protected] let me know if you have any additional feedback.
In reply to: 222533129 [](ancestors = 222533129,222532234)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any other feedback. So feel free to push your changes into PR, if that's the reason why you haven't done it yet :)
In reply to: 222782703 [](ancestors = 222782703,222533129,222532234)
ctx.CheckAtModel(); | ||
ctx.SetVersionInfo(GetVersionInfo()); | ||
private Delegate MakeGetter<T>(IRow input, int iinfo) | ||
=> input.GetGetter<T>(iinfo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
declare the delegate inside the MakeGetter above, if it is the only thing that calls it. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated namespace usage and entrypoints due to this change. - Save now uses the SaveColumns from the base class - Other various changes based upon feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This builds on the Estimator conversion for the CopyColumnsTransform.
This change is mainly refactoring as common code has moved to base level classes.
This change is the following:
- CopyColumnTransform now derives from OneToOneTransformerBase
- CopyColumnEstimator now derives from TrivialEstimator
- CopyColumnTransform::Mapper now derives from MapperBase
- Removed code that was no longer needed due to these changes