-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Microsoft.ML.Transforms assembly lockdown #2648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -9,7 +9,7 @@ | |||
namespace Microsoft.ML | |||
{ | |||
/// <summary> | |||
/// Static extensions for categorical transforms. | |||
/// The catalog of categorical transformations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transformations [](start = 35, length = 15)
transformers? #Pending
Would be nice to have image similar to image from #2511 |
@@ -7,6 +7,9 @@ | |||
|
|||
namespace Microsoft.ML | |||
{ | |||
/// <summary> | |||
/// The catalog of projection transformations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transformations [](start = 34, length = 15)
transformers? #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is some troublesome language usage that we need to think about how to do. We are talking about transforming usage. And the way we do that is via ITransformer
, but what is actually being returned through these catalogs? IEstimator
! I don't object to this preference for transformer, per se, but I would not insist on it if it is unnatural in context. The context here is how we describe the process by which we transform data, which includes ITransformer
, but is not specifically related to it (it can't be, what is returned here is not ITransformer
implementors at all!). #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I don't disagree with your suggestion, but I wouldn't insist on it, and I don't know that I see a strong reason for it, since the term usage seems appropriate in the (deliberately) vague context of this catalog. #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of the .Transforms
section of MLContext
. I preferred not to use the contraction "transforms" and use the full word "transformations". My choice is more open ended and not reference ML.NET objects.
If we want to be more specific, since these are returning estimators, we should probably use that word instead.
So in conclusion, if we want to be specific I would use "estimators" alternatively I would stick to "transformations". I would personally keep the word "transformations". #Pending
Sounds good I will post that. #Resolved |
@@ -12,6 +12,9 @@ namespace Microsoft.ML | |||
using CharTokenizingDefaults = TokenizingByCharactersEstimator.Defaults; | |||
using TextNormalizeDefaults = TextNormalizingEstimator.Defaults; | |||
|
|||
/// <summary> | |||
/// The catalog of text related transformations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transformations [](start = 36, length = 15)
transformers? #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments that doesn't require blocking but please address them. Thanks.
Codecov Report
@@ Coverage Diff @@
## master #2648 +/- ##
==========================================
- Coverage 71.54% 71.53% -0.01%
==========================================
Files 800 800
Lines 141847 141847
Branches 16119 16119
==========================================
- Hits 101480 101476 -4
- Misses 35916 35921 +5
+ Partials 4451 4450 -1
|
@@ -117,7 +119,7 @@ void ICanSaveModel.Save(ModelSaveContext ctx) | |||
ctx.Writer.Write(_gamma); | |||
} | |||
|
|||
public float Next(Random rand) | |||
float IFourierDistributionSampler.Next(Random rand) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IFourierDistributionSampler [](start = 14, length = 27)
I don't consider a regular implementation harmful in this case, since this is clearly key functionality on this object. Indeed I'm not sure this thing needs to be an interface at all. You're improving the code a bunch, but I think it still needs a bit more work. I opened #2659 to track this. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will revert to a regular implementation here then. Thank you for opening that! #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks artidoro, incidentally would have been fine to leave it, but it's just as good being public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good thank you @artidoro
@Ivanidzo4ka Here is the public surface of the Microsoft.ML.Transforms assembly after my change: T:Microsoft.ML.BinaryClassificationMetricsStatistics #Resolved |
Fixes #2282.
This is a pull request in the assembly public surface lockdown series.
Besides internalization I also added some comments for public members. This is a relatively small PR as we have gone through the Transforms assembly many times already.
I made the method
CheckInputColumn
inOneToOneTransformerBase
BestFriend private protected instead of just protected.