-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Image transforms become Estimators #753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
namespace Microsoft.ML.Runtime.ImageAnalytics | ||
{ | ||
// REVIEW: Rewrite as LambdaTransform to simplify. | ||
public abstract class TrivialEstimator<TTransformer> : IEstimator<TTransformer> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TrivialEstimator [](start = 26, length = 16)
Move to a separate file #Closed
@@ -30,18 +30,18 @@ public enum VectorKind | |||
|
|||
public readonly string Name; | |||
public readonly VectorKind Kind; | |||
public readonly DataKind ItemKind; | |||
public readonly ColumnType ItemType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ColumnType ItemType [](start = 28, length = 19)
This is somewhat problematic for key columns. How do we represent the two distinct concepts of a key-columns of known and unknown counts? #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, ItemType
is never supposed to be a key. For example, for a scalar key column, the correct representation is Kind = Scalar, ItemType = PrimitiveType.U4, IsKey = true
.
I think I should enforce it in constructor even.
I don't think we should even have key types of unknown counts: it is already causing issues for Term
transform, and I'm yet to see any benefit of this type.
In reply to: 213151407 [](ancestors = 213151407)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that keys of unknown size should go away, but are we going to do this in this PR?
That we're representing a key-type using a column type which is not a key-type at all, and whose only connection to it is that their DataKind
happen to be the same, is rather odd and unfortunate. I hope we can imagine a better way here, though I'm not sure I see one right off the bat.
In reply to: 213156549 [](ancestors = 213156549,213151407)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are also representing a vector type using a column type which is not a vector-type at all, and whose only connection to it is that their ItemType
happen to be the same :)
In reply to: 214125274 [](ancestors = 214125274,213156549,213151407)
/// <summary> | ||
/// Transform which takes one or many columns of <see cref="ImageType"/> and convert them into vector representation. | ||
/// </summary> | ||
public sealed class ImagePixelExtractorTransform : OneToOneTransformBase | ||
public sealed class ImagePixelExtractorTransform : OneToOneTransformerBase, ICanSaveModel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ICanSaveModel [](start = 80, length = 13)
not necessary #Resolved
public ColumnInfo(string input, string output, ColorBits colors = ColorBits.Rgb, bool interleave = false, float scale = 1f, float offset = 0f) | ||
: this(input, output, colors, interleave, true, scale, offset) | ||
{ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need this constructor? #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
} | ||
|
||
private void CheckInput(ISchema inputSchema, int col, out int srcCol) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private [](start = 8, length = 7)
make this public and call it in GetOutputSchema in each estimator. #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cannot do this, because GetOutputSchema in the estimator operates over SchemaShape
In reply to: 213863333 [](ancestors = 213863333)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if only ISchema and SchemaShape were relatives....
In reply to: 213863552 [](ancestors = 213863552,213863333)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like how the default value for Refers to: src/Microsoft.ML.Core/Data/IEstimator.cs:24 in 3e845fd. [](commit_id = 3e845fd, deletion_comment = False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converted the following transforms to Estimators:
Fixes #707