Skip to content

Image transforms become Estimators #753

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Aug 30, 2018

Conversation

Zruty0
Copy link
Contributor

@Zruty0 Zruty0 commented Aug 27, 2018

Converted the following transforms to Estimators:

  • ImageLoader
  • ImageResizer
  • ImagePixelExtractor
  • ImageGrayscale

Fixes #707

@Zruty0 Zruty0 added the API Issues pertaining the friendly API label Aug 27, 2018
namespace Microsoft.ML.Runtime.ImageAnalytics
{
// REVIEW: Rewrite as LambdaTransform to simplify.
public abstract class TrivialEstimator<TTransformer> : IEstimator<TTransformer>
Copy link
Contributor Author

@Zruty0 Zruty0 Aug 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TrivialEstimator [](start = 26, length = 16)

Move to a separate file #Closed

@Zruty0 Zruty0 mentioned this pull request Aug 27, 2018
@@ -30,18 +30,18 @@ public enum VectorKind

public readonly string Name;
public readonly VectorKind Kind;
public readonly DataKind ItemKind;
public readonly ColumnType ItemType;
Copy link
Contributor

@TomFinley TomFinley Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ColumnType ItemType [](start = 28, length = 19)

This is somewhat problematic for key columns. How do we represent the two distinct concepts of a key-columns of known and unknown counts? #Pending

Copy link
Contributor Author

@Zruty0 Zruty0 Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, ItemType is never supposed to be a key. For example, for a scalar key column, the correct representation is Kind = Scalar, ItemType = PrimitiveType.U4, IsKey = true.

I think I should enforce it in constructor even.

I don't think we should even have key types of unknown counts: it is already causing issues for Term transform, and I'm yet to see any benefit of this type.


In reply to: 213151407 [](ancestors = 213151407)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that keys of unknown size should go away, but are we going to do this in this PR?

That we're representing a key-type using a column type which is not a key-type at all, and whose only connection to it is that their DataKind happen to be the same, is rather odd and unfortunate. I hope we can imagine a better way here, though I'm not sure I see one right off the bat.


In reply to: 213156549 [](ancestors = 213156549,213151407)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are also representing a vector type using a column type which is not a vector-type at all, and whose only connection to it is that their ItemType happen to be the same :)


In reply to: 214125274 [](ancestors = 214125274,213156549,213151407)

Pete Luferenko added 2 commits August 27, 2018 17:54
@shauheen shauheen added this to the 0818 milestone Aug 28, 2018
/// <summary>
/// Transform which takes one or many columns of <see cref="ImageType"/> and convert them into vector representation.
/// </summary>
public sealed class ImagePixelExtractorTransform : OneToOneTransformBase
public sealed class ImagePixelExtractorTransform : OneToOneTransformerBase, ICanSaveModel
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ICanSaveModel [](start = 80, length = 13)

not necessary #Resolved

public ColumnInfo(string input, string output, ColorBits colors = ColorBits.Rgb, bool interleave = false, float scale = 1f, float offset = 0f)
: this(input, output, colors, interleave, true, scale, offset)
{
}
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need this constructor? #ByDesign

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for Convert=true case. So yes


In reply to: 213860876 [](ancestors = 213860876)

}
}

private void CheckInput(ISchema inputSchema, int col, out int srcCol)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private [](start = 8, length = 7)

make this public and call it in GetOutputSchema in each estimator. #ByDesign

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot do this, because GetOutputSchema in the estimator operates over SchemaShape


In reply to: 213863333 [](ancestors = 213863333)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if only ISchema and SchemaShape were relatives....


In reply to: 213863552 [](ancestors = 213863552,213863333)

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@TomFinley
Copy link
Contributor

TomFinley commented Aug 30, 2018

        public enum VectorKind

I like how the default value for VectorKind is "it's not a vector at all." :) #ByDesign


Refers to: src/Microsoft.ML.Core/Data/IEstimator.cs:24 in 3e845fd. [](commit_id = 3e845fd, deletion_comment = False)

@Zruty0
Copy link
Contributor Author

Zruty0 commented Aug 30, 2018

        public enum VectorKind

Heh


In reply to: 417410335 [](ancestors = 417410335)


Refers to: src/Microsoft.ML.Core/Data/IEstimator.cs:24 in 3e845fd. [](commit_id = 3e845fd, deletion_comment = False)

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@Zruty0 Zruty0 merged commit 5133797 into dotnet:master Aug 30, 2018
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Issues pertaining the friendly API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants