Skip to content

Scrubbing image transforms #2875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 11, 2019
Merged

Scrubbing image transforms #2875

merged 5 commits into from
Mar 11, 2019

Conversation

najeeb-kazmi
Copy link
Member

Fixes #2833

@codecov
Copy link

codecov bot commented Mar 7, 2019

Codecov Report

Merging #2875 into master will increase coverage by <.01%.
The diff coverage is 77.71%.

@@            Coverage Diff             @@
##           master    #2875      +/-   ##
==========================================
+ Coverage   71.81%   71.81%   +<.01%     
==========================================
  Files         812      812              
  Lines      142644   142658      +14     
  Branches    16090    16095       +5     
==========================================
+ Hits       102433   102453      +20     
+ Misses      35828    35824       -4     
+ Partials     4383     4381       -2
Flag Coverage Δ
#Debug 71.81% <77.71%> (ø) ⬆️
#production 67.96% <75.9%> (ø) ⬆️
#test 86.25% <94.44%> (+0.01%) ⬆️
Impacted Files Coverage Δ
...st/Microsoft.ML.Tests/Scenarios/TensorflowTests.cs 100% <100%> (ø) ⬆️
src/Microsoft.ML.ImageAnalytics/ImageGrayscale.cs 78.33% <100%> (ø) ⬆️
test/Microsoft.ML.Tests/ImagesTests.cs 98.81% <100%> (ø) ⬆️
src/Microsoft.ML.ImageAnalytics/ImageLoader.cs 84.55% <100%> (ø) ⬆️
src/Microsoft.ML.ImageAnalytics/ImageResizer.cs 84.64% <100%> (ø) ⬆️
...c/Microsoft.ML.ImageAnalytics/ExtensionsCatalog.cs 37.5% <28.57%> (ø) ⬆️
...cenariosWithDirectInstantiation/TensorflowTests.cs 92.51% <66.66%> (ø) ⬆️
...Microsoft.ML.ImageAnalytics/ImagePixelExtractor.cs 82.55% <71.42%> (ø) ⬆️
...rosoft.ML.ImageAnalytics/VectorToImageTransform.cs 72.23% <76.36%> (ø) ⬆️
src/Microsoft.ML.Data/TrainCatalog.cs 84.11% <0%> (-1.72%) ⬇️
... and 80 more

@@ -11,15 +11,15 @@ public static class ImageEstimatorsCatalog
{
/// <include file='doc.xml' path='doc/members/member[@name="ImageGrayscalingEstimator"]/*' />
/// <param name="catalog">The transform's catalog.</param>
/// <param name="columnPairs">The name of the columns containing the name of the resulting output column (first item of the tuple), and the paths of the images to work on (second item of the tuple).</param>
/// <param name="outputInputColumnPair">Names of the resulting output column (first item of the tuple), and the column containing the paths of the images to work on (second item of the tuple).</param>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outputInputColumnPair [](start = 25, length = 21)

they are not column pairs anymore. This might be leftover from when we wer eusing tuples.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfilipi I see output-input tuples being used in the samples for several image transforms, despite the api signature accepting params ColumnOptions[].

For example, in the ConvertToGrayScale.cs sample, we have tuples being passed to both LoadImages and ConvertToGrayscale.

            var pipeline = mlContext.Transforms.LoadImages(imagesFolder, ("ImageObject", "ImagePath"))
                           .Append(mlContext.Transforms.ConvertToGrayscale(("Grayscale", "ImageObject")));

Questions for you:

  1. Are the tuples from the samples being cast to array or are the samples incorrect and need to be revised?
  2. What name do you suggest for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not Senja, but maybe I can be useful as well.

  1. ColumnOptions have implicit cast from tuples, so user can enter tuples if he wants to, but in general we work with array of ColumnsOptions. So examples are fine, at least for now.
  2. /// <param name="columns">Specifies the names of the input columns for the transformation, and their respective output column names.</param> I think this is our default choice for such cases.

In reply to: 263581917 [](ancestors = 263581917)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed as suggested

@@ -11,15 +11,15 @@ public static class ImageEstimatorsCatalog
{
/// <include file='doc.xml' path='doc/members/member[@name="ImageGrayscalingEstimator"]/*' />
/// <param name="catalog">The transform's catalog.</param>
/// <param name="columnPairs">The name of the columns containing the name of the resulting output column (first item of the tuple), and the paths of the images to work on (second item of the tuple).</param>
/// <param name="outputInputColumnPair">Names of the resulting output column (first item of the tuple), and the column containing the paths of the images to work on (second item of the tuple).</param>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tuple [](start = 104, length = 5)

no more tuple.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfilipi see above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed as suggested above

@@ -82,7 +82,7 @@ internal sealed class Options : TransformInputBase
/// <summary>
/// The columns passed to this <see cref="ITransformer"/>.
/// </summary>
public IReadOnlyCollection<(string outputColumnName, string inputColumnName)> Columns => ColumnPairs.AsReadOnly();
internal IReadOnlyCollection<(string outputColumnName, string inputColumnName)> Columns => ColumnPairs.AsReadOnly();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internal [](start = 8, length = 8)

why make them internal? the user might want to inspect them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is specifically called out in the issue #2827 (item 5 in the list). @Ivanidzo4ka can you elaborate?

@@ -262,7 +262,7 @@ public void TestBackAndForthConversionWithoutAlphaInterleave()
var pixels = new ImagePixelExtractingTransformer(env, "ImagePixels", "ImageCropped", interleave: true, scale: 2f / 19, offset: 30).Transform(cropped);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ImagePixelExtractingTransformer [](start = 29, length = 31)

Right now ImagePixel and VectorToImage are pair of opposite to each over transforms. it's weird to have in one of them interleave parameter and interleavedColors in another.

I know both this constructors are internal, but it's still looks weird.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the public constructors and public api, I intentionally named the parameter differently because the two transforms are doing the opposite of each other. In ImagePixel, the parameter specifies whether to interleave the colors or not when producing the vector, so I named it interleavePixelColors. In VectorToImage, the parameter specifies whether the colors in the vector are interleaved or not, so I named it interleavedColors. The intention was to make the parameter names descriptive, like all the other parameters that I renamed.

Do you agree with this distinction in names, i.e. interleavePixelColors for ImagePixel and interleavedColors for VectorToImage? If so, I can rename the parameters in the internal constructors to be consistent with the public ones. If not, do you suggest I use the same name for both ImagePixel and VectorToImage and what should that name be? In either case, I agree the names in public and internal constructors should be consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is it fine to make the rest of the parameters descriptive as I have done or do you prefer the previous short names e.g. offset (old) vs offsetPixelColor (new) ?

@Ivanidzo4ka

@@ -336,7 +336,7 @@ protected override Delegate MakeGetter(DataViewRow input, int iinfo, Func<int, b

var type = _types[iinfo];
var ex = _parent._columns[iinfo];
bool needScale = ex.Offset != 0 || ex.Scale != 1;
bool needScale = ex.OffsetPixelColor != 0 || ex.ScalePixelColor != 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OffsetPixelColor [](start = 36, length = 16)

is this correct? the offset are for the image, not for the pixel color? (one can argue that the pixel colors are the image, but i think it becomes too complicated. )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Changing to ScaleImage and OffsetImage.

/// <param name="defaultGreen">Default value for grenn color, would be overriden if <paramref name="colors"/> contains <see cref="ImagePixelExtractingEstimator.ColorBits.Green"/>.</param>
/// <param name="defaultBlue">Default value for blue color, would be overriden if <paramref name="colors"/> contains <see cref="ImagePixelExtractingEstimator.ColorBits.Blue"/>.</param>
/// <param name="scalePixelColor">Scale color pixel value by this amount.</param>
/// <param name="offsetPixelCOlor">Offset color pixel value by this amount.</param>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a note below on whether those are correct. Offseting and scaling pixel color makes little sense to me, they make more sense in the image context.

Copy link
Member

@sfilipi sfilipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@najeeb-kazmi najeeb-kazmi merged commit 24f782d into dotnet:master Mar 11, 2019
@najeeb-kazmi najeeb-kazmi deleted the 2833 branch January 30, 2020 01:19
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants