Skip to content

Make ImageClassification API an ITrainerEstimator and refactor code. #4372

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Oct 25, 2019

Conversation

codemzs
Copy link
Member

@codemzs codemzs commented Oct 24, 2019

In addition to converting ImageClassification API to ITrainerEstimator it also fixes #4276

@codemzs codemzs requested a review from a team as a code owner October 24, 2019 10:04
@codecov
Copy link

codecov bot commented Oct 24, 2019

Codecov Report

Merging #4372 into master will increase coverage by <.01%.
The diff coverage is 89.09%.

@@            Coverage Diff             @@
##           master    #4372      +/-   ##
==========================================
+ Coverage   74.63%   74.64%   +<.01%     
==========================================
  Files         883      883              
  Lines      155117   155036      -81     
  Branches    16931    16911      -20     
==========================================
- Hits       115768   115723      -45     
+ Misses      34601    34569      -32     
+ Partials     4748     4744       -4
Flag Coverage Δ
#Debug 74.64% <89.09%> (ø) ⬆️
#production 70.2% <82.85%> (+0.01%) ⬆️
#test 89.56% <100%> (-0.02%) ⬇️
Impacted Files Coverage Δ
src/Microsoft.ML.Dnn/ImageClassificationTrainer.cs 93.12% <ø> (ø)
src/Microsoft.ML.Dnn/DnnUtils.cs 69.92% <100%> (ø) ⬆️
...rosoft.ML.Data/Scorers/PredictedLabelScorerBase.cs 81.71% <100%> (ø) ⬆️
...cenariosWithDirectInstantiation/TensorflowTests.cs 90.87% <100%> (-0.02%) ⬇️
src/Microsoft.ML.ImageAnalytics/ImageLoader.cs 85.78% <100%> (ø) ⬆️
....ML.Data/Scorers/MulticlassClassificationScorer.cs 60.39% <100%> (+0.26%) ⬆️
src/Microsoft.ML.Dnn/DnnCatalog.cs 100% <100%> (ø) ⬆️
...Microsoft.ML.Data/Scorers/PredictionTransformer.cs 91.05% <66.66%> (-1.08%) ⬇️
test/Microsoft.ML.AutoML.Tests/DatasetUtil.cs 79.06% <0%> (-6.98%) ⬇️
...L.AutoML/TrainerExtensions/TrainerExtensionUtil.cs 85.15% <0%> (-1.75%) ⬇️
... and 5 more
#Resolved

PredictedLabelColumnName = predictedLabelColumnName,
ValidationSet = validationSet
};
options.EarlyStoppingCriteria = options.DisableEarlyStopping ? null : options.EarlyStoppingCriteria ?? new ImageClassificationTrainer.EarlyStopping();
Copy link
Member

@eerhardt eerhardt Oct 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a comment on the previous PR that you said you would address in this PR.

#4362 (comment) #Closed

string featureColumnName = DefaultColumnNames.Features,
string scoreColumnName = DefaultColumnNames.Score,
string predictedLabelColumnName = DefaultColumnNames.PredictedLabel,
IDataView validationSet = null)
Copy link
Member

@eerhardt eerhardt Oct 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you were going to move the validationSet to be during the Fit method? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have and it works well when the trainer is the last item in the pipeline but if it is not then the pipeline becomes nasty and you will need to save the model twice. Most people would want to add KeyToValue Transform after this trainer to convert predictedLabel to string type for example .


In reply to: 338737502 [](ancestors = 338737502)

@eerhardt
Copy link
Member

eerhardt commented Oct 24, 2019

public class ImageClassificationTrainer :

public sealed class ? #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:68 in 9a01968. [](commit_id = 9a01968, deletion_comment = False)

@eerhardt
Copy link
Member

eerhardt commented Oct 24, 2019

    public delegate void ImageClassificationMetricsCallback(ImageClassificationMetrics metrics);

Do we need a new delegate here? Or can we instead use Action<ImageClassificationMetrics> ? #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:124 in 9a01968. [](commit_id = 9a01968, deletion_comment = False)

@eerhardt
Copy link
Member

eerhardt commented Oct 24, 2019

    public sealed class Options : TrainerInputBaseWithLabel

(nit) xml doc comments on the class. #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:336 in 9a01968. [](commit_id = 9a01968, deletion_comment = False)

@eerhardt
Copy link
Member

eerhardt commented Oct 24, 2019

        public IDataView ValidationSet;

Same comment here about the ValidationSet being passed into Fit, like the other algorithms in ML.NET. #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:426 in 9a01968. [](commit_id = 9a01968, deletion_comment = False)

@codemzs
Copy link
Member Author

codemzs commented Oct 25, 2019

        public IDataView ValidationSet;

see my previous comment. We have an api for Fit that takes both train and validation set.


In reply to: 546056311 [](ancestors = 546056311)


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:426 in 9a01968. [](commit_id = 9a01968, deletion_comment = False)

@eerhardt
Copy link
Member

eerhardt commented Oct 25, 2019

        };

(nit) unnecessary semi-colon. #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:311 in 72e0eee. [](commit_id = 72e0eee, deletion_comment = False)

@eerhardt
Copy link
Member

eerhardt commented Oct 25, 2019

            new Span<float>(classifier.ClassProbabilities, 0, _classCount).CopyTo(editor.Values);

I think this should be changed to pass the editor.Values into the classifier.Score method. Then we only need to copy the values a single time - from TensorFlow's native memory directly into the VBuffer.

The way we have it today, we are holding this extra float[] _classProbabilities as an intermediate step, which isn't necessary. #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:1427 in 72e0eee. [](commit_id = 72e0eee, deletion_comment = False)

@eerhardt
Copy link
Member

eerhardt commented Oct 25, 2019

        public void Score(VBuffer<byte> image)

(nit) should be in VBuffer<byte> image. #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:1407 in 72e0eee. [](commit_id = 72e0eee, deletion_comment = False)

@eerhardt
Copy link
Member

eerhardt commented Oct 25, 2019

public sealed class ImageClassificationModelParameters : ModelParametersBase<VBuffer<float>>, IValueMapper

xml docs on public types. #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:1281 in ea79b6f. [](commit_id = ea79b6f, deletion_comment = False)

@eerhardt
Copy link
Member

eerhardt commented Oct 25, 2019

    public DataViewType InputType => _inputType;

Do InputType and OutputType need to be public? #Resolved


Refers to: src/Microsoft.ML.Dnn/ImageClassificationTransform.cs:1328 in ea79b6f. [](commit_id = ea79b6f, deletion_comment = False)

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@codemzs codemzs merged commit b1788c3 into dotnet:master Oct 25, 2019
frank-dong-ms-zz pushed a commit to frank-dong-ms-zz/machinelearning that referenced this pull request Nov 4, 2019
…otnet#4372)

* Make ImageClassification API an ITrainerEstimator and refactor code.

* comments.

* comments.

* Fix premature session close.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.

* PR feedback.
@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Issues with ImageClassificationTransformer
2 participants