Skip to content

Conversation

zeahmed
Copy link
Contributor

@zeahmed zeahmed commented May 7, 2018

Fixing issue #31 in this PR.

/// <summary>
/// LearningPipeline class is used to define the steps needed to perform desired machine learning task.<para/>
/// The steps are defined by adding a data loader (e.g. <see cref="TextLoader"/>) followed by zero or more transforms (e.g. <see cref="Microsoft.ML.Transforms.TextFeaturizer"/>)
/// and atmost one trainer/learner (e.g. <see cref="Microsoft.ML.Trainers.FastTreeBinaryClassifier"/>) in the pipeline.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type-o atmost should be 2 words.

/// The steps are defined by adding a data loader (e.g. <see cref="TextLoader"/>) followed by zero or more transforms (e.g. <see cref="Microsoft.ML.Transforms.TextFeaturizer"/>)
/// and atmost one trainer/learner (e.g. <see cref="Microsoft.ML.Trainers.FastTreeBinaryClassifier"/>) in the pipeline.
///
/// Data can be analyzed at every step by inspecting the LearningPipeline object in VS.Net debugger.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this line provides much value. Do you think we need it?

/// and atmost one trainer/learner (e.g. <see cref="Microsoft.ML.Trainers.FastTreeBinaryClassifier"/>) in the pipeline.
///
/// Data can be analyzed at every step by inspecting the LearningPipeline object in VS.Net debugger.
/// <example>
Copy link
Member

@eerhardt eerhardt May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/xmldoc/example it shows that <example> is not nested inside the <summary> section. Instead, it is its own top-level section. #Resolved

/// </summary>
/// <typeparam name="TInput">Type of data instances the model will be trained on. It's a custom type defined by the user according to the structure of data.
/// <para/>
/// E.g. please see "Microsoft.ML.Scenarios.ScenarioTests.SentimentData" in "Microsoft.ML.Tests.csproj" for input type definition for sentiment classification task.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think typical users will have access to the repo. So putting a reference like this in the XML ref docs wouldn't help that much. Maybe a FWLink to the 'getting started' doc instead?

/// <summary>
/// Check if a specific loader/transform/trainer is in the pipeline?
/// </summary>
/// <param name="item">Any ML component (data loader, transform or trainer) defined as ILearningPipelineItem.</param>
Copy link
Contributor

@TomFinley TomFinley May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ILearningPipelineItem [](start = 95, length = 21)

When having type names in documentation, it may be good practice for us to use the <see tags, so this would become <see cref="ILearningPipelineItem" />. The reason is: the XML docs will be generated with links, and attempts to, say, rename the type in VS will track these usages as well. #Closed

/// Copy the pipeline items into an array.
/// </summary>
/// <param name="array">Array the items are copied to.</param>
/// <param name="arrayIndex">Index to start copying from.</param>
Copy link
Contributor

@TomFinley TomFinley May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Index to start copying from [](start = 37, length = 27)

The phrase "Index to start copying from", the "from" suggests that this is an index on the source, but rather, it is an index on the destination.

Perhaps the word "into" vs. "from" would be more clear... plus an explicit paramref, just to make things absolutely crystal clear. So something like maybe, Index of <paramref name="array" /> we start copying to. I imagine it could be wordsmithed better. :) #Closed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I typically copy the MSDN docs for comments like this. See ICollection.CopyTo

array
Type: System.Array
The one-dimensional Array that is the destination of the elements copied from ICollection. The Array must have zero-based indexing.

index
Type: System.Int32
The zero-based index in array at which copying begins.



/// <summary>
/// LearningPipeline class is used to define the steps needed to perform desired machine learning task.<para/>
Copy link
Contributor

@TomFinley TomFinley May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perform desired [](start = 69, length = 15)

"perform desired" => "perform a desired"? #Closed



/// <summary>
/// LearningPipeline class is used to define the steps needed to perform desired machine learning task.<para/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LearningPipeline [](start = 8, length = 16)

Class names should appear in <see tags`. However in this case, given that this is describing this class, I might just replace "LearningPipeline class is used" with "This class is used..."

private List<ILearningPipelineItem> Items { get; } = new List<ILearningPipelineItem>();

/// <summary>
/// Construct an empty LearningPipeline object.
Copy link
Contributor

@TomFinley TomFinley May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LearningPipeline [](start = 31, length = 16)

Generally use <see here. #Closed

@TomFinley
Copy link
Contributor

TomFinley commented May 7, 2018

Thank you very much for writing code on this important class @zeahmed ! #Closed

/// Check if a specific loader/transform/trainer is in the pipeline?
/// </summary>
/// <param name="item">Any ML component (data loader, transform or trainer) defined as ILearningPipelineItem.</param>
/// <returns>true/false</returns>
Copy link
Contributor

@TomFinley TomFinley May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returns>true/false</returns [](start = 13, length = 27)

That the return value is either true or false is implied by the return value being bool. Can this be improved somehow? It seems like any "improvement" would just involve restating what what already written in the description, so is it acceptable in situations like this to just not include a <returns> tag? #Closed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I typically copy the MSDN docs: https://msdn.microsoft.com/en-us/library/k5cf1d56(v=vs.110).aspx

Return Value
Type: System.Boolean
true if item is found in the ICollection; otherwise, false.

public void Add(ILearningPipelineItem item) => Items.Add(item);

/// <summary>
/// Remove all the transforms/trainers from the pipeline.
Copy link
Contributor

@TomFinley TomFinley May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transforms/trainers [](start = 27, length = 19)

Not loaders? It seems like this would also clear any loaders. #Closed

@zeahmed
Copy link
Contributor Author

zeahmed commented May 8, 2018

Thanks for the feedback! I updated the PR.

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks, @zeahmed!



/// <summary>
/// <see cref="LearningPipeline"/> class is used to define the steps needed to perform a desired machine learning task.<para/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class [](start = 8, length = 36)

I'd really prefer "This class", but if you really want the actual name of the class referenced, for grammar reasons maybe prefix with "The" so that it's The LearningPipeline class" vs. "LearningPipeline class."

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@eerhardt eerhardt merged commit 3e5b5ed into dotnet:master May 9, 2018
@ghost ghost locked as resolved and limited conversation to collaborators Mar 31, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants