XML doc for LpNormalizingEstimator and GlobalContrastNormalizingEstimator #3454

Merged
merged 2 commits into from
Apr 22, 2019
86 changes: 72 additions & 14 deletions src/Microsoft.ML.Transforms/GcnTransform.cs
@@ -35,17 +35,7 @@
namespace Microsoft.ML.Transforms
{
/// <summary>
/// Lp-Norm (vector/row-wise) normalization transform. Has the following two set of arguments:
/// 1- Lp-Norm normalizer arguments:
/// Normalize rows individually by rescaling them to unit norm (L2, L1 or LInf).
/// Performs the following operation on a vector X:
/// Y = (X - M) / D, where M is mean and D is either L2 norm, L1 norm or LInf norm.
/// Scaling inputs to unit norms is a common operation for text classification or clustering.
/// 2- Global contrast normalization (GCN) arguments:
/// Performs the following operation on a vector X:
/// Y = (s * X - M) / D, where s is a scale, M is mean and D is either L2 norm or standard deviation.
/// Usage examples and Matlab code:
/// <a href="https://www.cs.stanford.edu/~acoates/papers/coatesleeng_aistats_2011.pdf">https://www.cs.stanford.edu/~acoates/papers/coatesleeng_aistats_2011.pdf</a>.
/// <see cref="ITransformer"/> resulting from fitting a <see cref="LpNormNormalizingEstimator"/> or <see cref="GlobalContrastNormalizingEstimator"/>.
/// </summary>
public sealed class LpNormNormalizingTransformer : OneToOneTransformerBase
{
@@ -641,7 +631,7 @@ public static CommonOutputs.TransformOutput GcNormalize(IHostEnvironment env, Lp
}

/// <summary>
/// Base estimator class for LpNorm and Gcn normalizers.
/// Base estimator class for <see cref="LpNormNormalizingEstimator"/> and <see cref="GlobalContrastNormalizingEstimator"/> normalizers.
/// </summary>
public abstract class LpNormNormalizingEstimatorBase : TrivialEstimator<LpNormNormalizingTransformer>
{
@@ -805,8 +795,53 @@ public override SchemaShape GetOutputSchema(SchemaShape inputSchema)
}

/// <summary>
/// Lp Normalizing estimator takes columns and normalizes them individually by rescaling them to unit norm.
/// Normalizes (scales) vectors in the input column to the unit norm. The type of norm that is used can be specified by the user.
/// </summary>
/// <remarks>
/// <format type="text/markdown"><![CDATA[
///
/// ### Estimator Characteristics
/// | | |
/// | -- | -- |
/// | Does this estimator need to look at the data to train its parameters? | No |
/// | Input column data type | Vector of <xref:System.Single> |
/// | Output column data type | Vector of <xref:System.Single> |
///
///
/// The resulting <xref:Microsoft.ML.Transforms.LpNormNormalizingTransformer> normalizes vectors in the input column individually
/// by rescaling them to the unit norm.
///
/// Let $x$ be the input vector, $n$ the size of the vector, $L(x)$ the norm function selected by the user.
/// Let $\mu(x) = \sum_i x_i / n$ be the mean of the values of vector $x$. The <xref:Microsoft.ML.Transforms.LpNormNormalizingTransformer>
/// performs the following operation on each input vector $x$:
///
/// $y = \frac{x - \mu(x)}{L(x)}$
@artidoro (Contributor Author) commented on Apr 21, 2019:

I am not sure if I should use the double dollar sign here? #Resolved

@najeeb-kazmi (Member) commented on Apr 21, 2019:

This looks fine to me, but the following may be better:

\begin{equation*}
<equation without $ signs>
\end{equation*}

P.S. Don't use $$, as it is plain TeX and may not render in LaTeX.


In reply to: 277183775 [](ancestors = 277183775)

Member

Same for the equations for the other norm functions


In reply to: 277185443 [](ancestors = 277185443,277183775)

@najeeb-kazmi (Member) commented on Apr 21, 2019:

I think the \begin{equation*} would need the amsmath package available wherever the LaTeX is being rendered so maybe use \[ and \] around the equation instead (only if you decide to replace $, which I don't think is necessary). #Resolved

@natke (Contributor) commented on Apr 21, 2019:

This is in a markdown block, so I think you need the $ $ #Resolved

@artidoro (Contributor Author) commented on Apr 22, 2019:

But single ($equation$) or double ($$equation$$)?

I would like those to be centered. But it is a detail. #Resolved

///
/// if the user specifies that the mean should be zero, or otherwise:
///
/// $y = \frac{x}{L(x)}$
///
/// There are four types of norm that can be selected by the user to be applied on input vector $x$. They are defined as follows:
/// - <xref:Microsoft.ML.Transforms.LpNormNormalizingEstimatorBase.NormFunction.L1>
///
/// $L_1(x) = \sum_i |x_i|$
///
/// - <xref:Microsoft.ML.Transforms.LpNormNormalizingEstimatorBase.NormFunction.L2>
///
/// $L_2(x) = \sqrt{\sum_i x_i^2}$
///
/// - <xref:Microsoft.ML.Transforms.LpNormNormalizingEstimatorBase.NormFunction.Infinity>
///
/// $L_{\infty}(x) = \max_i\{|x_i|\}$
///
/// - <xref:Microsoft.ML.Transforms.LpNormNormalizingEstimatorBase.NormFunction.StandardDeviation>
///
/// $L_\sigma(x)$ is defined as the standard deviation of the elements of the input vector $x$
///
/// ]]>
/// </format>
/// </remarks>
/// <seealso cref="NormalizationCatalog.NormalizeLpNorm(TransformsCatalog, string, string, LpNormNormalizingEstimatorBase.NormFunction, bool)"/>
public sealed class LpNormNormalizingEstimator : LpNormNormalizingEstimatorBase
{
/// <summary>
@@ -861,8 +896,31 @@ internal LpNormNormalizingEstimator(IHostEnvironment env, params ColumnOptions[]
}

/// <summary>
/// Global contrast normalizing estimator takes columns and performs global constrast normalization.
/// Normalizes (scales) vectors in the input column applying the global contrast normalization.
/// </summary>
/// <remarks>
/// <format type="text/markdown"><![CDATA[
///
/// ### Estimator Characteristics
/// | | |
/// | -- | -- |
/// | Does this estimator need to look at the data to train its parameters? | No |
/// | Input column data type | Vector of <xref:System.Single> |
/// | Output column data type | Vector of <xref:System.Single> |
///
///
/// The resulting <xref:Microsoft.ML.Transforms.LpNormNormalizingTransformer> normalizes vectors in the input column individually,
/// rescaling them by applying global contrast normalization. The transform performs the following operation on each input vector $x$:
///
/// $y = \frac{s * x - \mu(x)}{L(x)}$
///
/// Where $s$ is a user provided scaling factor, $\mu(x)$ is the mean of the elements of vector $x$, and $L(x)$ is the $L_2$ norm or the
/// standard deviation of the elements of vector $x$. These settings can be specified by the user when the
/// <xref:Microsoft.ML.Transforms.GlobalContrastNormalizingEstimator> is initialized.
/// ]]>
/// </format>
/// </remarks>
/// <seealso cref="NormalizationCatalog.NormalizeGlobalContrast(TransformsCatalog, string, string, bool, bool, float)"/>
public sealed class GlobalContrastNormalizingEstimator : LpNormNormalizingEstimatorBase
{
/// <summary>
36 changes: 15 additions & 21 deletions src/Microsoft.ML.Transforms/NormalizerCatalog.cs
@@ -280,20 +280,17 @@ internal static NormalizingEstimator Normalize(this TransformsCatalog catalog,
=> new NormalizingEstimator(CatalogUtils.GetEnvironment(catalog), columns);

/// <summary>
/// Takes column filled with a vector of floats and normalize its <paramref name="norm"/> to one. By setting <paramref name="ensureZeroMean"/> to <see langword="true"/>,
/// a pre-processing step would be applied to make the specified column's mean be a zero vector.
/// Normalizes (scales) vectors in the input column to the unit norm. The type of norm that is used is defined by <paramref name="norm"/>.
/// Setting <paramref name="ensureZeroMean"/> to <see langword="true"/>, will apply a pre-processing step to make the specified column's mean be a zero vector.
/// </summary>
/// <param name="catalog">The transform's catalog.</param>
/// <param name="outputColumnName">Name of the column resulting from the transformation of <paramref name="inputColumnName"/>.
/// The data type on this column is the same as the input column.</param>
/// <param name="inputColumnName">Name of column to transform. If set to <see langword="null"/>, the value of the <paramref name="outputColumnName"/> will be used as source.</param>
/// <param name="norm">Type of norm to use to normalize each sample. The indicated norm of the resulted vector will be normalized to one.</param>
/// This column's data type will be the same as the input column's data type.</param>
/// <param name="inputColumnName">Name of the column to normalize. If set to <see langword="null"/>, the value of the
/// <paramref name="outputColumnName"/> will be used as source.
/// This estimator operates over known-sized vectors of <see cref="System.Single"/>.</param>
/// <param name="norm">Type of norm to use to normalize each sample. The indicated norm of the resulting vector will be normalized to one.</param>
/// <param name="ensureZeroMean">If <see langword="true"/>, subtract mean from each value before normalizing and use the raw input otherwise.</param>
/// <remarks>
/// This transform performs the following operation on a each row X: Y = (X - M(X)) / D(X)
/// where M(X) is scalar value of mean for all elements in the current row if <paramref name="ensureZeroMean"/>set to <see langword="true"/> or <value>0</value> othewise
/// and D(X) is scalar value of selected <paramref name="norm"/>.
/// </remarks>
/// <example>
/// <format type="text/markdown">
/// <![CDATA[
@@ -315,22 +312,19 @@ internal static LpNormNormalizingEstimator NormalizeLpNorm(this TransformsCatalo
=> new LpNormNormalizingEstimator(CatalogUtils.GetEnvironment(catalog), columns);

/// <summary>
/// Takes column filled with a vector of floats and computes global contrast normalization of it. By setting <paramref name="ensureZeroMean"/> to <see langword="true"/>,
/// a pre-processing step would be applied to make the specified column's mean be a zero vector.
/// Normalizes columns individually applying global contrast normalization. Setting <paramref name="ensureZeroMean"/> to <see langword="true"/>,
/// will apply a pre-processing step to make the specified column's mean be the zero vector.
/// </summary>
/// <param name="catalog">The transform's catalog.</param>
/// <param name="outputColumnName">Name of the column resulting from the transformation of <paramref name="inputColumnName"/>.
/// The data type on this column is the same as the input column.</param>
/// <param name="inputColumnName">Name of column to transform. If set to <see langword="null"/>, the value of the <paramref name="outputColumnName"/> will be used as source.</param>
/// This column's data type will be the same as the input column's data type.</param>
/// <param name="inputColumnName">Name of the column to normalize. If set to <see langword="null"/>, the value of the
/// <paramref name="outputColumnName"/> will be used as source.
/// This estimator operates over known-sized vectors of <see cref="System.Single"/>.</param>
/// <param name="ensureZeroMean">If <see langword="true"/>, subtract mean from each value before normalizing and use the raw input otherwise.</param>
/// <param name="ensureUnitStandardDeviation">If <see langword="true"/>, resulted vector's standard deviation would be one. Otherwise, resulted vector's L2-norm would be one.</param>
/// <param name="ensureUnitStandardDeviation">If <see langword="true"/>, the resulting vector's standard deviation would be one.
/// Otherwise, the resulting vector's L2-norm would be one.</param>
/// <param name="scale">Scale features by this value.</param>
/// <remarks>
/// This transform performs the following operation on a row X: Y = scale * (X - M(X)) / D(X)
/// where M(X) is scalar value of mean for all elements in the current row if <paramref name="ensureZeroMean"/>set to <see langword="true"/> or <value>0</value> othewise
/// D(X) is scalar value of standard deviation for row if <paramref name="ensureUnitStandardDeviation"/> set to <see langword="true"/> or
/// L2 norm of this row vector if <paramref name="ensureUnitStandardDeviation"/> set to <see langword="false"/> and scale is <paramref name="scale"/>.
/// </remarks>
/// <example>
/// <format type="text/markdown">
/// <![CDATA[
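As a quick sanity check of the norm definitions added to the LpNormNormalizingEstimator remarks, here is a small worked example. The input vector $x = (3, -4)$ is chosen purely for illustration and does not appear in the PR. It covers only the case where the mean is not subtracted, $y = x / L(x)$; the standard-deviation norm and the zero-mean variant are left out because their exact computation is not spelled out in the text above.

```latex
\[
x = (3, -4), \qquad
L_1(x) = |3| + |-4| = 7, \qquad
L_2(x) = \sqrt{3^2 + (-4)^2} = 5, \qquad
L_{\infty}(x) = \max\{|3|, |-4|\} = 4
\]
\[
\frac{x}{L_1(x)} = \left(\tfrac{3}{7}, -\tfrac{4}{7}\right), \qquad
\frac{x}{L_2(x)} = (0.6, -0.8), \qquad
\frac{x}{L_{\infty}(x)} = (0.75, -1)
\]
```

In each case the selected norm of the output is one: $\tfrac{3}{7} + \tfrac{4}{7} = 1$, $\sqrt{0.36 + 0.64} = 1$, and $\max\{0.75, 1\} = 1$, which matches the "unit norm" wording in the new summary.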