Skip to content

Metadata utils internalization, migration of few useful methods #2651

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 20, 2019

Conversation

TomFinley
Copy link
Contributor

@TomFinley TomFinley commented Feb 20, 2019

Fixes #2622 by limiting the public surface area of metadata/annotations conveniences to a few places.

Note that that migrated methods are not exactly the same as the existing methods in the case of querying whether a key column type exists, since the existing methods predated the schema column type and therefore had some unnecessary parameters. Though I decided to expand it a bit so as to allow checks on whether other types of key values are present. Not sure if that's necessary or helpful.

Also internalized the rather innocuous but nonetheless probably unnecessary metadata builder extensions.

Commits structured incrementally as usual for the small steps.

@TomFinley TomFinley added the API Issues pertaining the friendly API label Feb 20, 2019
@codecov
Copy link

codecov bot commented Feb 20, 2019

Codecov Report

Merging #2651 into master will decrease coverage by <.01%.
The diff coverage is 82.92%.

@@            Coverage Diff             @@
##           master    #2651      +/-   ##
==========================================
- Coverage   71.54%   71.54%   -0.01%     
==========================================
  Files         800      801       +1     
  Lines      141848   141850       +2     
  Branches    16119    16120       +1     
==========================================
- Hits       101482   101481       -1     
- Misses      35916    35917       +1     
- Partials     4450     4452       +2
Flag Coverage Δ
#Debug 71.54% <82.92%> (-0.01%) ⬇️
#production 67.83% <82.92%> (-0.01%) ⬇️
#test 85.73% <ø> (ø) ⬆️
Impacted Files Coverage Δ
...icrosoft.ML.Core/Data/MetadataBuilderExtensions.cs 100% <ø> (ø) ⬆️
...Microsoft.ML.Data/Transforms/MetadataDispatcher.cs 61.86% <ø> (ø) ⬆️
...icrosoft.ML.Data/EntryPoints/PredictorModelImpl.cs 72.22% <0%> (ø) ⬆️
...rc/Microsoft.ML.Data/Transforms/InvertHashUtils.cs 96.06% <100%> (ø) ⬆️
src/Microsoft.ML.Data/Evaluators/EvaluatorUtils.cs 73.59% <100%> (ø) ⬆️
src/Microsoft.ML.Transforms/Text/NgramTransform.cs 87.74% <100%> (ø) ⬆️
src/Microsoft.ML.Core/Data/MetadataUtils.cs 82.29% <50%> (-0.58%) ⬇️
...L.Data/Evaluators/MultiClassClassifierEvaluator.cs 77.69% <80%> (+0.03%) ⬆️
...rosoft.ML.Data/Data/SchemaAnnotationsExtensions.cs 88% <88%> (ø)
...StandardLearners/Standard/LinearModelParameters.cs 60.63% <0%> (-0.27%) ⬇️
... and 3 more

@@ -998,16 +998,17 @@ private protected override IEnumerable<string> GetPerInstanceColumnsToSave(RoleM
// Multi-class evaluator adds four per-instance columns: "Assigned", "Top scores", "Top classes" and "Log-loss".
private protected override IDataView GetPerInstanceMetricsCore(IDataView perInst, RoleMappedSchema schema)
{
// If the label column is a key without key values, convert it to I8, just for saving the per-instance
// If the label column is a key without text key values, convert it to I8, just for saving the per-instance

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that the code actually converts to R8, not I8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh. Delightful... seems a pity to update such a simple PR for an internal class. How about this, I'll locally fix it, and if for some other reason I need to update the PR it'll go in, otherwise it'll just go into my next PR.

Copy link

@yaeldekel yaeldekel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@shauheen shauheen added this to the 0219 milestone Feb 20, 2019
@TomFinley TomFinley requested a review from wschin February 20, 2019 21:12
Copy link
Member

@codemzs codemzs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

/// </summary>
public static bool IsNormalized(this DataViewSchema.Column column)
{
var metaColumn = column.Metadata.Schema.GetColumnOrNull((MetadataUtils.Kinds.IsNormalized));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(super nit): double params here. Can remove one set.

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:


[BestFriend]
internal static bool HasKeyValues(this SchemaShape.Column col)
public static bool HasKeyValues(this SchemaShape.Column col)
Copy link
Member

@eerhardt eerhardt Feb 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my information - what's the difference between this method and the public static bool HasKeyValues in SchemaAnnotationsExtensions?

I can see they contain different code, but they are named the same...?

@TomFinley TomFinley merged commit bd64b88 into dotnet:master Feb 20, 2019
@TomFinley TomFinley deleted the Meta branch February 20, 2019 23:15
@ghost ghost locked as resolved and limited conversation to collaborators Mar 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Issues pertaining the friendly API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants