Skip to content

Update Permutation Feature Importance Samples #3247

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 10, 2019

Conversation

rogancarr
Copy link
Contributor

This PR updates the samples for PermutationFeatureImportance and adds samples for MulticlassClassification and Ranking tasks.

Fixes #3242

@codecov
Copy link

codecov bot commented Apr 8, 2019

Codecov Report

Merging #3247 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3247      +/-   ##
==========================================
- Coverage   72.62%   72.62%   -0.01%     
==========================================
  Files         807      807              
  Lines      145080   145080              
  Branches    16213    16213              
==========================================
- Hits       105369   105368       -1     
  Misses      35294    35294              
- Partials     4417     4418       +1
Flag Coverage Δ
#Debug 72.62% <ø> (-0.01%) ⬇️
#production 68.17% <ø> (-0.01%) ⬇️
#test 88.93% <ø> (ø) ⬆️
Impacted Files Coverage Δ
...ansforms/PermutationFeatureImportanceExtensions.cs 97.93% <ø> (ø) ⬆️
...StandardTrainers/Standard/LinearModelParameters.cs 60.05% <0%> (-0.27%) ⬇️
...rc/Microsoft.ML.Transforms/CustomMappingCatalog.cs 100% <0%> (ø) ⬆️
src/Microsoft.ML.Transforms/ExtensionsCatalog.cs 57.14% <0%> (ø) ⬆️

using System.Collections.Generic;
using System.Linq;

namespace Microsoft.ML.Samples.Dynamic.Trainers.BinaryClassification
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we tend to have namespace Samples.Dynamic #Resolved

// Fit the pipeline to the data.
var model = pipeline.Fit(data);

// Compute the permutation metrics for the linear model using the normalized data.
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Compute the permutation metrics for the linear model using the normalized data. [](start = 12, length = 82)

put it before var permutationMetrics =? #Resolved

Data.FeatureColumns[i],
linearPredictor.Model.SubModel.Weights[i],
auc[i].Mean,
1.96 * auc[i].StandardError);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.96 * auc[i].StandardError [](start = 20, length = 27)

const number magic! I'm scared and confused! #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic number to convert the interval to the 95%-percentile.


In reply to: 273720232 [](ancestors = 273720232)

// Expected output:
// Feature Model Weight Change in RMSE 95% Confidence in the Mean Change in RMSE
// Feature2 9.00 4.009 0.008304
// Feature1 4.48 1.901 0.003351
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would align Expected output and Feature.
One space instead of two.
Same across all files. #ByDesign

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two spaces for printed verbatim output is the style across the Samples project.


In reply to: 273720618 [](ancestors = 273720618)

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:


private static double Sigmoid(double x)
{
return 1.0 / (1.0 + Math.Exp(-1 * x));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you could remove the braces, since it's just one line, and use =>

public static readonly string[] FeatureColumns = new string[] { nameof(Feature1), nameof(Feature2) };

public static IEnumerable<Data> GenerateData(int nExamples = 10000,
double bias = 0, double weight1 = 1, double weight2 = 2, int seed = 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a one line explanation of how you are generating the data?


public float Feature1 { get; set; }

public float Feature2 { get; set; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually the class that defines the data does not do anything else. I like that pattern, because it makes it easy for users to understand what is happening.
I would suggest to move the GenerateData method outside of the Data class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly for Sigmoid.


In reply to: 273741337 [](ancestors = 273741337)

Copy link
Contributor

@artidoro artidoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, please do address the comments about the data generation code.

@rogancarr rogancarr merged commit e0c029c into dotnet:master Apr 10, 2019
@rogancarr rogancarr deleted the 3242_PFI_Docs branch April 10, 2019 22:44
rogancarr added a commit to rogancarr/machinelearning that referenced this pull request Apr 11, 2019
* Updating PFI Docs.

(cherry picked from commit e0c029c)
@ghost ghost locked as resolved and limited conversation to collaborators Mar 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants