
Suggestion - Make Machine Learning Models explainable by design with ML.NET #511


Closed
tauheedul opened this issue Jul 9, 2018 · 7 comments
Labels
enhancement (New feature or request) · usability (Smoothing user interaction or experience)

Comments

@tauheedul
Contributor

tauheedul commented Jul 9, 2018

It's often difficult to understand how machine learning applications come to a decision. Some developers reuse model samples without knowing how they work, and the models are considered a black box by many.

This is an opportunity for ML.NET to stand out and automatically make models explainable.

  • The ML.NET framework could keep a stack trace of some kind that maintains an audit of decisions
  • Including how confident it was in each decision (a rating or percentage)
  • With a fairness rating, evaluating the bias contained in the data supplied to the model
  • This could be output to the application on request, much like you can output the stack trace of an Exception
  • Extend these peek abilities in Visual Studio so you can inspect what third-party models are doing (just like ReSharper's decompile capabilities with libraries)

A framework that automatically keeps a self-audit of decisions would be way ahead of the rest and could help developers understand what the model is doing under the hood, especially if they are relying on models supplied by third parties.

This could boost the development of ML using ML.NET and is exactly the kind of thing that made .NET such an easy framework to work with.

@tauheedul
Contributor Author

This issue is about ML.NET keeping an audit of decisions and evaluating the bias/fairness and the scores given by ML.NET.

The following related issues raise some of the points I've highlighted, but not all:
#599 - Feature importance with ML.NET
#695 - Enable scoring of ONNX models in ML.NET
#696 - Enable scoring of TensorFlow models with ML.NET

@Ivanidzo4ka added the enhancement (New feature or request) and usability (Smoothing user interaction or experience) labels on Oct 18, 2018
@shauheen
Contributor

shauheen commented Dec 6, 2018

@rogancarr can you please look into this issue?

@rogancarr
Contributor

rogancarr commented Dec 18, 2018

Hi @tauheedul,

Thanks for the suggestions. There are a lot of great ideas in here. Let me try to decompose them into specific issues.

[ML.NET] could keep a stack trace of some kind that keeps an audit of decisions
Including how confident it was in that decision (a rating or percentage)

For classification, we have probabilistic classifiers that already satisfy this need (e.g. 95% probability of class 1). For other types of models, it is possible to build predictions with confidence intervals, but that currently requires a lot of hand tuning and statistical training. It could be interesting to provide support for this in general.
Ask: Include confidence intervals for non-probabilistic scores. (#1906)
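
For the classification case that already works today, the calibrated probability is just another column on the prediction object. A minimal sketch, assuming an already-trained binary-classification `model`; the `LoanApplication`/`LoanPrediction` types are made up for illustration:

```csharp
// Minimal sketch (not from this thread): for an ML.NET binary classifier,
// the calibrated probability is already part of the prediction output.
// LoanApplication/LoanPrediction are made-up illustrative types.
using Microsoft.ML;
using Microsoft.ML.Data;

public class LoanApplication
{
    public float Income { get; set; }
    public float DebtRatio { get; set; }
}

public class LoanPrediction
{
    [ColumnName("PredictedLabel")]
    public bool Approved { get; set; }

    public float Probability { get; set; } // e.g. 0.95 => 95% confidence in the positive class
    public float Score { get; set; }       // raw, uncalibrated score
}

public static class Demo
{
    // Assumes `model` is an already-trained binary-classification pipeline.
    public static void Explain(MLContext mlContext, ITransformer model)
    {
        var engine = mlContext.Model.CreatePredictionEngine<LoanApplication, LoanPrediction>(model);
        var result = engine.Predict(new LoanApplication { Income = 52000f, DebtRatio = 0.31f });
        System.Console.WriteLine($"Approved: {result.Approved} (probability {result.Probability:P1})");
    }
}
```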

  • With a fairness rating, evaluating the bias contained in the data supplied to the model
  • This could be output to the application upon request. Much like you can output a trace of an Exception.

Ask: Add metadata to the model with bias and fairness metrics for the training dataset.
- Requires metadata fields in the model (#1915)
- Requires tools for measuring bias metrics over datasets (#1913)

This is helpful for analysis when developing the model, but the bias in the training data doesn't always translate evenly to the bias and fairness of the models themselves, so what we probably actually want is:
Ask: Add metadata with measures of bias and fairness in the models themselves (#1912)
- Requires bias & fairness evaluation metrics (#1911)
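
As a rough illustration of one dataset-level bias metric mentioned above, something like demographic parity difference (the gap in positive-outcome rates between two groups) could be computed over the training data. This is a hypothetical sketch, not an existing ML.NET API (that is the ask in #1913); `LabeledRow` and its fields are made-up names:

```csharp
// Hypothetical sketch of one dataset-level bias metric: demographic parity
// difference, the gap in positive-outcome rates between two groups in the
// training data. Not an existing ML.NET API (that is the ask in #1913);
// LabeledRow and its fields are made-up names.
using System;
using System.Collections.Generic;
using System.Linq;

public record LabeledRow(string Group, bool PositiveLabel);

public static class BiasMetrics
{
    // Returns |P(positive | groupA) - P(positive | groupB)|; 0 means parity.
    public static double DemographicParityDifference(
        IReadOnlyCollection<LabeledRow> rows, string groupA, string groupB)
    {
        double RateFor(string group)
        {
            var inGroup = rows.Where(r => r.Group == group).ToList();
            return inGroup.Count == 0
                ? 0.0
                : inGroup.Count(r => r.PositiveLabel) / (double)inGroup.Count;
        }

        return Math.Abs(RateFor(groupA) - RateFor(groupB));
    }
}
```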

And of course, we probably also want the usual set of evaluation metrics as well:
Ask: Add metadata with evaluation metrics in the models themselves (#1908)
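
For reference, this is the evaluation path that exists today; the ask is for its output to travel with the saved model rather than living only in the training code. A small sketch, assuming a trained `model`, a held-out `testData` view, and a boolean "Label" column:

```csharp
// Sketch of the evaluation path that exists today; the ask in #1908 is for
// these numbers to travel with the saved model instead of living only in
// the training code.
using System;
using Microsoft.ML;

public static class ModelReport
{
    public static void PrintMetrics(MLContext mlContext, ITransformer model, IDataView testData)
    {
        var predictions = model.Transform(testData);
        var metrics = mlContext.BinaryClassification.Evaluate(predictions, labelColumnName: "Label");
        Console.WriteLine($"AUC={metrics.AreaUnderRocCurve:F3}  F1={metrics.F1Score:F3}  Accuracy={metrics.Accuracy:F3}");
    }
}
```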

And implicit in the discussion is this idea that people are handing off models to each other, so we may want a way of signing a model to guarantee that the metadata and model parameters haven't been modified.
Ask: Add the ability to sign models and guarantee that model parameters and metadata have not been modified. (#1916)
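
Nothing like this exists in ML.NET today, but as a rough illustration of the idea, signing the bytes of a saved model file with standard .NET crypto would already let a consumer detect tampering. A hypothetical sketch (the `ModelSigner` class is made up; only the crypto calls are standard .NET):

```csharp
// Hypothetical sketch of what "signing a model" could mean, using only
// standard .NET crypto rather than any ML.NET API: sign the bytes of the
// saved model file so that tampering with parameters or metadata can be
// detected against the publisher's public key.
using System.IO;
using System.Security.Cryptography;

public static class ModelSigner
{
    public static byte[] Sign(string modelPath, RSA privateKey) =>
        privateKey.SignData(File.ReadAllBytes(modelPath),
            HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);

    public static bool Verify(string modelPath, byte[] signature, RSA publicKey) =>
        publicKey.VerifyData(File.ReadAllBytes(modelPath), signature,
            HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
}
```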

Extend these peek abilities in Visual Studio so you can inspect what third-party models are doing (just like ReSharper's decompile capabilities with libraries)

Ask: Allow VS to pull out metadata on the model and display it during editing. (Attached to #1915)

… keeps an audit of decisions …

Ask: Allow runtime logging and auditing.
I think this is out of scope for ML.NET, but it would be nice in general to have a framework for logging and analysis of the output of ML models once they are deployed.
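
Even if it stays out of scope for ML.NET itself, this can be done in application code today by wrapping the prediction call. A hedged sketch, where `PredictionEngine` is real ML.NET but `AuditingPredictor` and the logging sink are hypothetical application code:

```csharp
// Hedged sketch of an application-side audit wrapper: every prediction is
// logged with a timestamp, its input, and its output. PredictionEngine is
// real ML.NET; AuditingPredictor and the logging sink are hypothetical
// application code, not part of the framework.
using System;
using Microsoft.ML;

public class AuditingPredictor<TInput, TOutput>
    where TInput : class
    where TOutput : class, new()
{
    private readonly PredictionEngine<TInput, TOutput> _engine;
    private readonly Action<string> _auditSink; // e.g. append to a log store

    public AuditingPredictor(PredictionEngine<TInput, TOutput> engine, Action<string> auditSink)
    {
        _engine = engine;
        _auditSink = auditSink;
    }

    public TOutput Predict(TInput input)
    {
        var output = _engine.Predict(input);
        _auditSink($"{DateTime.UtcNow:o} input={input} output={output}");
        return output;
    }
}
```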

I think this is a great set of feature requests, both for debugging and explainability and for the bigger concepts of bias, fairness, accountability, and transparency in ML. I'll file separate issues on each of these.

@rogancarr
Contributor

@tauheedul Do these issues represent what you have proposed, or have I missed anything?

@tauheedul
Contributor Author

tauheedul commented Dec 19, 2018

@tauheedul Do these issues represent what you have proposed, or have I missed anything?

Hello @rogancarr, thank you for turning these into actionable work items. They cover the suggestions perfectly for classical machine learning models.

From the perspective of a deep learning model, e.g. a TensorFlow or ONNX model, would it be possible to peek into the neural network nodes? (Not only a high-level feature importance score, but a view into what each node is doing, e.g. feature importance at the level of each node.)

In the Visual Studio IDE we could have a model shown as a tree of neurons (maybe thousands); each neuron could be expanded and the evaluation of what that node is doing could be inspected.

This would open up the opportunity for developers to create custom graphs for visual feedback and to monitor the progression of a decision in real time.

It would let you trace which neurons fire in any given execution of the model, how a given set of data impacted the score, and at what stage it changed.
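
For TensorFlow models specifically, a partial step in this direction is already possible: ML.NET's TensorFlow scorer can surface intermediate tensors as extra output columns if you know their names in the graph. A hedged sketch, assuming the Microsoft.ML.TensorFlow package; the node names "input", "hidden1/Relu", and "output" are made up for illustration:

```csharp
// Hedged sketch: ML.NET's TensorFlow scorer (Microsoft.ML.TensorFlow) can
// surface intermediate tensors as extra output columns if you know their
// graph names. The node names "input", "hidden1/Relu", and "output" are
// made up for illustration.
using Microsoft.ML;

var mlContext = new MLContext();
var tfModel = mlContext.Model.LoadTensorFlowModel("frozen_model.pb");

// Request a hidden layer's activations alongside the final output column.
var pipeline = tfModel.ScoreTensorFlowModel(
    outputColumnNames: new[] { "hidden1/Relu", "output" },
    inputColumnNames: new[] { "input" },
    addBatchDimensionInput: true);
// Rows scored by this pipeline carry the hidden activations, which a tool
// or a custom chart could visualize per prediction.
```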

Michał Łopuszyński (Data Scientist, University of Warsaw) has compiled a list of excellent resources covering these topics:
https://github.com/lopusz/awesome-interpretable-machine-learning

@rogancarr
Contributor

@tauheedul Thanks for the great feedback on this. I'm passing your comments on visualization to @rustd for ModelBuilder.

I'll close this issue now. Feel free to re-open if you want to discuss more.

@tauheedul
Contributor Author

@tauheedul Thanks for the great feedback on this. I'm passing your comments on visualization to @rustd for ModelBuilder.

I'll close this issue now. Feel free to re-open if you want to discuss more.

Hi @rogancarr @rustd I have created a new issue for the visualization suggestions in #2879
Thank you!

@ghost ghost locked as resolved and limited conversation to collaborators Mar 30, 2022