Bring back support for ConfusionMatrix in the new API #2009

Closed
abgoswam opened this issue Jan 3, 2019 · 3 comments · Fixed by #3250
Labels: enhancement (New feature or request)

Comments

abgoswam (Member) commented Jan 3, 2019

While updating tests from the old API to the new API, we found that the new API does not support ConfusionMatrix in the evaluation output:

public MultiClassClassifierMetrics Evaluate(IDataView data, string label, string score, string predictedLabel)
{
    Host.CheckValue(data, nameof(data));
    Host.CheckNonEmpty(label, nameof(label));
    Host.CheckNonEmpty(score, nameof(score));
    Host.CheckNonEmpty(predictedLabel, nameof(predictedLabel));
    var roles = new RoleMappedData(data, opt: false,
        RoleMappedSchema.ColumnRole.Label.Bind(label),
        RoleMappedSchema.CreatePair(MetadataUtils.Const.ScoreValueKind.Score, score),
        RoleMappedSchema.CreatePair(MetadataUtils.Const.ScoreValueKind.PredictedLabel, predictedLabel));
    var resultDict = ((IEvaluator)this).Evaluate(roles);
    Host.Assert(resultDict.ContainsKey(MetricKinds.OverallMetrics));
    var overall = resultDict[MetricKinds.OverallMetrics];
    // ...

In the legacy API, there was support for ConfusionMatrix:

public static partial class Evaluate
{
    [TlcModule.EntryPoint(Name = "Models.ClassificationEvaluator", Desc = "Evaluates a multi class classification scored dataset.")]
    public static CommonOutputs.ClassificationEvaluateOutput MultiClass(IHostEnvironment env, MultiClassMamlEvaluator.Arguments input)
    {
        Contracts.CheckValue(env, nameof(env));
        var host = env.Register("EvaluateMultiClass");
        host.CheckValue(input, nameof(input));
        EntryPointUtils.CheckInputArgs(host, input);
        MatchColumns(host, input, out string label, out string weight, out string name);
        IMamlEvaluator evaluator = new MultiClassMamlEvaluator(host, input);
        var data = new RoleMappedData(input.Data, label, null, null, weight, name);
        var metrics = evaluator.Evaluate(data);
        var warnings = ExtractWarnings(host, metrics);
        var overallMetrics = ExtractOverallMetrics(host, metrics, evaluator);
        var perInstanceMetrics = evaluator.GetPerInstanceMetrics(data);
        var confusionMatrix = ExtractConfusionMatrix(host, metrics);
        // ...
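
For comparison, a minimal sketch of how the new Evaluate quoted above could surface the same data. This is not the actual fix: the MetricKinds.ConfusionMatrix key and the idea of a ConfusionMatrix property on MultiClassClassifierMetrics are assumptions modeled on the legacy ExtractConfusionMatrix step.

// Hypothetical continuation of the new Evaluate method, picking up right after
// "var overall = resultDict[MetricKinds.OverallMetrics];".
// The MetricKinds.ConfusionMatrix key is assumed here, not verified against the source.
IDataView confusionMatrixView = null;
if (resultDict.ContainsKey(MetricKinds.ConfusionMatrix))
    confusionMatrixView = resultDict[MetricKinds.ConfusionMatrix];
// confusionMatrixView could then be converted into a strongly typed matrix and
// attached to the MultiClassClassifierMetrics that Evaluate returns.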

We should add back the checks for ConfusionMatrix in the following tests (a rough sketch of such a check follows the list):

  • TrainAndPredictIrisModelTest
  • TrainAndPredictIrisModelWithStringLabelTest
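
Once the metrics expose the matrix again, a check along these lines could go back into those tests. This is only a rough sketch; the ConfusionMatrix property and its members are assumed, not the final API:

// Rough sketch of a check to restore in TrainAndPredictIrisModelTest, assuming
// the metrics object regains a ConfusionMatrix with per-class counts.
Assert.NotNull(metrics.ConfusionMatrix);
Assert.Equal(3, metrics.ConfusionMatrix.NumberOfClasses); // setosa, versicolor, virginica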

@codemzs @Ivanidzo4ka

bartczernicki commented

+1. It's really silly that we don't have a Confusion Matrix output. I realize this can be calculated on your own, but it is one of the fundamental performance constructs that get reported in most ML software/libraries.
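
For reference, a minimal, self-contained sketch of that do-it-yourself calculation. The label arrays are placeholders; in practice they would come from the Label and PredictedLabel columns of the scored output:

using System;
using System.Linq;

// Builds counts[i, j] = number of rows whose actual class is classes[i] and
// whose predicted class is classes[j]. Plain C#, no ML.NET types involved.
static class ConfusionMatrixSketch
{
    public static int[,] Compute(string[] actual, string[] predicted, out string[] classes)
    {
        if (actual.Length != predicted.Length)
            throw new ArgumentException("actual and predicted must have the same length");

        classes = actual.Concat(predicted).Distinct().OrderBy(c => c).ToArray();
        var index = classes.Select((c, i) => (c, i)).ToDictionary(p => p.c, p => p.i);

        var counts = new int[classes.Length, classes.Length];
        for (int row = 0; row < actual.Length; row++)
            counts[index[actual[row]], index[predicted[row]]]++;
        return counts;
    }
}

// Example with placeholder Iris-style labels:
// var actual    = new[] { "setosa", "versicolor", "virginica", "versicolor" };
// var predicted = new[] { "setosa", "virginica",  "virginica", "versicolor" };
// var counts = ConfusionMatrixSketch.Compute(actual, predicted, out var classes);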

Ivanidzo4ka added the enhancement (New feature or request) label on Feb 28, 2019
Ivanidzo4ka (Contributor) commented

We have one month before we lock our code in preparation for the 1.0 release. It's a nice thing to have, but we just don't have time to focus on it right now, unless you think it's a game changer and necessary to have.

bartczernicki commented

@Ivanidzo4ka I completely understand there is 1.0 pressure. I think having core performance conventions and explainability aligned with basic ML software should be prioritized for post-1.0.
