NaN metric value handling in AutoML #4663

Closed
justinormont opened this issue Jan 16, 2020 · 4 comments · Fixed by #5031
Assignees
Labels
AutoML.NET Automating various steps of the machine learning process bug Something isn't working P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away.

Comments

@justinormont
Contributor

The AutoML API code does not handle NaN values for metrics. During the sweep, when a model returns a NaN value for the metric being optimized, AutoML crashes.

See background: #4648 (comment)

@justinormont justinormont added AutoML.NET Automating various steps of the machine learning process bug Something isn't working labels Jan 16, 2020
@justinormont justinormont changed the title from "NaN metric value handing in AutoML" to "NaN metric value handling in AutoML" Jan 17, 2020
@harishsk
Contributor

@justinormont You suggested fixing two things in #4648. The second one is being fixed in ML.NET. Does the first work item still need to be done as part of this bug fix? If so, can you please add some repro steps for this bug?

@harishsk harishsk added the P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away. label Jan 22, 2020
@justinormont
Contributor Author

justinormont commented Jan 22, 2020

@CBrauer has a repro in bug.zip from his original bug report: #4648 (comment)

Hello,

I just upgraded my project to the new pre-release versions of ML.NET and I got the following error message when I ran my program:

[Image: bug]

I have added a Zip file of my program and dataset. I hope you guys can help me find out why I'm getting this error,

Charles

bug.zip

In this example, he is optimizing towards the F1 metric, which can currently be NaN. The AutoML code crashes when it receives a NaN for the metric it's optimizing towards.

... Does the first work item still need to be done as part of this bug fix?

The AutoML code does need to be robust to NaN values for its optimization metric. NaN is sometimes the expected value.

Another way to reproduce it is to run in a debugger and replace the model's returned metric with NaN.
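
As an illustration only (a minimal Python sketch, not ML.NET or AutoML code, and not a diagnosis of the actual crash), this shows why NaN needs explicit handling when picking the best run of a sweep: every ordering comparison with NaN is false, so a naive "is this score better?" check can misfire. The helper name `best_run` and its `maximize` parameter are hypothetical.

```python
import math

def best_run(scores, maximize=True):
    """Return the index of the best non-NaN score, or None if every score is NaN."""
    best_index = None
    for i, score in enumerate(scores):
        if math.isnan(score):
            continue  # a run with a NaN metric is never selected
        if best_index is None:
            best_index = i
        elif maximize and score > scores[best_index]:
            best_index = i
        elif not maximize and score < scores[best_index]:
            best_index = i
    return best_index

nan = float("nan")
# All of these comparisons are False, including NaN == NaN.
print(nan > 0.5, nan < 0.5, nan == nan)
# NaN runs are skipped; an all-NaN sweep yields None instead of failing.
print(best_run([nan, 0.71, 0.64, nan]))
print(best_run([nan, nan]))
```

Skipping NaN runs is only one possible policy; treating NaN as the worst possible score would be another, and the caller still has to decide what to do when no run produced a usable metric.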

@harishsk harishsk self-assigned this Mar 26, 2020
@harishsk
Contributor

harishsk commented Apr 1, 2020

@CBrauer The attached zip file did not contain the CSV files for validation and test. I reduced the training file by 40% and created two new files for validation and test. With that, I have not been able to reproduce the issue you are seeing.

Can you please update the zip file with the necessary files that reproduce the issue?

@harishsk harishsk assigned najeeb-kazmi and unassigned harishsk Apr 3, 2020
@najeeb-kazmi
Member

najeeb-kazmi commented Apr 3, 2020

We don't need to split the file into train, validation, and test. AutoML does the split internally. In this case, the training set is used as the validation set just to evaluate metrics from the best AutoML run. The choice of dataset is unrelated to AutoML training, which is the relevant part of the code for this bug.

I can reproduce the error with the data and code provided. It uses 1.5.0-preview and 0.17.0-preview, which do not have the fix from #4674 that makes F1 score return 0 instead of NaN. The fix for F1 is there in preview2.

I'll look at how NaN metrics can be handled in AutoML. F1 no longer returns NaN, but LogLossReduction can still return NaN #4648 (comment). I'll look at whether NaN in LogLossReduction can be handled, or if AutoML should generally handle NaNs, or both.
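
For illustration, a minimal Python sketch of the F1 behavior change mentioned above. It is not the ML.NET implementation; the function names and the `nan_as_zero` flag are hypothetical, and it assumes F1 is computed from precision and recall, where a 0/0 division produces NaN.

```python
import math

def ratio(numerator, denominator):
    """Divide, returning NaN for a zero denominator (only 0/0 occurs in this sketch)."""
    return numerator / denominator if denominator else float("nan")

def f1_score(tp, fp, fn, nan_as_zero=False):
    """F1 from confusion-matrix counts; optionally report an undefined F1 as 0."""
    precision = ratio(tp, tp + fp)  # NaN when nothing was predicted positive
    recall = ratio(tp, tp + fn)     # NaN when there are no positive labels
    f1 = ratio(2 * precision * recall, precision + recall)
    if nan_as_zero and math.isnan(f1):
        return 0.0
    return f1

# No positive predictions: precision is 0/0, so F1 is NaN unless mapped to 0.
print(f1_score(tp=0, fp=0, fn=5))
print(f1_score(tp=0, fp=0, fn=5, nan_as_zero=True))
```

Mapping the undefined case to 0 keeps the metric comparable across runs, but it only covers F1; any other metric that can still produce NaN needs handling on the consumer side.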

@ghost ghost locked as resolved and limited conversation to collaborators Mar 19, 2022
3 participants