Training time finished without any models trained. #596
Comments
Same error after 4000 seconds (>1 hr). The output pane doesn't give much more detail; it shows only the results table header:
| Trainer | MicroAccuracy | MacroAccuracy | Duration | #Iteration |
But it worked in 100 seconds on 10,000 rows, so extrapolating linearly, it would take about 17 hours to train on 6 million rows. The issue here is simply that the time suggested by Model Builder is a severe underestimate.
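The extrapolation above can be checked with a few lines of Python. This is only a sketch: it assumes training time grows linearly with the number of rows, which is the commenter's own simplification, and the function name `extrapolate_seconds` is made up for illustration.

```python
def extrapolate_seconds(sample_seconds, sample_rows, total_rows):
    """Scale a measured training time linearly with the row count (assumed model)."""
    return sample_seconds * (total_rows / sample_rows)


# Figures from the comment: 100 seconds for 10,000 rows, 6 million rows in total.
estimate = extrapolate_seconds(100, 10_000, 6_000_000)
print(f"{estimate:.0f} s = {estimate / 3600:.1f} h")  # prints "60000 s = 16.7 h"
```

That is roughly 33 times the suggested 1800 seconds, and real runtime may scale worse than linearly.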
If anyone at Microsoft is reading these issues, I'd love to put forward a suggestion: if someone has entered a time that is insufficient to train a model, it would be better to offer a "Resume" option instead of crashing out. On large data sets, it could be hours before a model is trained.
Sorry for the late reply.
Hi @LittleLittleCloud - Nice cat :) The training data was text based, and the PC is a few years old. My only suggestion is that, instead of showing this error, Model Builder could offer to continue training the model for x more minutes, rather than crashing and making the user restart the training process. I feel that people would prefer the model to complete training rather than see an error message, even if the training is going to take longer than expected. I've also seen cases where the training was cancelled without me pressing the "Cancel" button. I'm not sure how that happens, but it is equally annoying.
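To make the suggestion concrete, here is a minimal Python sketch of a time budget that asks before giving up. It is not Model Builder's or AutoML's actual API: `train_with_budget`, `train_step`, and `should_extend` are hypothetical names, and the sketch assumes each trial either returns a model or `None`.

```python
import time


def train_with_budget(train_step, budget_seconds, extend_seconds,
                      should_extend, clock=time.monotonic):
    """Run trials until one yields a model.

    When the budget expires with no model, call should_extend(extend_seconds)
    instead of failing immediately. All names here are hypothetical.
    """
    deadline = clock() + budget_seconds
    while True:
        model = train_step()  # one trial; returns None if no model yet
        if model is not None:
            return model
        if clock() >= deadline:
            if not should_extend(extend_seconds):
                raise TimeoutError(
                    "Training time finished without any models trained.")
            # The user agreed to continue, so grant more time.
            deadline = clock() + extend_seconds
```

The deadline is only checked between trials, so a single long-running trial is never interrupted in this sketch; a real implementation would also need to keep the partially trained state around in order to resume.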
@infiniteloopltd: What's the size of your dataset in MB? What is your task (regression/classification)? If classification, how many classes do you have?

The number of classes has a large effect on runtime: for most of our trainers, it multiplies the amount of work needed. Text datasets are expected to take longer than categorical or numeric ones.

As an example runtime, it took me 34 hours to create the first model on a text dataset of 5.5GB and 19M rows with 200 classes. Most of the runtime is due to the high number of classes (200). This was run on an old but large machine (circa 2013, 12 cores/24 threads, 256GB RAM). On that run, 77GB of RAM was used (as dataset caching was enabled; otherwise ~0GB).

If you go beyond physical RAM into virtual memory, training will be quite slow due to thrashing, because the featurized dataset streams line-by-line from the cache, which is now swapped out of RAM onto disk.

Suggestions for Model Builder:
@infiniteloopltd I'm going to close this issue to clean up the portal. Should you have any questions, feel free to re-open it.
System Information (please complete the following information):
Describe the bug
Creating a classification model on 6 million rows, suggested time 1800 seconds.
To Reproduce
Can't share data, sorry.
Expected behavior
Model to be created
Screenshots
Training time finished without any models trained.
at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.d__23.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.AutoMLEngine.d__28.MoveNext()