ML.NET CLI performance with more cores #645

Closed

bsoulier opened this issue Mar 31, 2020 · 3 comments

@bsoulier

Is your feature request related to a problem? Please describe.
No

Describe the solution you'd like
Would the ML.NET CLI benefit from having more cores at its disposal when training on large amounts of data (i.e., would more cores mean a quicker time to compute a model)?

Describe alternatives you've considered
N/A

Additional context
N/A

@LittleLittleCloud added this to the May 2020 milestone Apr 1, 2020
@LittleLittleCloud self-assigned this Apr 1, 2020

@LittleLittleCloud
Contributor

@justinormont Will we get a training speed-up using AutoML.NET when we have more cores?

@justinormont

Yes, ML.NET is multi-threaded, though its use in AutoML should be improved.

See dotnet/machinelearning#4092:

The CPU utilization will depend on which trainer is running.

A few optimizations we should make:

Apologies for the links to a non-public repo. We still have to transfer the issues to the current repo.

The NumberOfThreads task would be quite easy to implement (if it's even needed anymore). Part of that would be to check whether ML.NET currently defaults to a reasonable number of threads for the transforms, trainers, and the pipeline. Some background on threading in ML.NET: dotnet/machinelearning#136, dotnet/machinelearning#135, dotnet/machinelearning#277. I'm not sure which solution it landed on before releasing v1.0.
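
For illustration, a minimal sketch of what pinning a trainer's thread count looks like through the ML.NET Options pattern (SDCA is used as the example trainer here; which trainers expose NumberOfThreads, and what they default to, varies per trainer):

using System;
using Microsoft.ML;
using Microsoft.ML.Trainers;

// Sketch: cap the thread count of a single ML.NET trainer via its
// Options object. Leaving NumberOfThreads null lets ML.NET pick a
// default; setting it explicitly controls CPU utilization.
var mlContext = new MLContext();

var options = new SdcaLogisticRegressionBinaryTrainer.Options
{
    LabelColumnName = "Label",
    FeatureColumnName = "Features",
    NumberOfThreads = Environment.ProcessorCount  // use all visible cores
};

var trainer = mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(options);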

In addition to the above, for NUMA architectures, we may want to enable: (info)

<configuration>
  <runtime>
    <gcServer enabled="true"/>
    <Thread_UseAllCpuGroups enabled="true"/>
  </runtime>
</configuration>

I'm unsure if we can set Thread_UseAllCpuGroups & gcServer for the AutoML API, since it ships as a library, but we can for the CLI & Model Builder. This should also be tested, as training a single model within a single NUMA group may be faster than incurring the cross-group latency. After implementing parallel sweeping, the optimal solution might be training one model per NUMA group.
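
As a quick sanity check (a sketch, not from the thread; it only assumes the standard GCSettings and Environment APIs), the process can report whether the runtime flags took effect:

using System;
using System.Runtime;

// Sketch: verify at runtime that the app.config settings are honored.
// GCSettings.IsServerGC reports whether server GC is active, and
// Environment.ProcessorCount should cover all CPU groups once
// Thread_UseAllCpuGroups applies (assumption: .NET Framework on a
// multi-group NUMA machine).
Console.WriteLine($"Server GC: {GCSettings.IsServerGC}");
Console.WriteLine($"Logical processors visible: {Environment.ProcessorCount}");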

@LittleLittleCloud
Contributor

I'm going to close this issue for cleanup; feel free to re-open it whenever you have questions.
