Add Benchmark test for PredictionEngine #1014
Conversation
Can you build the single prediction benchmarks from the two models produced in the existing training benchmarks? These will be more representative of user models by being a bit larger. You can take the SetupScoringSpeedTests() code to produce these on demand (and we should). The current models in this PR are quite small. We can include a very small one also; the small model focuses on overhead in the scoring process beyond the featurization and learners. Having only tiny models would focus our energy on improving the speed of components that are not very representative of what users see taking time in their prediction pipeline. I would recommend measuring:
cc: @markusweimer
@justinormont Why are we storing these trained models in the repo? Shouldn't we produce them in GlobalSetup?
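For illustration, a minimal sketch of producing the model on demand in GlobalSetup, assuming BenchmarkDotNet and the current MLContext API; the SentimentData/SentimentPrediction classes, the sentiment-train.tsv path, and the SDCA pipeline are placeholders, not the code in this PR:

```csharp
using BenchmarkDotNet.Attributes;
using Microsoft.ML;
using Microsoft.ML.Data;

public class PredictionEngineBench
{
    // Placeholder input/output schemas; real columns depend on the dataset used.
    public class SentimentData
    {
        [LoadColumn(0)] public bool Label;
        [LoadColumn(1)] public string SentimentText;
    }

    public class SentimentPrediction
    {
        [ColumnName("PredictedLabel")] public bool Prediction;
    }

    private PredictionEngine<SentimentData, SentimentPrediction> _engine;
    private SentimentData _example;

    [GlobalSetup]
    public void Setup()
    {
        // Train the model on demand instead of checking a serialized model into the repo.
        var mlContext = new MLContext(seed: 1);
        var data = mlContext.Data.LoadFromTextFile<SentimentData>("sentiment-train.tsv", hasHeader: true);

        var pipeline = mlContext.Transforms.Text.FeaturizeText("Features", nameof(SentimentData.SentimentText))
            .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression());

        var model = pipeline.Fit(data);
        _engine = mlContext.Model.CreatePredictionEngine<SentimentData, SentimentPrediction>(model);
        _example = new SentimentData { SentimentText = "This is a wonderful product" };
    }

    [Benchmark]
    public SentimentPrediction PredictSingle() => _engine.Predict(_example);
}
```

With this shape, the repository never has to carry a serialized model: each benchmark run rebuilds it once in GlobalSetup, and only the Predict call is measured.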
I agree. Perhaps we could put the new code within the existing files (so we don't have to replicate the GlobalSetup sections)?
This can be measured using the launchCount feature, similar to the way we are measuring time for training the model.
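As a hedged sketch of how launchCount could be applied with BenchmarkDotNet; the counts are placeholder values, not a recommendation from this thread:

```csharp
using BenchmarkDotNet.Attributes;

// launchCount runs the benchmark in several freshly started processes,
// which averages out per-process effects (JIT, caches) in the reported
// statistics, similar to how the training benchmarks are configured.
[SimpleJob(launchCount: 3, warmupCount: 1)]
public class PredictionEngineLatencyBench
{
    [GlobalSetup]
    public void Setup()
    {
        // Train or load the model and create the PredictionEngine here.
    }

    [Benchmark]
    public void PredictSingle()
    {
        // engine.Predict(example) would be called here.
    }
}
```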
How about putting the common code in https://github.com/dotnet/machinelearning/blob/master/test/Microsoft.ML.Benchmarks/Helpers.cs?
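To illustrate the suggestion, a hypothetical helper that shared setup plumbing could move into; the BenchmarkHelpers name and path layout are invented for this example and are not the actual contents of Helpers.cs:

```csharp
using System.IO;

// Hypothetical shared plumbing so each benchmark class does not repeat
// the same path-resolution logic in its GlobalSetup.
internal static class BenchmarkHelpers
{
    public static string GetBenchmarkDataPath(string fileName)
    {
        var path = Path.Combine("test", "data", fileName);
        if (!File.Exists(path))
            throw new FileNotFoundException($"Benchmark dataset not found: {path}");
        return path;
    }
}
```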
Which latency are we talking about here?
I think you might be right. :) A PR titled "Benchmark test for PredictionEngine" where the main file added is ... Your other notes about timings and whatnot seem OK, except for the note about "multi-threaded is recommended." That seems more like a scenario best served by measuring the batch prediction engine or even the dataview pipelines themselves, not a simple prediction engine. Let's focus on getting the simple things right. Subsequent PRs can refine.
Force-pushed from dd611f1 to 18c158c
Force-pushed from 18c158c to 4a6e31e
LGTM, just post the numbers on the PR or issue.
Looks great now, thanks much @najeeb-kazmi.
My suggestion, after reading the above, is:
Number 2 should not happen in this PR.
Thanks @najeeb-kazmi for the great benchmarks. The multi-threaded case I was envisioning was to test how well we scale in the user scenario of running a web server handling ML prediction requests. When building a web service like this, it's good to be able to handle concurrent requests across multiple threads (using multiple worker processes is another route). I haven't read the new PredictionEngine code to know whether we do anything to either help or hinder multi-threaded predictions.
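A sketch of that concurrent-requests scenario, assuming the current MLContext API; since PredictionEngine instances are not thread-safe, each worker thread gets its own engine created from the shared model. The names here are illustrative, not code from this PR:

```csharp
using System.Threading.Tasks;
using Microsoft.ML;

public static class ConcurrentPredictionSketch
{
    public static void RunConcurrentPredictions<TIn, TOut>(
        MLContext mlContext, ITransformer model, TIn[] requests)
        where TIn : class
        where TOut : class, new()
    {
        Parallel.For(0, requests.Length,
            // One PredictionEngine per worker thread; the model itself is shared.
            () => mlContext.Model.CreatePredictionEngine<TIn, TOut>(model),
            (i, state, engine) =>
            {
                engine.Predict(requests[i]);
                return engine;
            },
            engine => { /* engine goes out of scope with its thread */ });
    }
}
```

A real web service would more likely keep a pool of long-lived engines across requests, but the one-engine-per-thread pattern is the same.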
Adds a benchmark test to measure the performance of making many single predictions with PredictionEngine.
Closes #1013