
Conversation


@lantiga lantiga commented Feb 15, 2020

This PR reopens #241, which was merged and then reverted.

Issues not covered by tests emerged and need to be resolved before we merge again.
Details will follow here.

Add support for inspection and eviction to queue

Mock run info batching

Mock run info batching

Make TF tests work

Add batching for ONNX and ONNX-ML

Fix torch API, still WIP

Fix torch backend

Fixes after rebasing

Add auto-batching to TFLite backend

Fix from rebase

Add batching args to command and change API accordingly

Add batching heuristics [WIP]

Fix TFLite test by accessing first tensor in first batch safely

Temporarily comment out wrong_bg test check

Implement batching heuristics

Introduce autobatch tests, tflite still fails

Fix segfault when error was generated from the backend

Fix tflite autobatch test

Updated documentation with auto batching

Remove stale comments

Avoid making extra copies of inputs and outputs when batch count is 1

Address review comments re const-correctness

Add tests to detect failures

Fix slicing and concatenation

Fix tensor slicing and concatenating

Temporarily disable tflite autobatch test due to tflite limitation

Disable support for autobatching for TFLITE
@codecov

codecov bot commented Mar 8, 2020

Codecov Report

Merging #270 into master will increase coverage by 2.51%.
The diff coverage is 88.67%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #270      +/-   ##
==========================================
+ Coverage   53.28%   55.79%   +2.51%     
==========================================
  Files          25       25              
  Lines        4634     5022     +388     
==========================================
+ Hits         2469     2802     +333     
- Misses       2165     2220      +55
Impacted Files Coverage Δ
src/redisai.h 0% <0%> (ø) ⬆️
src/err.c 91.66% <100%> (+0.36%) ⬆️
src/backends.c 62.8% <100%> (ø) ⬆️
src/backends/tflite.c 72.07% <56.52%> (-13.83%) ⬇️
src/backends/onnxruntime.c 69.1% <88.76%> (+4.02%) ⬆️
src/model.c 66.66% <90.16%> (+3.44%) ⬆️
src/redisai.c 76.57% <92%> (+2.05%) ⬆️
src/backends/tensorflow.c 70.68% <95%> (+2.93%) ⬆️
src/backends/torch.c 84.66% <95.12%> (+2.02%) ⬆️
src/tensor.c 80.84% <95.91%> (+2.05%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 62b2fa2...04073b5. Read the comment docs.


lantiga commented Mar 8, 2020

Todo:


lantiga commented Mar 14, 2020

Valgrind looks legit; I can't reproduce the crash anymore.
@filipecosta90 ready to review and merge


lantiga commented Mar 20, 2020

@filipecosta90 @hhsecond ready for review/approval

Sherin Thomas and others added 2 commits March 28, 2020 02:45
* test cases for crash test

* Fix issue with evict. Port test to multiprocessing to allow killing pending command.

* Use terminate instead of kill

Co-authored-by: Luca Antiga <[email protected]>

lantiga commented Mar 28, 2020

I fixed one last (embarrassing) bug in queue eviction (#310).
@hhsecond @filipecosta90 I'll be waiting for a last check before merging, let's try to make it happen as soon as we can.


@hhsecond hhsecond left a comment


looks good to me

@lantiga lantiga merged commit 8752589 into master Mar 30, 2020
lantiga added a commit that referenced this pull request May 6, 2020
 
Add support for batching (take two) (#270)

* Add support for automated batching

[commit list identical to the one at the top of this PR description]

* Fix TFLite and tests after rebase

* Temporarily disable macos CI build

* Add synchronization to autobatch tests

* Add synchronization to autobatch thread

* Add synchronization to autobatch thread

* Batching crashtest (#310)

* test cases for crash test

* Fix issue with evict. Port test to multiprocessing to allow killing pending command.

* Use terminate instead of kill

Co-authored-by: Luca Antiga <[email protected]>

Co-authored-by: Sherin Thomas <[email protected]>
status = ort->GetAllocatorWithDefaultOptions(&allocator);
if (status != NULL) {
goto error;
return NULL;


Is there a reason the returned status is ignored here, but on line 437 it isn't?

@gkorland gkorland deleted the batching branch October 6, 2020 08:57