Feature/batched inference slicer
Description
PR for the inference slicer with batching. When `max_batch_size` is set, collections of slices are passed to the model together.
Threads are still used when `worker_threads >= 2` (batching and threading can be used simultaneously).
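Below is a minimal sketch of the intended usage, assuming the parameter names mentioned in this description (`max_batch_size`, `worker_threads`) and a batched callback that takes a list of slices and returns a list of `sv.Detections`; the exact merged API may differ:

```python
import cv2
import numpy as np
import supervision as sv
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

def batch_callback(image_slices: list[np.ndarray]) -> list[sv.Detections]:
    # With max_batch_size set, the slicer hands the callback a list of
    # slices; Ultralytics accepts a list of images and returns one
    # result per slice.
    results = model(image_slices, verbose=False)
    return [sv.Detections.from_ultralytics(r) for r in results]

slicer = sv.InferenceSlicer(
    callback=batch_callback,
    max_batch_size=8,   # assumption: parameter name taken from this description
    worker_threads=2,   # assumption: batching and threading combined
)

image = cv2.imread("image.jpg")  # any BGR image
detections = slicer(image)
```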
Type of change
- [x] New feature (non-breaking change which adds functionality)
How has this change been tested? Please provide a test case or example of how you tested the change.
https://colab.research.google.com/drive/1j85QErM74VCSLADoGliM296q4GFUdnGM?usp=sharing
Any specific deployment considerations
- [ ] Docs updated? What were the changes:
As the Colab shows, in these tests batching only improved performance in the Ultralytics case.
Known shortcomings:
- ~~The first Inference model is fit for vehicle detection but was tested on an image of people.~~
- ~~No output image to check how well it performed.~~
- ~~No tests for the auto-batch case (when `max_batch_size=-1`).~~
- ~~Missing examples in the docstring: normal vs. batched callback~~ (see the sketch after this list).
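For reference, a sketch of the two callback shapes the docstring examples cover; the signatures are assumed from this description, not taken from the merged code:

```python
import numpy as np
import supervision as sv

def callback(image_slice: np.ndarray) -> sv.Detections:
    # Normal callback: one slice in, one Detections object out.
    return sv.Detections.empty()  # placeholder for a real model call

def batch_callback(image_slices: list[np.ndarray]) -> list[sv.Detections]:
    # Batched callback: a list of up to max_batch_size slices in,
    # one Detections object per slice out, in the same order.
    return [sv.Detections.empty() for _ in image_slices]  # placeholder
```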
Ready for review.
Tests are in the aforementioned Colab: https://colab.research.google.com/drive/1j85QErM74VCSLADoGliM296q4GFUdnGM?usp=sharing It's safe to do 'Run All', though it will ask for Roboflow auth midway and an HF token at the end.
The main output is the timing printed underneath each run. You may uncomment the visualizers to plot the images.
Caveats:
- The Inference examples run on the CPU; I'm speaking with Pawel to check what I missed.
- The first Ultralytics run is slower due to memory allocation. Run the cell again and the Ultralytics `threads=1 batch=1` case will run slightly faster.
Ah, force-pushing a revert to develop closes the PR.
Makes sense, and that's fine. New changes: https://github.com/roboflow/supervision/pull/1239