Kuntai Du
Good answer! I just wonder whether it is possible to assign a different RoI map to each frame. If so, could you please give me some hints on how I should modify the...
In `dds_utils.py`, the `crop_images` function contains these two lines:
```
if region.fid not in cropped_images:
    cropped_images[region.fid] = np.zeros_like(cached_image[1])
```
So the background of the high-quality image will be (R,G,B) =...
This is most likely because the ffmpeg version is too old. Conda implicitly installs an old version of ffmpeg when installing opencv, so you may need to install...
I can also help! Let me take a look at the example code and `BlockSpaceManager` first.
Maybe start from a very small model (e.g. facebook/opt-125m) and see if it will run? And also, run `nvidia-smi` first to make sure the GPU you are using is not...
Willing to help with benchmarking the PyTorch copy kernel vs. the current hand-written one.
I benchmarked the vLLM KV cache copy kernel against PyTorch index copy. The vLLM kernel appears to be faster:
vLLM kernel: 4.4006272315979 ms
Torch kernel: 11.914390563964844 ms
The benchmarking...
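For anyone who wants to run a similar comparison, here is a minimal timing sketch. It is not the script behind the numbers above: `time_ms`, `copy_by_slice`, `copy_by_element`, and `block_mapping` are hypothetical names, and the two workloads are pure-Python stand-ins for a block copy, not the vLLM or PyTorch kernels. It only shows the warm-up-then-average pattern.

```python
import time


def time_ms(fn, warmup=3, iters=20):
    """Return the mean wall-clock time of fn() in milliseconds."""
    # Warm-up runs are excluded from the measurement.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) * 1000.0 / iters


# Hypothetical stand-in for a block-structured cache (sizes are arbitrary).
NUM_BLOCKS, BLOCK_SIZE = 64, 256
src = bytearray(range(256)) * (NUM_BLOCKS * BLOCK_SIZE // 256)
dst = bytearray(len(src))
# Hypothetical mapping of (source block, destination block) pairs.
block_mapping = [(i, NUM_BLOCKS - 1 - i) for i in range(NUM_BLOCKS)]


def copy_by_slice():
    """Copy each block with one slice assignment."""
    for s, d in block_mapping:
        dst[d * BLOCK_SIZE:(d + 1) * BLOCK_SIZE] = src[s * BLOCK_SIZE:(s + 1) * BLOCK_SIZE]


def copy_by_element():
    """Copy each block one element at a time."""
    for s, d in block_mapping:
        for k in range(BLOCK_SIZE):
            dst[d * BLOCK_SIZE + k] = src[s * BLOCK_SIZE + k]


print(f"slice copy:   {time_ms(copy_by_slice):.4f} ms")
print(f"element copy: {time_ms(copy_by_element):.4f} ms")
```

When timing real GPU kernels, the device would also need to be synchronized before reading the clock. That step is left out here because the stand-ins run on the CPU.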
We intentionally keep the bandwidth estimator outside our implementation so that it is easy for others to get started with our program.
> Will the logic for model upgrades and instance service discovery be introduced in XpYd?

Model upgrades: not in the scope of the disaggregated prefill roadmap for now. But this...
> Glad to see the progress on supporting the P/D disaggregation feature.
>
> 1. Will this RFC support a central scheduler to determine the best prefill and decode instances to...