Kuntai Du comments

Results: 89 comments of Kuntai Du

How to perform ROI based encoding?

Good answer! I just wonder whether it is possible to assign a different RoI map to different frames. If so, could you please give me some hints on how I should modify the...

About DDS stream B encode.

In `dds_utils.py`, the `crop_images` function contains these two lines:

```
if region.fid not in cropped_images:
    cropped_images[region.fid] = np.zeros_like(cached_image[1])
```

So the background of the high-quality image will be (R,G,B) =...
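To make the effect of that initialization concrete, here is a minimal sketch of the same pattern on a toy image. The gray test frame and the `(x0, y0, x1, y1)` pixel bounds are assumptions made for illustration; the actual region objects and caching logic in `dds_utils.py` may differ.

```python
import numpy as np

# Hypothetical stand-in for a cached frame: a small uniform-gray RGB image.
frame = np.full((4, 6, 3), 128, dtype=np.uint8)

# Same initialization pattern as the quoted lines: an all-zero canvas
# with the frame's shape and dtype.
canvas = np.zeros_like(frame)

# Hypothetical region given as pixel bounds (x0, y0, x1, y1).
x0, y0, x1, y1 = 1, 1, 4, 3
canvas[y0:y1, x0:x1] = frame[y0:y1, x0:x1]

inside = canvas[1, 1].tolist()   # pixel inside the copied region
outside = canvas[0, 0].tolist()  # pixel that was never written

print("inside:", inside)
print("outside:", outside)
```

Only pixels inside the copied region carry image content; every other pixel keeps the value that `np.zeros_like` gave it.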

Unrecognized option 'qp'

This is most likely because the ffmpeg version is too old. Conda implicitly installs an old version of ffmpeg when installing opencv, so you may need to install...
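One way to check which ffmpeg binary is actually being picked up, and what version it reports, is sketched below. The helper names `parse_ffmpeg_version` and `report_ffmpeg` are hypothetical, and the sketch assumes the usual `ffmpeg version X.Y...` banner printed by release builds; it does not say which version is new enough for the `qp` option.

```python
import re
import shutil
import subprocess


def parse_ffmpeg_version(banner):
    """Return (major, minor) from an `ffmpeg -version` banner, or None.

    Release builds print e.g. "ffmpeg version 4.4.2 ..."; git snapshot
    builds use other identifiers, for which None is returned.
    """
    match = re.search(r"ffmpeg version n?(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def report_ffmpeg():
    """Describe the first ffmpeg found on PATH (hypothetical helper)."""
    path = shutil.which("ffmpeg")
    if path is None:
        return "ffmpeg not found on PATH"
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else ""
    return f"{path}: {first_line} (parsed: {parse_ffmpeg_version(first_line)})"


if __name__ == "__main__":
    print(report_ffmpeg())
```

If the reported path points inside a Conda environment, that is probably the copy that was pulled in alongside opencv.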

[Misc]: Implement CPU/GPU swapping in BlockManagerV2

I can also help! Let me take a look at the example code and `BlockSpaceManager` first.

[Usage]: Cannot run the starter code in tutorial

Maybe start with a very small model (e.g. facebook/opt-125m) and see whether it runs? Also, run `nvidia-smi` first to make sure the GPU you are using is not...
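The `nvidia-smi` check can also be scripted. The comment is cut off before it says what to look for, so the sketch below only assumes that per-GPU memory usage is worth inspecting; `parse_gpu_memory` and `query_gpus` are hypothetical helper names.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse lines of "index, used MiB, total MiB" into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        try:
            index, used, total = (int(field) for field in fields)
        except ValueError:
            # Skip rows with missing or non-numeric values such as "[N/A]".
            continue
        gpus.append({"index": index, "used_mib": used, "total_mib": total})
    return gpus


def query_gpus():
    """Return per-GPU memory info, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    print(query_gpus())
```

A GPU whose `used_mib` is already close to `total_mib` before the model is loaded is a likely reason for even a small model failing to start.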

[Performance]: benchmarking vllm copy kernel and pytorch index copy

Willing to help with benchmarking the PyTorch copy kernel vs. the current hand-written one.

[Performance]: benchmarking vllm copy kernel and pytorch index copy

I benchmarked the performance of the vLLM KV cache copy kernel vs. PyTorch index copy. It seems that the vLLM kernel is faster.

vLLM kernel: 4.4006272315979 ms
Torch kernel: 11.914390563964844 ms

The benchmarking...

Bandwidth Estimator

We intentionally keep the bandwidth estimator outside our implementation so that it is easy for others to get started with our program.

[RFC]: Disaggregated prefilling and KV cache transfer roadmap

> Will the logic for model upgrades and instance service discovery be introduced in XpYd?

Model upgrades: not in the scope of the disaggregated prefill roadmap for now. But this...

[RFC]: Disaggregated prefilling and KV cache transfer roadmap

> Glad to see the progress of supporting the P/D disaggregation feature.
>
> 1. Will this RFC support a central scheduler to determine the best prefill and decode instances to...