Hi, I was wondering whether Medusa will be supported with full tree attention, or only with the Top-1 version currently available in vLLM. Thanks. cc: @zhyncs @merrymercy
Referring to this -> https://github.com/NVIDIA/TensorRT-LLM/blob/v0.13.0/examples/redrafter/README.md
@byshiue could you confirm if support will be added? Thanks!
I'm seeing this issue in v13. Is there an ETA on the fix? Thanks, @Barry-Delaney