Zhiwei Liang
Results: 2 comments of Zhiwei Liang
@zhyncs I implemented EAGLE in vLLM and ran into the same problem when the batch size increases. Here is a simple analysis (bs is batch size, k is proposal length, the...
> Thanks for the insights into the tree-mask-based version.
> As for EAGLE, I have an implementation ready for that as well. This implementation of Medusa does keep required abstractions in...