EmilioZhao

Results 2 issues of EmilioZhao

**1. Is your feature request related to a problem? Please describe.** Sure. In autonomous driving, we rely heavily on GPUs to accelerate computation. One typical scenario is: self-driving car...

feature request
? - Needs Triage
inactive-90d
inactive-30d

## 🚀 Feature Please add Medusa decoding to mlc-llm in C++; we urgently need it to speed up LLM decoding on mobile devices. Reference: https://github.com/FasterDecoding/Medusa/tree/main Medusa adds extra "heads" to...

feature request