顾立辉 comments

Results: 7 comments of 顾立辉

Question about past_key_value modification

Hello, I encountered the same issue, but I now understand the rationale behind this approach. Define a custom KVCache class to enable preallocated GPU memory optimization. During attention computation, when...

thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value'

I ran into the same problem and fixed it; see [pull 32](https://github.com/FasterDecoding/REST/pull/32).

RunTime Error of metric 'scene', when decoding with BertLMHeadModel

I ran into the same problem.

Naive brainstorm: accept length simulator

I also urgently need this feature. Is anyone currently developing it? If not, I'd like to try implementing it myself.

Naive brainstorm: accept length simulator

I've tried, implemented, and tested the feature. Here's my plan.

### Functional Requirements

- The system should prioritize evaluating key metrics like accept length, enabling direct validation on datasets without...
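For readers unfamiliar with the metric, here is a minimal sketch of how accept length could be computed for one verification step, assuming greedy verification and the common convention that each step also emits one token from the target model. The function names `accept_length` and `mean_accept_length` are hypothetical and are not taken from SpecForge or SGLang.

```python
from typing import Sequence, Tuple


def accept_length(draft: Sequence[int], target: Sequence[int]) -> int:
    """Tokens emitted in one verification step under greedy matching.

    Assumption: `target` holds the target model's greedy token at each
    draft position. Draft tokens are accepted until the first mismatch,
    and the step always emits one extra token from the target model.
    """
    accepted = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        accepted += 1
    # +1 for the token the target model contributes on every step.
    return accepted + 1


def mean_accept_length(steps: Sequence[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average accept length over a list of (draft, target) steps."""
    lengths = [accept_length(d, t) for d, t in steps]
    return sum(lengths) / len(lengths)


steps = [
    ([1, 2, 3], [1, 2, 9]),  # first two draft tokens match
    ([5], [6]),              # no draft token matches
]
print(mean_accept_length(steps))
```

Under this convention, a mean accept length close to 1 means almost no draft tokens are being accepted.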

Naive brainstorm: accept length simulator

https://github.com/sgl-project/SpecForge/pull/279 I am prioritizing support and testing for QwenVL models.

EAGLE3 on Qwen2.5-VL / Qwen3-VL shows extremely low accept length (accept_len ≈ 1)

https://github.com/sgl-project/SpecForge/pull/279 Hi, it seems this issue is related to the SGLang-side integration. Could you help test this PR? You can evaluate the accept length of Qwen2.5-VL **without relying...