SpecForge

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Results 56 SpecForge issues

### Checklist - [x] 1. I have searched related issues but cannot get the expected help. - [x] 2. The bug has not been fixed in the latest version. -...

PR https://github.com/sgl-project/SpecForge/pull/290 fixed the VLM's lack of support for `set_aux_hidden_states_layers`, but PR https://github.com/sgl-project/SpecForge/pull/308 removed that part of the code, which will cause bugs for VLMs.

## Motivation ## Modifications ## Related Issues ## Accuracy Test ## Benchmark & Profiling ## Checklist - [ ] Format your code according to the [Code Formatting with Pre-Commit](https://docs.sglang.ai/references/contribution_guide.html#code-formatting-with-pre-commit). -...

Why is the draft_vocab_size in configs/qwen3-30B-A3B-eagle3.json 32,000 instead of the vocab_size 151,936 in qwen3-30B-A3B?

I see a throughput degradation when I try to use EAGLE3 to speed up Qwen3-30B-A3B (on 2x H100). My draft model was downloaded from: https://huggingface.co/zhuyksir/EAGLE3-Qwen3-30B-A3B-DenseHead The command I am using to benchmark: ```...

@Abigbigbig This looks like a different issue from this PR. Let's move to a different issue. I can point you to the fix _Originally posted by @yubofredwang in https://github.com/sgl-project/SpecForge/issues/314#issuecomment-3588281081_ Thank you...

Can the 48G A6000 GPU be used for draft training of the qwen2.5-vl-7b model? Will there be OOM?

### Checklist - [x] 1. I have searched related issues but cannot get the expected help. - [x] 2. The bug has not been fixed in the latest version. -...

## Motivation The two existing attention backends both exhibit inefficiencies which inhibit the training experience. - `sdpa` backend materializes the full `bsz x num_heads x q_len x kv_len` attention score...

## Motivation **Draft PR. This is currently WIP.** Add eagle3 support for qwen3_vl and qwen3_vl_moe models. ## Modifications ## Related Issues ## Accuracy Test ## Benchmark & Profiling...