SpecForge issues

[Bug] Eagle3 training for gpt-oss-120b fails with OOM

6

### Checklist - [x] 1. I have searched related issues but cannot get the expected help. - [x] 2. The bug has not been fixed in the latest version. -...

gopalsarda

VLM target model doesn't have set_aux_hidden_states_layers

PR https://github.com/sgl-project/SpecForge/pull/290 fixed the VLM's lack of support for set_aux_hidden_states_layers, but PR https://github.com/sgl-project/SpecForge/pull/308 removed that part of the code, which will cause bugs in VLM.

C3236455482

Support checkpointing for generated data

## Motivation ## Modifications ## Related Issues ## Accuracy Test ## Benchmark & Profiling ## Checklist - [ ] Format your code according to the [Code Formatting with Pre-Commit](https://docs.sglang.ai/references/contribution_guide.html#code-formatting-with-pre-commit). -...

yubofredwang

[Question] Why configs/qwen3-30B-A3B-eagle3.json draft_vocab_size=32000

1

Why is the draft_vocab_size in configs/qwen3-30B-A3B-eagle3.json 32,000 instead of the vocab_size 151,936 in qwen3-30B-A3B?

huang3eng

Throughput degradation on Qwen3-30B-A3B with EAGLE3

I see a throughput degradation when I use EAGLE3 to speed up Qwen3-30B-A3B (on 2x H100). My draft model was downloaded from: https://huggingface.co/zhuyksir/EAGLE3-Qwen3-30B-A3B-DenseHead The command I am using to benchmark: ```...

Zzsf11

@Abigbigbig This looks like a different issue from this PR. Let's move to a different issue. I can point you to the fix

6

@Abigbigbig This looks like a different issue from this PR. Let's move to a different issue. I can point you to the fix _Originally posted by @yubofredwang in https://github.com/sgl-project/SpecForge/issues/314#issuecomment-3588281081_ Thank you...

Abigbigbig

draft train

Can a 48 GB A6000 GPU be used for draft-model training with qwen2.5-vl-7b? Will it run out of memory (OOM)?

Abigbigbig

[Bug] Missing bench_model_speedup.py

1

### Checklist - [x] 1. I have searched related issues but cannot get the expected help. - [x] 2. The bug has not been fixed in the latest version. -...

lesser-fullness-G

feat: added low VRAM flash attention backend

7

## Motivation The two existing attention backends both exhibit inefficiencies which inhibit the training experience. - `sdpa` backend materializes the full `bsz x num_heads x q_len x kv_len` attention score...

timmy-feng
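
The memory concern raised in the excerpt above (a dense `bsz x num_heads x q_len x kv_len` score tensor) can be sized with a quick calculation. This is a minimal sketch, not code from the PR: the helper name `attention_score_bytes`, the example shapes, and the 2-byte (bf16/fp16) element size are all assumptions chosen for illustration.

```python
def attention_score_bytes(bsz, num_heads, q_len, kv_len, bytes_per_elem=2):
    """Bytes needed to hold one dense attention score tensor.

    bytes_per_elem=2 assumes bf16/fp16 scores (an assumption, not a
    value taken from the PR).
    """
    return bsz * num_heads * q_len * kv_len * bytes_per_elem


# Hypothetical shapes, picked only to show the quadratic growth in sequence length.
size = attention_score_bytes(bsz=1, num_heads=32, q_len=8192, kv_len=8192)
print(f"{size / 2**30:.1f} GiB")  # prints "4.0 GiB"
```

Because the size scales with `q_len * kv_len`, doubling the sequence length quadruples this tensor, which is why a backend that avoids materializing it can matter on low-VRAM GPUs.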

[Feature] Qwen3 VL eagle3 support

12

## Motivation **Draft PR. This is currently WIP.** Add EAGLE3 support for qwen3_vl and qwen3_vl_moe models. ## Modifications ## Related Issues ## Accuracy Test ## Benchmark & Profiling...

dcw02
