Hi, when will you upload your code? Very excited to try it!
@Lihui-Gu Hi, great work! May I take a look at your PR?
Yep! I tried it, and it runs properly now. Thank you.
```
[rank1]:W0830 09:42:25.606000 56327 site-packages/torch/_dynamo/variables/tensor.py:1047] [7/0] Graph break: from user code at:
[rank1]:W0830 09:42:25.606000 56327 site-packages/torch/_dynamo/variables/tensor.py:1047] [7/0]   File "/opt/miniconda/envs/sglang/lib/python3.12/site-packages/specforge/core/eagle3.py", line 777, in _compute_metric_acc
[rank1]:W0830 09:42:25.606000 56327 site-packages/torch/_dynamo/variables/tensor.py:1047] [7/0]     ).sum().item() / (loss_mask.sum().item()...
```
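For context: this warning comes from calling `.item()` inside a region traced by `torch.compile`. `.item()` copies a scalar back to the host, which Dynamo cannot capture in the graph, so it falls back with a "Graph break" warning. A minimal sketch reproducing the same class of warning (the function name `accuracy` is made up here, loosely mirroring `_compute_metric_acc`):

```python
import torch

@torch.compile
def accuracy(pred, target, loss_mask):
    # The comparison and masked sum stay inside the traced graph.
    correct = ((pred == target) & loss_mask.bool()).sum()
    # .item() pulls a scalar back to the host; Dynamo cannot trace it,
    # so it graph-breaks here and runs the rest eagerly.
    return correct.item() / loss_mask.sum().item()

acc = accuracy(torch.tensor([1, 2, 3]),
               torch.tensor([1, 2, 0]),
               torch.tensor([1, 1, 1]))
```

Setting `torch._dynamo.config.capture_scalar_outputs = True`, or keeping metric code outside the compiled region, are the usual remedies; the warning itself is harmless apart from the lost fusion.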
> @ggg-s I suspect torch.compile of flex attention has issues on the L20. Can you use `--attention-backend sdpa` for your training?

Okay, I will try it.
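For readers unfamiliar with the suggested fallback: `--attention-backend sdpa` presumably switches the model from FlexAttention to PyTorch's built-in `scaled_dot_product_attention`, which dispatches to a prebuilt fused kernel and involves no Triton template compilation. A minimal sketch of the SDPA call itself (how SpecForge wires the flag internally is an assumption, not confirmed here):

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

# SDPA picks a backend (flash / memory-efficient / math) at runtime,
# so it sidesteps the compiled FlexAttention kernel entirely.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```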
> can you try the following change? If this works on L20, I will raise a PR to fix the kernel options.
>
> ```
> kernel_options = {
> ...
> ```
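The quoted snippet is truncated, but for reference, `flex_attention` accepts a `kernel_options` dict that tunes the generated Triton kernel, and shrinking the tile sizes is a common workaround when the default tiles exceed the shared memory of a smaller GPU such as the L20. A hedged sketch of that shape of change (the specific keys and values are illustrative, not the maintainer's actual fix):

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# Smaller tiles reduce shared-memory pressure on the GPU; these
# particular values are an assumption for illustration only.
kernel_options = {
    "BLOCK_M": 64,
    "BLOCK_N": 64,
}

# kernel_options take effect through the compiled Triton path.
compiled_flex = torch.compile(flex_attention)
out = compiled_flex(q, k, v, kernel_options=kernel_options)
```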
@KerwinKai Is it working properly now?