IAN

Results: 11 comments by IAN

The weights of this model are saved as external_data, but the problem still exists.

Does this code need to check whether DP attention is enabled? ![Image](https://github.com/user-attachments/assets/f90c18fb-22f1-4c89-8998-1d7a7223ae27)

Command to reproduce this error: `python3 -m sglang.launch_server --model /mnt/DeepSeek-R1 --tp 8 --trust-remote-code --enable-dp-attention`

```
import json

json_schema = json.dumps(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string", "pattern": "^[\\w]+$"},...
```

> [@hcyz33](https://github.com/hcyz33) I don't think this is due to constraint decoding. Could you check it several times? Also, [@FrankLeeeee](https://github.com/FrankLeeeee) could you take a look? Thanks! ![Image](https://github.com/user-attachments/assets/871342e9-e7d9-4d1b-ab85-ed96e7278adb) I added a...

I found that the hang happens because sampling_info_done never receives its signal, so the wait here blocks until it times out. ![Image](https://github.com/user-attachments/assets/6b7fd6ed-8151-4edf-adc6-12de8e2fa509) It seems the root cause is that...

I added an event set for the idle batch, as below. The hang has disappeared. However, structured output does not seem to take effect for some requests. Further...
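The hang and the idle-batch fix described above can be illustrated with a minimal sketch. This is not SGLang's code: `prepare_batch`, `wait_for_sampling_info`, `run`, and the `set_on_idle` flag are hypothetical names, and the sketch only assumes that the waiting side blocks on a `threading.Event` with a timeout.

```python
import threading


def wait_for_sampling_info(done: threading.Event, timeout: float) -> bool:
    """Return True if `done` was set before the timeout expired."""
    return done.wait(timeout)


def prepare_batch(is_idle: bool, done: threading.Event, set_on_idle: bool) -> None:
    """Hypothetical producer that signals `done` once sampling info is ready."""
    if is_idle:
        # An idle batch has nothing to prepare. Without the fix, this early
        # return never sets the event, so the waiter blocks until timeout.
        if set_on_idle:
            done.set()
        return
    # ... a real batch would prepare its sampling info here (assumption) ...
    done.set()


def run(is_idle: bool, set_on_idle: bool, timeout: float = 0.2) -> bool:
    """Run one producer/consumer round and report whether the signal arrived."""
    done = threading.Event()
    worker = threading.Thread(target=prepare_batch, args=(is_idle, done, set_on_idle))
    worker.start()
    signaled = wait_for_sampling_info(done, timeout)
    worker.join()
    return signaled
```

Calling `run(True, False)` reproduces the timeout for an idle batch, while `run(True, True)` returns as soon as the event is set.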

I forgot to call update_regex_vocab_mask. I added it before the sync, and the results are all correct now! I think I have fixed it. ![Image](https://github.com/user-attachments/assets/d82f3475-f575-4fb9-8360-ebd79585c561) ![Image](https://github.com/user-attachments/assets/f9967fff-5ff9-4f79-8b50-dc16f385e599)
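As a rough illustration of why a missing mask update lets unconstrained tokens through, here is a toy sketch. It does not show what `update_regex_vocab_mask` actually does; `apply_vocab_mask`, `greedy_pick`, and the sample values are hypothetical, and the sketch assumes the mask is a per-token allow list applied to the logits before sampling.

```python
import math


def apply_vocab_mask(logits, allowed):
    """Set the logit of every disallowed token to -inf."""
    return [x if ok else -math.inf for x, ok in zip(logits, allowed)]


def greedy_pick(logits):
    """Return the index of the highest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


logits = [2.5, 0.1, 1.7]
stale_mask = [True, True, True]    # mask never updated: every token allowed
fresh_mask = [False, True, True]   # updated mask: token 0 violates the constraint

unconstrained = greedy_pick(apply_vocab_mask(logits, stale_mask))
constrained = greedy_pick(apply_vocab_mask(logits, fresh_mask))
```

With the stale mask the highest-scoring token is chosen even though the constraint forbids it; with the refreshed mask the pick falls to the best allowed token.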

> Hi [@hcyz33](https://github.com/hcyz33), do you want to create a PR to fix this?

Yes, I will.

> ## Motivation
> * Support double sparsity (post-training sparse attention) for long context inference in SGLang
> * See [paper](https://arxiv.org/pdf/2408.07092)
>
> ## Modifications
> * Add triton implementation...

I hit the same bug when enabling NEXTN. ![Image](https://github.com/user-attachments/assets/b6db144e-b941-4979-a2e1-bc8b3f35cb1a)