Didan Deng comments

Results: 17 comments of Didan Deng

Multiple outputs AttributeError

I got a similar error. My model output is a dictionary consisting of the features and the classification output. I got an AttributeError: 'dict' object has no attribute 'size'.
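This error is what one would expect when a tool calls `.size()` directly on the model output on the assumption that it is a single tensor. Below is a minimal sketch of the failure and one possible workaround. `FakeTensor` is a hypothetical stand-in so the snippet runs without PyTorch, and `collect_sizes` is a hypothetical helper; neither belongs to any library, and the shapes are made up for illustration.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor; it only mimics .size()."""

    def __init__(self, *shape):
        self._shape = tuple(shape)

    def size(self):
        return self._shape


def model_forward():
    # Mirrors the model in the comment: a dict holding the features
    # and the classification output (shapes are assumed).
    return {"features": FakeTensor(1, 512), "logits": FakeTensor(1, 10)}


def collect_sizes(output):
    """Recursively gather sizes from tensors nested in dicts, lists, or tuples."""
    if isinstance(output, dict):
        return {key: collect_sizes(value) for key, value in output.items()}
    if isinstance(output, (list, tuple)):
        return [collect_sizes(item) for item in output]
    return output.size()


output = model_forward()

# Calling .size() on the dict itself reproduces the reported failure mode.
try:
    output.size()
except AttributeError as err:
    error_message = str(err)

# Walking the container instead yields one size per nested tensor.
sizes = collect_sizes(output)
```

An alternative under the same assumptions is to wrap the model so that its forward pass returns only the tensor the tool needs, for example the classification output.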

Why are the EM and F1 scores on the test set so low?

My model's results on the test set are also very poor, roughly: {"exact_match": 0.031065548306927617, "f1": 3.400328488853547}. Have you found a solution to this problem?

[Feature]: Support Ulysses Sequence Parallelism for Diffusion Models

> I think we can support TP and CP for diffusion models at first by re-using the parallelism interfaces in vllm. Then we can verify whether the CP interfaces like...

[Feature]: Support Ulysses Sequence Parallelism for Diffusion Models

I agree that we can consider implementing TP+CP first, using the available interface in vllm. As for Ulysses-SP, I am worried that the existing CP interfaces are entangled with TP....

[RFC]: Diffusion Acceleration API design

> [@wtomin](https://github.com/wtomin) I already have a PR [#115](https://github.com/vllm-project/vllm-omni/pull/115) for attention interface. Do you think it's flexible enough to support all kinds of parallelism?

I think it is flexible. We can...

[RFC]: Diffusion Acceleration API design

> Beyond the architecture, we also care about the interface of cache and attention

Yes. I think we can follow the philosophy of x-dit and cache-dit, but it is critical...

[RFC]: Diffusion Acceleration API design

Yes. I agree that we can keep the cache interface simple for TeaCache. No need to have a cache context manager or unified block for now.

[RFC]: Diffusion Acceleration API design

> > An expected usage example can be like
> >
> > export VLLM_ATTENTION_BACKEND=FLASH_ATTN
> > vllm server Qwen/QwenImage --omni --tp_size=2 --enable_usp --cp_size=2 --ring_size=2
> > Agreed
> I...

[RFC]: Diffusion Acceleration API design

> Let me know when you start working on this, if possible I would like to chip in on the cache implementations :)

I think you can raise a PR...

[RFC]: Proposal for Supporting Context Parallelism (RingAttention) and Parallelism Terminology Alignment in vLLM-Omni

I have an existing PR #189 that implements Ulysses SP on Qwen-Image. Can you also review this PR and give your suggestions on the interface design?