Juncheng Gu comments

Results: 11 comments of Juncheng Gu

Infiniswap over SoftRoCE

Infiniswap fails while creating the RDMA QP when running over SoftRoCE. The kernel-level IB interfaces may not have been adapted to SoftRoCE yet.

Question about cluster.empty_infra

@Rivendile Thanks for trying out Tiresias. The message should mean that the cluster cannot satisfy the resource requirement (especially GPU) of the top job in the queue. Would you...
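The condition described in this reply can be pictured with a minimal sketch. It is not Tiresias code: `Job`, `num_gpu`, and `can_place_top_job` are hypothetical names introduced only for illustration, and the sketch assumes the check compares the free GPU count against the request of the job at the head of the queue.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical job record; not the Tiresias job structure."""
    name: str
    num_gpu: int  # GPUs requested by the job


def can_place_top_job(free_gpus: int, queue: List[Job]) -> bool:
    """Return True if the head-of-queue job fits in the free GPUs.

    An empty queue is treated as trivially satisfiable.
    """
    if not queue:
        return True
    return queue[0].num_gpu <= free_gpus


# The top job asks for 8 GPUs while only 4 are free, so it cannot be placed.
queue = [Job("job-a", 8), Job("job-b", 2)]
print(can_place_top_job(4, queue))
```

Under this assumption, a later job that would fit (here `job-b`) does not change the result, because only the top job is checked.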

Fix torch.clamp issue #237

@CryptoSalamander, I would prefer to use the second option. I faced the same issue when using torch 2.0, and the `item()` method in the first option will lead torch.dynamo to...

[Bug]: NixlConnector should not skip short do_remote_prefill requests in connector metadata

@robertgshaw2-redhat, @wwl2755

[Bug]: NixlConnector should not skip short do_remote_prefill requests in connector metadata

> Thanks [@juncgu](https://github.com/juncgu), I think this should now be fixed by [#18631](https://github.com/vllm-project/vllm/pull/18631).

Hi @njhill, IMHO, #18631 is not relevant to the case here (when prompt < block_size and no remote...

[P/D] Support CPU Transfer in NixlConnector

@richardliaw, @robertgshaw2-redhat

[P/D] Support CPU Transfer in NixlConnector

@yaochengji, pushed accuracy and edge case tests. But they can only be tested offline.

[P/D] Support CPU Transfer in NixlConnector

Thanks for the reviews, @njhill.

> I'm unsure about the design of some of the abstractions. In particular:
>
> * Are there cases where the intermediate device would be...

[P/D] Support CPU Transfer in NixlConnector

Appreciate the insightful feedback, @njhill.

> Thanks @juncgu!
>
> > I agree. Having async and layer-wise h2d/d2h copies without blocking the next forward pass is very important. It will...

[P/D] Support CPU Transfer in NixlConnector

> Thanks @juncgu, have added a couple more responses.

Thanks, @njhill. As you suggested, I added `do_save_to_host` and `do_load_to_device` attributes in `ReqMeta/NixlConnectorMetadata` to specify the h2d/d2h copy operations when host...