jiaxingli
Thanks for your contribution; we appreciate it a lot. The following instructions will help make your pull request healthier and easier to get feedback on. If you do not understand...
### Describe the feature

The current code has a hard dependency on torch.cuda. We would like the interfaces to be changed so that NPU cards are also supported, i.e. compatibility with torch_npu.

### Will you implement it?

- [ ] I would like to implement this feature and create a PR!
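One way to remove the hard torch.cuda dependency is a small device-detection helper that prefers torch_npu when it is installed. The sketch below is purely illustrative (the function name `detect_accelerator` is hypothetical, not an existing API in this project), and it avoids importing torch unless it is available:

```python
import importlib.util

def detect_accelerator() -> str:
    """Hypothetical helper: return the name of the first available backend.

    Checks for torch_npu before falling back to CUDA or CPU, so callers
    need not hard-code torch.cuda.
    """
    # If torch itself is missing, only the CPU path makes sense.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch
    # torch_npu registers the "npu" device with torch when imported.
    if importlib.util.find_spec("torch_npu") is not None:
        return "npu"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"

print(detect_accelerator())
```

Callers would then use the returned backend name (e.g. `torch.device(detect_accelerator())`) instead of assuming `"cuda"` everywhere.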
Special note: the technical approach of this module is based on veScale checkpoint and ByteCheckpoint.

veScale: https://github.com/volcengine/veScale/tree/main
ByteCheckpoint: https://arxiv.org/abs/2407.20143

# Universal Checkpoint System

The universal ckpt system is independent of, and incompatible with, the original ckpt system.

## Basic features

Dynamic loading of model ckpt and optimizer ckpt for dense models across parallel configurations:

- [x] GPU world size
- [x] tensor parallel
- [x] pipeline parallel...
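The core idea behind loading a checkpoint under a different parallel configuration can be sketched as resharding: merge the shards saved under one tensor-parallel degree, then re-split the flat parameter for the new degree. This is a minimal toy sketch (plain Python lists standing in for tensors; `reshard` is a hypothetical name, not the project's actual API):

```python
def reshard(shards, new_tp_degree):
    """Toy resharding: merge old shards, split evenly for a new TP degree.

    shards: list of lists, each a contiguous 1-D slice of a flat parameter.
    """
    # Concatenate the old shards back into the full flat parameter.
    full = [x for shard in shards for x in shard]
    assert len(full) % new_tp_degree == 0, "parameter not divisible by new degree"
    n = len(full) // new_tp_degree
    # Re-split into contiguous slices for the new tensor-parallel degree.
    return [full[i * n:(i + 1) * n] for i in range(new_tp_degree)]

# A parameter saved with TP=2, reloaded with TP=4:
old = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(reshard(old, 4))  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

Real implementations such as veScale checkpoint additionally track layout metadata per shard so that merging and re-splitting work for 2-D (row/column) sharding, not just contiguous 1-D slices as here.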
### Your current environment

python 3.10, H100

```text
python -c 'import vllm'
INFO 03-18 16:16:30 __init__.py:183] Automatically detected platform cuda.
Traceback (most recent call last):
  File "", line 1, in...
```