**Describe the issue**: I want to run the DARTS example on multiple GPUs, so I wrapped the model with DDP and split the data with DistributedSampler. However, I found the final 2...
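For reference, the usual wiring for this setup looks like the sketch below; `train`, `model`, and `dataset` are placeholders rather than the DARTS example's actual objects, and the script is assumed to be launched with `torchrun`.

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def train(model, dataset):
    # One process per GPU; rank and world size come from the torchrun environment.
    dist.init_process_group("nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    model = DDP(model.cuda(rank), device_ids=[rank])
    sampler = DistributedSampler(dataset)   # each rank sees a distinct shard
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for epoch in range(3):
        sampler.set_epoch(epoch)             # reshuffle the shards every epoch
        for x, y in loader:
            opt.zero_grad()
            loss = torch.nn.functional.cross_entropy(
                model(x.cuda(rank)), y.cuda(rank))
            loss.backward()                  # DDP all-reduces gradients here
            opt.step()
```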
**Describe the issue**: When I tried to use NAS, it reported 'ImportError: Cannot use a path to identify something from __main__.' and 'ValueError: Pickle too large when trying to dump...
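This error usually means the model class is defined in the script being executed directly, so the NAS framework cannot serialize it by reference from `__main__` and falls back to pickling the whole object. A minimal sketch of the common workaround, assuming that is the cause here (file names are illustrative): define the model in its own importable module.

```python
# model_def.py (illustrative file name): the model lives in an importable module.
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(32, 2)

    def forward(self, x):
        return self.fc(x)
```

```python
# main.py: the entry script only imports the model, so nothing model-related
# is defined under __main__.
from model_def import Net

model = Net()
```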
In hetero-LR settings, an arbiter role is needed. I would like to know which party I should assign this role to. In the tutorial, some choose the guest to be the...
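For orientation, the role section of a FATE job configuration looks roughly like the fragment below (written here as a Python dict; the party IDs are made up). In many published examples the arbiter is collocated with the host, but that is a deployment choice rather than a hard requirement.

```python
# Illustrative FATE job-conf "role" fragment; party IDs are invented.
role = {
    "guest":   [9999],    # party that initiates the job and holds the labels
    "host":    [10000],   # data-providing party
    "arbiter": [10000],   # coordinator for hetero-LR; often collocated with the host
}
```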
**Describe the issue**: I used 2 GPUs to train DARTS, but from the output I see that I get 2 different results. I also used 'export_onnx', but I didn't get...
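One plausible explanation: every DDP rank logs its own metrics, so two GPUs naturally produce two (slightly different) result lines, and any export step has to be guarded so that only one rank writes the file. A rough sketch, where `model` stands for the trained DDP-wrapped network and the input shape is invented:

```python
import torch
import torch.distributed as dist

torch.manual_seed(0)  # same seed on every rank so evaluations are comparable

if dist.get_rank() == 0:
    # Only rank 0 writes the file; concurrent ranks would race on it.
    dummy = torch.randn(1, 3, 32, 32).cuda()               # illustrative input shape
    torch.onnx.export(model.module, dummy, "darts.onnx")   # unwrap DDP before export
dist.barrier()  # keep the other ranks alive until the export finishes
```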
**Problem Description**: Installing the pycocotools dependency fails in a pure intranet (offline) environment. **Environment Information**: OS: Red Hat Enterprise Linux 7.7; Python: 3.10.9.
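The usual workaround for an air-gapped host is to download the wheels on an internet-connected machine with the same Python version and OS family, copy them over, and install with the package index disabled. A sketch of that workflow (the `./wheels` path is illustrative); note that if no prebuilt wheel matches the platform, pycocotools compiles from source and needs gcc plus the Python headers on the target machine.

```python
import subprocess, sys

# Step 1 -- on a machine WITH internet access (same Python 3.10, same OS family):
subprocess.run([sys.executable, "-m", "pip", "download",
                "pycocotools", "-d", "./wheels"], check=True)

# Step 2 -- after copying ./wheels onto the intranet host:
subprocess.run([sys.executable, "-m", "pip", "install", "--no-index",
                "--find-links", "./wheels", "pycocotools"], check=True)
```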
When I tried CodeLlama-7b and CodeLlama-34b to test code completion, all results were garbled. Environment: OS: Red Hat 4.8.5-36; GCC: 4.8.5; 32G V100; CUDA: 11.7; torch: 2.0.0; fairscale: 0.4.13; sentencepiece: 0.1.99...
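A frequent cause of garbled generations on V100-class GPUs is running the weights in bfloat16, which Volta does not support natively; forcing float16 is worth trying. A sketch using the Hugging Face port of CodeLlama (this assumes transformers is an option; the fairscale-based reference repo would need the equivalent dtype change):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "codellama/CodeLlama-7b-hf"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.float16,  # V100 has no native bf16; fp16 avoids junk logits
    device_map="auto",
)

prompt = "def fibonacci(n):"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```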
I used tensor_parallel to fine-tune a Qwen model with LoRA in a tensor-parallel way. However, it cannot save the model at the end. Could you provide any help? Thanks.
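If this is the `tensor_parallel` package from PyPI, its documented way to save is to gather the shards back into an ordinary state dict with the `save_tensor_parallel` context manager before calling `torch.save`. A sketch under that assumption (model name and output path are illustrative; if the installed version lacks this helper, the approach does not apply):

```python
import torch
import tensor_parallel as tp
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
model = tp.tensor_parallel(model, ["cuda:0", "cuda:1"])

# ... LoRA fine-tuning happens here ...

# Gather sharded weights into a regular (non-parallel) state dict before saving;
# saving the wrapped module directly yields per-GPU shards that are hard to reload.
with tp.save_tensor_parallel(model):
    torch.save(model.state_dict(), "qwen_lora.pt")
```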
I used the following script for training; the dataset has about 33,000 samples, with per_device_batch_size=16, gradient_accumulation_steps=32, epochs=3, on 4 GPUs. nproc_per_node=4 NPROC_PER_NODE=$nproc_per_node \ CUDA_VISIBLE_DEVICES=0,1,2,3 \ swift pt \ --model Qwen/Qwen2.5-7B \ --train_type full \ --dataset $CUSTOM_DATASET \ --torch_dtype bfloat16 \ --num_train_epochs 3 \ --per_device_train_batch_size 16 \ --per_device_eval_batch_size 1 \...
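As a sanity check on this schedule (assuming ms-swift follows the usual transformers convention that the effective batch is per-device batch × gradient accumulation × number of GPUs): 16 × 32 × 4 = 2048 samples per optimizer step, so one epoch over ~33,000 samples is only about 16 steps, roughly 48 in total.

```python
# Quick arithmetic for the training run described above.
per_device_batch = 16
grad_accum = 32
world_size = 4            # 4 GPUs
dataset_size = 33_000
epochs = 3

effective_batch = per_device_batch * grad_accum * world_size   # 2048
steps_per_epoch = dataset_size // effective_batch              # 16
total_steps = steps_per_epoch * epochs                         # 48
print(effective_batch, steps_per_epoch, total_steps)
```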