krator

Results 6 issues of krator

Single machine, 8 GPUs, NPROC_PER_NODE=4. Resuming training after an interruption fails with the following error:
```bash
Traceback (most recent call last):
  File "~/swift/swift/cli/sft.py", line 5, in <module>
    sft_main()
  File "~/swift/swift/utils/run_utils.py", line 31, in x_main
    result = llm_x(args, **kwargs)
  File "~/swift/swift/llm/sft.py", line 228, in llm_sft...
```

pending

When fine-tuning qwen1.5-14B with swift, the initial run works fine, but resuming from a checkpoint fails with the following error:
```bash
[INFO:swift] Setting model.config.use_cache: False
[WARNING:modelscope] Reusing dataset dataset_builder (/home/devops/.cache/modelscope/hub/datasets/AI-ModelScope/hh_rlhf_cn/master/data_files)
[INFO:modelscope] Generating dataset dataset_builder (/home/devops/.cache/modelscope/hub/datasets/AI-ModelScope/hh_rlhf_cn/master/data_files)
[INFO:modelscope] Reusing cached meta-data file: /home/devops/.cache/modelscope/hub/datasets/AI-ModelScope/hh_rlhf_cn/master/data_files/042c234b69de5779cdd75934ad9c9a94
Traceback (most recent call last):
  File "/data/homedir/work/swift-play/swift/swift/cli/sft.py", line...
```

bug
postpone

### Describe your problem
MinIO is good enough for individual use, but in business scenarios we want a more flexible file storage service, such as S3 or another provider. This may...

Feature

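**Describe the bug**
Since LISA support was added, 8 V100s can run full-parameter fine-tuning of a 32B model. I have always installed swift from source on the main branch. In the last few days I found that a fine-tuning command with the same arguments, which used to run, now fails with an out-of-memory error on the GPU.
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 NPROC_PER_NODE=1 \
swift sft --sft_type full \
    --model_type qwen1half-32b-chat \
    --dataset ms-bench \
    --train_dataset_sample 5000 \
    --self_cognition_sample 1000 \
    --logging_steps 5 \
    --max_length...
```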

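The Qwen1.5 14B model updated its tokenizer.json file a few days ago. When I ran it today, all of the weight files were downloaded again along with it. Is this a feature or a bug? Is there an option to control this behavior? Thanks!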

Stale

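When training with a custom dataset added, I found that GPU memory would occasionally run out. Training only ran normally after I manually removed the longer samples. However, I did pass the --max_length parameter, and it does not seem to have taken effect. Does --max_length not apply to custom datasets? If it really does not, could this option be added?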
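
For anyone hitting the same problem, the manual workaround described above (dropping the longer samples before training) might look roughly like the sketch below. It is only an illustration and not swift's own behavior or API: the `query` and `response` field names, the `drop_long_samples` helper, and the whitespace-based token counter are all assumptions made for the example. In practice the counter would be replaced with the model's real tokenizer.

```python
from typing import Callable, Dict, List


def whitespace_token_count(text: str) -> int:
    # Stand-in tokenizer (assumption): counts whitespace-separated pieces.
    # Replace with the real model tokenizer for meaningful lengths.
    return len(text.split())


def drop_long_samples(
    samples: List[Dict[str, str]],
    max_length: int,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[Dict[str, str]]:
    # Keep only samples whose combined query + response length fits the limit.
    # The "query"/"response" keys are hypothetical field names.
    kept = []
    for sample in samples:
        total = count_tokens(sample["query"]) + count_tokens(sample["response"])
        if total <= max_length:
            kept.append(sample)
    return kept


samples = [
    {"query": "hello there", "response": "hi"},
    {"query": "a " * 50, "response": "b " * 50},  # 100 tokens, too long
]
filtered = drop_long_samples(samples, max_length=10)
print(len(filtered))
```

Filtering ahead of time like this only sidesteps the memory spikes; it does not answer whether --max_length is applied to custom datasets.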