LYCnight

13 comments of LYCnight

> I had the same issue before. It is probably caused by embeddings with a different dimension already being stored in the Chroma DB. I fixed that by removing the...
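
A minimal sketch of that kind of fix, assuming Chroma's persistent client; the persist path and collection name are placeholders, not from the original comment. Dropping the collection discards the embeddings stored with the old dimension so new ones no longer conflict.

```python
import chromadb

# Hypothetical persist path and collection name -- adjust to your setup.
client = chromadb.PersistentClient(path="./chroma_db")

# Deleting the collection removes the embeddings stored with the old
# dimension; get_or_create_collection() then starts it fresh, so adds
# with the new embedding dimension succeed.
client.delete_collection(name="my_collection")
collection = client.get_or_create_collection(name="my_collection")
```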

> @LYCnight Roughly how much GPU memory should training need? I have 8 * 80G and all of it is maxed out.

> @LYCnight Following your environment I got training to start successfully, but why are the files left after training so huge? I'm running out of storage space… -rw-r--r-- 1 root root 4984147224 Sep 5 10:49 model-00001-of-00004.safetensors -rw-r--r-- 1 root root 4895071360 Sep 5 10:49 model-00002-of-00004.safetensors -rw-r--r-- 1 root root 4895071384 Sep 5 10:49 model-00003-of-00004.safetensors...
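
A diagnostic sketch, not from the thread: a checkpoint roughly doubles in size when weights are written out as float32 instead of bfloat16, and the dtypes inside a shard can be inspected with the safetensors library.

```python
from safetensors import safe_open

# Point this at one of the shards listed above.
shard = "model-00001-of-00004.safetensors"

with safe_open(shard, framework="pt") as f:
    for name in list(f.keys())[:5]:  # sample a few tensors
        tensor = f.get_tensor(name)
        print(name, tensor.dtype, tuple(tensor.shape))
```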

> > You could try the method mentioned here: https://stackoverflow.com/questions/77606417/openai-api-request-with-proxy
>
> OK, thanks, I'll give it a try.
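
A sketch of the proxy approach from that Stack Overflow link, assuming the openai v1 Python SDK; the proxy URL is a placeholder.

```python
import httpx
from openai import OpenAI

# Placeholder proxy address -- replace with your own.
proxy_url = "http://127.0.0.1:7890"

# The v1 SDK accepts a custom httpx client, which is where the proxy
# goes; recent httpx versions use `proxy=`, older ones spell it `proxies=`.
client = OpenAI(
    api_key="sk-...",
    http_client=httpx.Client(proxy=proxy_url),
)

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```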

Cross-page paragraphs like this one cannot be merged:

> > How about setting stage3_prefetch_bucket_size to 15099494 in the deepspeed config?
>
> That works, but it raises a new error: RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288
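
For orientation, a sketch of where that knob sits in a ZeRO-3 DeepSpeed config. Only stage3_prefetch_bucket_size and its value come from the exchange above; the surrounding keys are a typical skeleton, not a verified config.

```python
import json

ds_config = {
    "bf16": {"enabled": True},  # assumed precision setting
    "zero_optimization": {
        "stage": 3,
        # The value suggested in the exchange above.
        "stage3_prefetch_bucket_size": 15099494,
    },
    # "auto" lets the HF Trainer integration fill these in.
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```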

# Could the maintainers please check the tokenizer

I have already tried all of the official methods. My current situation is:

- transformers==4.33.0
- pytorch==2.2.0
- /patch/modeling_chatglm.py has replaced /root/AI4E/share/glm-4-9b/modeling_chatglm.py

But at runtime it raises a KeyError: '', so I believe the problem is the tokenizer. **Could the maintainers please check the tokenizer**

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

path =...
```
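
Since the excerpt cuts off at `path =...`, here is a hedged guess at a minimal repro for the tokenizer check; the path and the inspection step are assumptions, not the original code.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Hypothetical model path -- the original snippet is truncated here.
path = "/root/AI4E/share/glm-4-9b"

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval()

# An empty-string entry here would be consistent with the
# KeyError: '' reported below.
print(tokenizer.special_tokens_map)
```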

# Attached: the error output from running './scripts/glm4_longwriter.sh'

```
KeyError: ''
Using unk_token, but it is not set yet.
Traceback (most recent call last):
  File "/root/AI4E/ljc/LongWriter/train/main.py", line 139, in <module>
    train()
  File "/root/AI4E/ljc/LongWriter/train/main.py", line 121,...
```

> > > How about setting stage3_prefetch_bucket_size to 15099494 in the deepspeed config?
> >
> > That works, but it raises a new error: RuntimeError: shape '[32768, -1, 1, 32, 2]' is invalid for input of size 524288
>
> I ran into exactly the same error as you: Traceback...