Daya Guo
We haven't tried LoRA fine-tuning ourselves; you can try an open-source framework such as peft.
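To illustrate what a framework like peft does under the hood, here is a minimal numpy sketch of the LoRA idea (hypothetical shapes and hyperparameters, not DeepSeek's training code): the full weight `W` stays frozen, and only a low-rank pair `A`, `B` is trained, with `(alpha / r) * B @ A` added on top.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 8, 4, 8  # illustrative sizes, not real model dims

W = rng.normal(size=(d_in, d_out))      # frozen base weight
A = rng.normal(size=(r, d_out)) * 0.01  # trainable low-rank factor
B = np.zeros((d_in, r))                 # trainable; zero-init so the update starts at 0

def lora_forward(x):
    # Base projection plus the scaled low-rank update.
    return x @ W + (alpha / r) * (x @ B @ A)

x = rng.normal(size=(2, d_in))
y = lora_forward(x)  # shape (2, d_out); equals x @ W while B is still zero
```

Because `B` is zero-initialized, the adapted model starts out identical to the base model; training then only updates `A` and `B`, which is why LoRA needs far less memory than full-parameter fine-tuning.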
Please check our technical report: https://arxiv.org/pdf/2401.14196.pdf
1. It is full-parameter fine-tuning. 2. For the 33B model you generally need 80 GB of GPU memory, but with pipeline parallelism (which is slower), 40 GB also works.
The format looks slightly off; please see this README: https://github.com/deepseek-ai/DeepSeek-Coder#2-code-insertion
> "${preprefix}${prefix}${suffix}" vs. "${prefix}${suffix}"
>
> > The format looks slightly off; please see this README: https://github.com/deepseek-ai/DeepSeek-Coder#2-code-insertion
>
> It looks consistent with the official docs. Where exactly is the problem with this FIM format?

Compared with "${prefix}${suffix}", your prompt has an extra "${preprefix}".
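As a concrete sketch of the correct layout (special-token spellings taken from the linked README; the example code snippet itself is made up), the FIM prompt is just prefix and suffix wrapped by the insertion tokens, with no extra "preprefix" segment:

```python
# Build a fill-in-the-middle prompt for DeepSeek-Coder.
prefix = "def fib(n):\n    if n < 2:\n        return n\n"
suffix = "\n    return fib(n - 1) + fib(n - 2)"

# Only two user-supplied segments: the prefix and the suffix.
prompt = f"<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"
```

The model then generates the code that belongs at the `fim▁hole` position.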
DeepSeek-Coder supports taking an entire repository as input, but the total length should not exceed 16k. The specific approach is to concatenate the files, writing each file's path at the beginning...
Please update your `transformers` package.
Have you tried this? > If bitsandbytes doesn't work, [install it from source](https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md). Windows users can follow https://github.com/tloen/alpaca-lora/issues/17.
The fix is done; please check again.
You need to use a GPU; inference is very slow on CPU.