Daya Guo
We haven't tried LoRA fine-tuning ourselves; you can try an open-source framework such as peft.
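To illustrate what a framework like peft does under the hood, here is a minimal numpy sketch of the LoRA idea (hypothetical shapes and hyperparameters, not DeepSeek's training code): the full weight `W` stays frozen, and only a low-rank pair `A`, `B` is trained, with `(alpha / r) * B @ A` added on top.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 8, 4, 8  # illustrative sizes, not real model dims

W = rng.normal(size=(d_in, d_out))      # frozen base weight
A = rng.normal(size=(r, d_out)) * 0.01  # trainable low-rank factor
B = np.zeros((d_in, r))                 # trainable; zero-init so the update starts at 0

def lora_forward(x):
    # Base projection plus the scaled low-rank update.
    return x @ W + (alpha / r) * (x @ B @ A)

x = rng.normal(size=(2, d_in))
y = lora_forward(x)  # shape (2, d_out); equals x @ W while B is still zero
```

Because `B` is zero-initialized, the adapted model starts out identical to the base model; training then only updates `A` and `B`, which is why LoRA needs far less memory than full-parameter fine-tuning.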
Please check our technical report: https://arxiv.org/pdf/2401.14196.pdf
1. It is full-parameter fine-tuning. 2. For the 33B model you generally need 80 GB of GPU memory, but with pipeline parallelism (which is slower), 40 GB also works.
The format looks slightly off; please see this README: https://github.com/deepseek-ai/DeepSeek-Coder#2-code-insertion
> "${preprefix}${prefix}${suffix}" vs. "${prefix}${suffix}"
>
> > The format looks slightly off; please see this README: https://github.com/deepseek-ai/DeepSeek-Coder#2-code-insertion
>
> It looks consistent with the official docs. Where exactly is the problem with this FIM format?

Compared with "${prefix}${suffix}", your prompt has an extra "${preprefix}".
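As a concrete sketch of the correct layout (special-token spellings taken from the linked README; the example code snippet itself is made up), the FIM prompt is just prefix and suffix wrapped by the insertion tokens, with no extra "preprefix" segment:

```python
# Build a fill-in-the-middle prompt for DeepSeek-Coder.
prefix = "def fib(n):\n    if n < 2:\n        return n\n"
suffix = "\n    return fib(n - 1) + fib(n - 2)"

# Only two user-supplied segments: the prefix and the suffix.
prompt = f"<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"
```

The model then generates the code that belongs at the `fim▁hole` position.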
DeepSeek-Coder supports taking an entire repository as input, but the total length should not exceed 16k. The specific approach is to concatenate the files, writing each file's path at the beginning...
Please update your `transformers` package.
Have you tried this? > If bitsandbytes doesn't work, [install it from source](https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md). Windows users can follow https://github.com/tloen/alpaca-lora/issues/17.
The fix is done; please check again.
You need to use a GPU; inference is very slow on CPU.