ZHU QIHAO
OK, thank you very much.
You can see https://github.com/EleutherAI/gpt-neox/blob/FIM-clean/megatron/data/gpt2_dataset.py#L339. We apply the FIM corruption at the character level.
After the context is cut into 4096-token chunks, FIM is applied at the document level.
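Roughly, the character-level FIM looks like the sketch below (the sentinel strings, the 0.5 FIM rate, and the function name are illustrative assumptions; the real logic lives in the gpt2_dataset.py linked above):

```python
import random

# Hypothetical sentinel strings for illustration; in the real pipeline these
# are dedicated special tokens in the vocabulary.
FIM_PREFIX, FIM_MIDDLE, FIM_SUFFIX = "<fim_prefix>", "<fim_middle>", "<fim_suffix>"

def fim_transform(doc: str, fim_rate: float = 0.5) -> str:
    """Character-level FIM: split one document at two random character offsets."""
    if len(doc) < 2 or random.random() > fim_rate:
        return doc  # leave this document untouched
    lo, hi = sorted(random.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:lo], doc[lo:hi], doc[hi:]
    # PSM layout: the model sees prefix and suffix, then learns to fill the middle.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"
```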
The SFT model still works with a plain completion prompt like "const printHelloWorld = () => {}"; do not add the "### Instruction" prefix.
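For example, a minimal completion-style sketch with transformers (the checkpoint name and generation settings here are assumptions, not the only valid setup):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-6.7b-instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Raw code prefix, no "### Instruction" wrapper.
prompt = "const printHelloWorld = () => {"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```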
You can load the model directly from GGUF. The GGUF models are here: https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF
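A minimal sketch with llama-cpp-python, assuming you downloaded one of the quantized files from that repo (the exact file name depends on which quantization you pick):

```python
from llama_cpp import Llama

# Path to a quant downloaded from TheBloke/deepseek-coder-6.7B-instruct-GGUF;
# adjust the file name to match the quantization you chose.
llm = Llama(model_path="./deepseek-coder-6.7b-instruct.Q4_K_M.gguf", n_ctx=4096)

out = llm("// write a hello-world function in JavaScript\n", max_tokens=64)
print(out["choices"][0]["text"])
```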
Due to a Jinja2 version issue, the template code in the original tokenizer_config.json had some problems. It has now been fixed and uploaded; please re-download and retry this case.
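To check the fix, a quick sketch that forces a fresh download of the repaired tokenizer_config.json and renders the chat template (the model id here is an assumption):

```python
from transformers import AutoTokenizer

# force_download=True makes sure the repaired tokenizer_config.json is fetched.
tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/deepseek-coder-6.7b-instruct",
    trust_remote_code=True,
    force_download=True,
)

messages = [{"role": "user", "content": "write a quick sort algorithm in python"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # should render without a Jinja2 TemplateError now
```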
char_voc.pkl and nl_voc.pkl are the vocabulary files used to tokenize code for the code readers.
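As a rough sketch of loading such pickled vocabularies (their internal structure is an assumption here, commonly a token-to-id dict, so inspect them first):

```python
import pickle

# Load the two vocabularies; their exact structure is an assumption,
# so print a few entries to check before tokenizing with them.
with open("char_voc.pkl", "rb") as f:
    char_voc = pickle.load(f)
with open("nl_voc.pkl", "rb") as f:
    nl_voc = pickle.load(f)

print(type(char_voc), len(char_voc))
if isinstance(char_voc, dict):
    print(list(char_voc.items())[:5])
```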
It is probably a "CUDA out of memory" error.
If you want to change the batch size, change the number in the "args" dict. If you want to use multiple GPUs, you need to modify "model...
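For illustration, a sketch of both changes (the "args" keys, the dummy model, and the DataParallel wrapper are assumptions based on the description above, not the script's actual code):

```python
import torch
import torch.nn as nn

# Assumed structure: the training script keeps hyperparameters in a dict "args".
args = {"batch_size": 8}
args["batch_size"] = 2  # smaller batches reduce the chance of a CUDA OOM

# Dummy stand-in for the script's model; replace with the real one.
model = nn.Linear(16, 16)

# Wrapping in DataParallel splits each batch across all visible GPUs.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
if torch.cuda.is_available():
    model = model.cuda()
```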
How many GPUs are you using? And what is the batch size?