
support ChatGLM

Open nghuyong opened this issue 2 years ago • 6 comments

ChatGLM is a popular ChatGPT-like model in Chinese: https://github.com/THUDM/ChatGLM-6B

Could ct2 support ChatGLM and speed up its inference? Thanks a lot.

nghuyong avatar Apr 29 '23 12:04 nghuyong

+1

BrightXiaoHan avatar May 04 '23 10:05 BrightXiaoHan

Hi @nghuyong, I created a repo, fast-chatglm. I wrote a script to convert ChatGLM based on the Llama converter, but the decoding results are poor, probably because some operators do not match. If you're interested, we can work together to take a look.

BrightXiaoHan avatar May 17 '23 04:05 BrightXiaoHan

Hi, @guillaumekln. If you have time, could you please help take a look at where the problem is with the conversion script? We would greatly appreciate it. HF repo: https://huggingface.co/THUDM/chatglm-6b/tree/main

BrightXiaoHan avatar May 17 '23 04:05 BrightXiaoHan

coool, let me learn

nghuyong avatar May 17 '23 04:05 nghuyong

ChatGLM is not exactly a Llama model. There are several differences that are not (yet?) supported in CTranslate2:

  • position_encoding_2d is not implemented
  • the residual connection is different from other models: they add the layer norm output instead of the layer input, and they also apply a scale value
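
To make the residual difference concrete, here is a minimal sketch in plain Python, not the CTranslate2 or ChatGLM implementation. The function names are hypothetical, the layer norm omits learned weights, and applying the scale `alpha` to the layer norm output (rather than somewhere else) is an assumption based on my reading of the ChatGLM-6B modeling code.

```python
import math


def layer_norm(x, eps=1e-5):
    """Layer norm over one vector, without learned gain/bias (simplification)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def standard_pre_norm_residual(x, sublayer):
    """Usual pre-norm block: the raw layer input is added back."""
    out = sublayer(layer_norm(x))
    return [xi + oi for xi, oi in zip(x, out)]


def chatglm_style_residual(x, sublayer, alpha):
    """Assumed ChatGLM variant: the layer norm output, scaled by alpha, is added back."""
    normed = layer_norm(x)
    out = sublayer(normed)
    return [alpha * ni + oi for ni, oi in zip(normed, out)]


if __name__ == "__main__":
    # Toy sublayer standing in for attention or the feed-forward network.
    halve = lambda h: [0.5 * v for v in h]
    x = [1.0, 2.0, 3.0, 4.0]
    print(standard_pre_norm_residual(x, halve))
    print(chatglm_style_residual(x, halve, alpha=2.0))
```

The two functions give different outputs for the same input and sublayer, which is consistent with a converter that reuses the Llama layer layout decoding poorly.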

I'm also not sure about this PrefixEncoder, but it does not seem to be enabled for the current model.

guillaumekln avatar May 17 '23 08:05 guillaumekln

The PrefixEncoder can be ignored, as it is only necessary when loading additional p-tuning parameters. Will ctranslate2 support position_encoding_2d in the foreseeable future?
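
For readers unfamiliar with position_encoding_2d, the sketch below shows how the two position id sequences could be built for a single sequence. It follows the GLM description as I understand it: the first sequence counts positions in the context and then stays fixed at the mask position, while the second is 0 over the context and counts 1, 2, 3, ... over the generated tokens. The function name and arguments are hypothetical, and how each sequence is then fed to the rotary embedding is not covered here.

```python
def glm_2d_position_ids(context_length, total_length, mask_position):
    """Return (position_ids, block_position_ids) for one sequence.

    Hypothetical helper: context_length is the number of prompt tokens,
    total_length includes generated tokens, and mask_position is the index
    of the mask token inside the prompt.
    """
    generated = total_length - context_length
    # First dimension: normal positions over the prompt, then frozen at the mask.
    position_ids = list(range(context_length)) + [mask_position] * generated
    # Second dimension: 0 over the prompt, then 1..generated over the new tokens.
    block_position_ids = [0] * context_length + list(range(1, generated + 1))
    return position_ids, block_position_ids


if __name__ == "__main__":
    pos, block = glm_2d_position_ids(context_length=4, total_length=7, mask_position=2)
    print(pos)    # [0, 1, 2, 3, 2, 2, 2]
    print(block)  # [0, 0, 0, 0, 1, 2, 3]
```

An engine that assumes a single monotonically increasing position per token cannot represent this pair directly, which is why it needs dedicated support.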

BrightXiaoHan avatar May 17 '23 08:05 BrightXiaoHan