Michael Zhang

Results: 1 issue by Michael Zhang

Added parallel code for chatglm-6B. Because the model has relatively few parameters, parallel inference is slower than loading the model on a single card, but the code can serve as a reference for GLM...