Michael Zhang
Results: 1 issue of Michael Zhang
Added parallel code for chatglm-6B. Because the model has a small number of parameters, parallel inference is slower than loading the model on a single card, but the code can be used as a reference in GLM...