MOSS RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

Running the example code on Colab, this call raises the error: outputs = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.8, repetition_penalty=1.1, max_new_tokens=128)

Apr 21 '23 16:04 ccyhxg

Half-precision (fp16) inference cannot run on the CPU. You need to move both the model and the inputs to your GPU, or set the model's dtype to torch.float32.
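A minimal sketch of the two options above, assuming `model` is a PyTorch module and `inputs` is the tokenizer output (a mapping from names to tensors). The helper names `pick_device_and_dtype` and `prepare` are hypothetical and are not part of MOSS or transformers; they only illustrate keeping the device and dtype consistent.

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool) -> Tuple[str, str]:
    """Return a (device, dtype name) pair that avoids fp16 on the CPU."""
    if cuda_available:
        # fp16 kernels such as LayerNorm are available on the GPU.
        return "cuda", "float16"
    # On the CPU, fall back to float32 as suggested above.
    return "cpu", "float32"


def prepare(model, inputs):
    """Move a model and its tokenized inputs to one consistent device/dtype."""
    import torch  # imported lazily so the selection logic has no dependency

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    model = model.to(device=device, dtype=getattr(torch, dtype_name))
    # Token ids and masks are integer tensors, so only their device changes.
    inputs = {name: tensor.to(device) for name, tensor in inputs.items()}
    return model, inputs
```

With this in place, something like `model, inputs = prepare(model, inputs)` followed by the original `model.generate(**inputs, ...)` call should avoid mixing a Half model with CPU execution.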

Apr 21 '23 16:04 00INDEX

I have a GPU, but it feels like it isn't being used?

Apr 21 '23 16:04 ccyhxg

It looks like this is Colab's fault.

Apr 21 '23 17:04 ccyhxg

You may need to run:

model = model.cuda()
inputs["input_ids"] = inputs["input_ids"].cuda()
inputs["attention_mask"] = inputs["attention_mask"].cuda()

Apr 21 '23 17:04 00INDEX

No need, it's Colab's fault. It shows a GPU, but there isn't actually a usable one:

Apr 21 '23 17:04 ccyhxg

RuntimeError: CUDA error: no kernel image is available for execution on the device CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
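This error usually means the installed PyTorch build has no compiled kernels for the GPU's compute capability, rather than the GPU being absent. A rough diagnostic sketch follows. The helper names `build_covers_device` and `report_cuda_status` are hypothetical, and the coverage rule is a simplified heuristic (same-major `sm_` binaries with a lower or equal minor version, or older `compute_` PTX), not an official check.

```python
from typing import Iterable, Optional, Tuple


def _parse_arch(arch: str) -> Optional[Tuple[str, int, int]]:
    """Split entries such as 'sm_75' or 'compute_80' into (kind, major, minor)."""
    kind, _, digits = arch.partition("_")
    if kind not in ("sm", "compute") or len(digits) < 2 or not digits.isdigit():
        return None  # skip entries this sketch does not understand, e.g. 'sm_90a'
    return kind, int(digits[:-1]), int(digits[-1])


def build_covers_device(capability: Tuple[int, int], arch_list: Iterable[str]) -> bool:
    """Heuristically decide whether a build's arch list can run on the device."""
    major, minor = capability
    for arch in arch_list:
        parsed = _parse_arch(arch)
        if parsed is None:
            continue
        kind, a, b = parsed
        # Compiled binaries run on the same major version with minor >= theirs.
        if kind == "sm" and a == major and b <= minor:
            return True
        # PTX can be JIT-compiled for any equal or newer architecture.
        if kind == "compute" and (a, b) <= (major, minor):
            return True
    return False


def report_cuda_status() -> str:
    """Summarize what PyTorch sees on device 0."""
    import torch  # imported lazily so the logic above has no dependency

    if not torch.cuda.is_available():
        return "No usable CUDA device: run on CPU with torch.float32."
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    name = torch.cuda.get_device_name(0)
    covered = build_covers_device(capability, arch_list)
    return f"{name}: capability {capability}, build targets {arch_list}, covered={covered}"
```

Running `print(report_cuda_status())` in the Colab notebook would show whether the assigned GPU falls outside the architectures the installed PyTorch wheel was built for.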

Apr 21 '23 17:04 ccyhxg

It still doesn't work. Could you try running the example code on Colab? It doesn't seem to run through~

Apr 21 '23 17:04 ccyhxg

@00INDEX Should we provide a Colab example?

Apr 22 '23 05:04 piglaker

I have a GPU, but it feels like it isn't being used?

How long did it take to download the base model to Colab?

Apr 22 '23 06:04 optimus1009