Problem with Ascend NPU 910A

Open BruceWang1996 opened this issue 1 year ago • 2 comments

Is this project adapted only to the Ascend NPU 910B chip? I am trying to run FastChat vicuna-7b-v1.5 on an Ascend NPU 910A chip, but inference is extremely slow, about 5 minutes per answer.

Apr 22 '24 05:04 BruceWang1996

+1, same problem.

Apr 30 '24 02:04 ImmNaruto

Add the following code and it works well. Good luck.

1. Add the imports

import torch_npu 
from torch_npu.contrib import transfer_to_npu

2. Set the JIT compile mode in the main block

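if __name__ == "__main__":
    # Requires `import os` and `import torch` at the top of the file.
    # JIT compilation stays disabled unless JIT_COMPILE is set to "true" or "1".
    use_jit_compile = os.getenv('JIT_COMPILE', 'False').lower() in ['true', '1']
    torch.npu.set_compile_mode(jit_compile=use_jit_compile)
    main(args)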
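
The environment-variable check in step 2 can be tried on its own, without NPU hardware or torch_npu installed. The sketch below isolates that check in a helper. The name `jit_compile_enabled` and its `env` parameter are hypothetical, introduced only for illustration, and are not part of FastChat or torch_npu.

```python
import os


def jit_compile_enabled(env=None):
    """Return True only when JIT_COMPILE is 'true' or '1' (case-insensitive).

    `env` is a hypothetical parameter for testing; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    return env.get("JIT_COMPILE", "False").lower() in ["true", "1"]


# Unset variable: JIT compilation stays disabled.
print(jit_compile_enabled({}))                       # False
# Accepted values are matched case-insensitively.
print(jit_compile_enabled({"JIT_COMPILE": "TRUE"}))  # True
# Any other value is treated as disabled.
print(jit_compile_enabled({"JIT_COMPILE": "yes"}))   # False
```

The result is what step 2 passes as `jit_compile` to `torch.npu.set_compile_mode`, so leaving JIT_COMPILE unset keeps JIT compilation off.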

May 07 '24 06:05 zhou-wjjw