a3413209
Hello, I used the model you provided to measure run time. On iPhone 12 the results vary greatly, while on iPhone 13 they are basically the same.
Steps: 1. Install TVM Unity and compile it successfully. 2. Get the model weights. 3. Build the model into a library with python3 build.py --model vicuna-v1-7b --type float16 --target iphone --quantization-mode int3 --quantization-sym...
The accuracy differs between the CPU-only, CPU-and-GPU, and ALL settings; only the CPU-only result is accurate.