xFasterTransformer

Performance issue for opt-1.3b with BS=1, BF16

Open. bin1guo opened this issue 1 year ago • 1 comment

I tested the OPT-1.3B model on an EMR platform with 52 cores. The performance at BS=1 does not look right: the gap between BS=1 and BS=2 is too large.

numactl -C 0-51 -m 0 ./run_benchmark.sh -m opt-1_3b -d bf16 -s 1 -bs 1 -in 128 -out 15 -i 10

Results for BS=1: [image]

Results for BS=2: [image]
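
A minimal sketch for reproducing the comparison: it reruns the command above with only `-bs` changed, so both batch sizes use the same core binding and settings. `BENCH`, `DRY_RUN`, `build_cmd`, and the log file names are hypothetical helpers for this sketch, not part of `run_benchmark.sh`. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: run the same benchmark for BS=1 and BS=2 back to back.
# Only flags that appear in the command above are used.
# BENCH and DRY_RUN are hypothetical variables introduced for this sketch.
BENCH="${BENCH:-./run_benchmark.sh}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually run the benchmark

build_cmd() {
  # $1 = batch size; everything else matches the reported command
  echo "numactl -C 0-51 -m 0 ${BENCH} -m opt-1_3b -d bf16 -s 1 -bs $1 -in 128 -out 15 -i 10"
}

for bs in 1 2; do
  cmd="$(build_cmd "$bs")"
  echo "$cmd"
  if [ "$DRY_RUN" != "1" ]; then
    # Keep one log per batch size so the two runs can be compared afterwards
    $cmd 2>&1 | tee "opt-1_3b_bf16_bs${bs}.log"
  fi
done
```

Running it with `DRY_RUN=0` from the benchmark directory executes both configurations and writes one log per batch size.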

bin1guo, Apr 23 '24 09:04

@bin1guo Do we still need to benchmark the OPT model? I suggest running the Llama model instead.

pujiang2018, May 23 '24 02:05