Romrawin Chumpu
I faced the same problem. The long inference time could be because max_new_tokens was set to 1024. I think stopping_criteria is confusing when using the Qwen family of models.
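To illustrate why the two settings interact, here is a toy decoding loop in plain Python. It is only a sketch, not the transformers API: `toy_generate`, `fake_model`, and the token ids 7 and 99 are made up for the example. The point is that if the stop condition never matches what the model actually emits, the loop runs all the way to max_new_tokens, so the cap sets the worst-case latency.

```python
from typing import Callable, FrozenSet, List


def toy_generate(next_token: Callable[[List[int]], int],
                 prompt: List[int],
                 max_new_tokens: int,
                 stop_ids: FrozenSet[int] = frozenset()) -> List[int]:
    """Toy decoding loop: one (hypothetical) model call per new token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        tokens.append(tok)
        if tok in stop_ids:  # stop condition matched: end early
            break
    return tokens[len(prompt):]


PROMPT = [1, 2]


def fake_model(tokens: List[int]) -> int:
    """Hypothetical model: emits 7 five times, then 99 once, then 7 again."""
    step = len(tokens) - len(PROMPT)
    return 99 if step == 5 else 7


# Stop condition matches the token the model really emits: 6 model calls.
with_stop = toy_generate(fake_model, PROMPT, 1024, frozenset({99}))

# Stop condition checks for a token the model never emits: all 1024 calls run.
wrong_stop = toy_generate(fake_model, PROMPT, 1024, frozenset({42}))

print(len(with_stop), len(wrong_stop))
```

If that is what is happening in the real setup, it may be worth checking that the stop tokens passed to generation match the end-of-turn token the model's chat template actually uses, or lowering max_new_tokens as a safety cap.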