Romrawin Chumpu

Results 1 comments of Romrawin Chumpu

I faced the same problem. The long inference time could be because of max_new_tokens was set to 1024. I think stopping_criteria was confusing when we're using Qwen family.