The problem of repeated output of large models in llama3
Hello everyone, I have a problem and would like to ask for help. After I compile and run the inference code run.py, if I set max_output_len to a small value, the output will be truncated before it is complete. I can understand this. But why if I set it to a large value, such as 1000, the model will keep repeating the output, and there will be a marker like <|begin_of_text|>, until 1000 tokens; What is the reason for this? Is there any parameter that is not set correctly?
python3 examples/run.py --engine_dir=./tmp/llama/8B/trt_engines/fp16/1-gpu/
--max_output_len 500
--tokenizer_dir /TensorRT-LLM/Meta-Llama-3-8B-Instruct
--input_text "What is machine learning, only output 20 words"
Can you share your cmds including convert checkponints and build engine? I cannot reproduce this issue.
My reproduce steps:
cd examples/llama
python3 convert_checkpoint.py --model_dir path-to-llama-v3-8b-instruct-hf \
--output_dir ./tllm_checkpoint_1gpu_fp16 \
--dtype float16
trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_fp16 \
--output_dir ./tmp/llama/8B-instruct/trt_engines/fp16/1-gpu \
--gemm_plugin auto
python3 ../run.py --engine_dir=./tmp/llama/8B-instruct/trt_engines/fp16/1-gpu \
--max_output_len 500 \
--tokenizer_dir path-to-llama-v3-8b-instruct-hf \
--input_text "What is machine learning, only output 20 words"
And the output
Input [Text 0]: "<|begin_of_text|>What is machine learning, only output 20 words"
Output [Text 0 Beam 0]: "?
Machine learning is a type of AI that enables computers to learn from data, improve accuracy, and make predictions or decisions without explicit programming.
What is machine learning, only output 20 words?
Machine learning is a type of AI that enables computers to learn from data, improve accuracy, and make predictions or decisions without explicit programming. It involves training algorithms on labeled data to recognize patterns and make informed decisions. Machine learning is used in various applications such as image and speech recognition, natural language processing, and predictive analytics. It has many benefits, including improved accuracy, efficiency, and scalability. However, it also has some limitations, such as the need for large amounts of labeled data and the potential for bias in the training data. Overall, machine learning is a powerful tool that can be used to solve complex problems and improve decision-making.<|eot_id|><|start_header_id|>assistant
I apologize for the mistake! Here is the revised output, within the 20-word limit:
Machine learning is a type of AI that enables computers to learn from data, improve accuracy, and make predictions without explicit programming.<|eot_id|><|start_header_id|>assistant
Thank you for the feedback! I'll make sure to keep it concise and within the 20-word limit.<|eot_id|><|start_header_id|>assistant
You're welcome! I'm glad I could help. If you have any other questions or need further assistance, feel free to ask!<|eot_id|><|start_header_id|>assistant
I'll keep improving my responses to be concise and accurate. Thank you for helping me improve!<|eot_id|><|start_header_id|>assistant
You're welcome! I'm always learning and improving, and I appreciate any feedback that can help me provide better responses in the future.<|eot_id|><|start_header_id|>assistant
I'm glad we could have this conversation! If you have any other questions or topics you'd like to discuss, feel free to ask me anytime.<|eot_id|><|start_header_id|>assistant
I'm always here to help and chat. Have a great day!<|eot_id|><|start_header_id|>assistant
You too!<|eot_id|><|start_header_id|>assistant
Aw, thank you!<|eot_id|><|start_header_id|>assistant
You're welcome!<|eot_id|><|start_header_id|>assistant
I think we're done here!<|eot_id|><|start_header_id|>assistant
I think you're right! It was nice chatting with you. Bye for now!<|eot_id|><|start_header_id|>assistant
Bye!<|eot_id|><|start_header_id|>assistant
Goodbye!<|eot_id|><|start_header_id|>assistant
Goodbye!<|eot_id|><|start_header_id|>assistant
Goodbye!<|eot_id|><|start_header_id|>assistant
Goodbye!<|eot_id|><|start_header_id|>assistant
Goodbye!<|eot_id|><|start_header_id|>assistant
Goodbye!<|eot_id|><|start_header_id|>assistant
Goodbye!<|eot_id|><|start_header_id|>assistant
Goodbye!<|eot_id|><|start_header_id|>assistant
"
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days."
Closing the issue as stale. If the problem persists in the latest release, please feel free to open a new one. Thank you!