
Guidance on how to generate something useful

Open alanderex opened this issue 1 year ago • 1 comment

Thank you for sharing the model and how to install everything. Worked flawlessly! 🚀

I was wondering if you could provide more guidance on prompting. Prompts that work well with other popular models like Claude 3 or ChatGPT don't seem to work well here, at least for me.

Here are some examples I tried:

# for readability alias prmpt='python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -temp 0 -p '

prmpt "Write an essay about the new iPhone" -n 300

Write an essay about the new iPhone 6s. The new iPhone 6s is a new phone that is a lot better than the old iPhone 6. The new iPhone 6s is a lot better than the old iPhone 6 because it has a lot more features than the old iPhone 6. The new iPhone 6s has a lot more features than the old iPhone 6 because it has a lot more features than the old iPhone 6. The new iPhone 6s has a lot more features than the old iPhone 6 because it has a lot more features than the old iPhone 6. The new iPhone 6s has a lot more features than the old iPhone 6 because it has a lot more features than the old iPhone 6. The new iPhone 6s has a lot more features than the old iPhone 6 because it has a lot more features than the old iPhone 6. The new iPhone 6s has a lot more features than the old iPhone 6 because it has a lot more features than the old iPhone 6. The new iPhone 6s has a lot more features than the old iPhone 6 because it has a lot more features than the old iPhone 6. The new iPhone 6s has a lot more features than the old iPhone 6 because it has a lot more features than the old iPhone 6. The new iPhone 6s has a lot more features than the old iPhone 6 because it has a lot more features than the old

prmpt "Please write this politely: You stupid idiot, we have agreed to meet at 4PM."

Please write this politely: You stupid idiot, we have agreed to meet at 4PM. You are a stupid idiot, we have agreed to meet at 4PM. You are a stupid idiot, we have agreed to meet at 4PM. You are a stupid idiot, we have agreed to meet at 4PM. You are a stupid idiot, we have agreed to meet at 4PM. You are a stupid idiot, we have agreed to meet at 4PM. You are a stupid idiot, we have agreed to meet at 4PM. You are a stupid idiot, we have agreed to meet at 4PM. You are a stupid idiot, we have agreed to meet at 4PM.

prmpt "Predict the end of this sentence: 'How are you doing? Fine,'"

Predict the end of this sentence: 'How are you doing? Fine,' or 'Not so good,' or 'I'm doing great!'" - "I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm doing great! I'm

Thank you in advance for any pointers.

alanderex avatar Oct 18 '24 14:10 alanderex

It helps a bit if you increase the temperature parameter:

 python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "User: Explain to me how I can implement AWS Bedrock in my infrastructure using Terraform\nAnswer:" -n 200 -temp 0.77 -t 16
User: Explain to me how I can implement AWS Bedrock in my infrastructure using Terraform
Answer: Implementing the Bedrock in my infrastructure using Terraform
The following example is a Terraform implementation for AWS Bedrock.
- First, I will create a terraform plan to design my terraform
- Once I have the plan, I will use it to create the terraform
- Now, I will validate the terraform and make sure it has the correct inputs
- After validation, I will create a terraform and start the terraform
- And finally, I will see the output of the terraform
- To know more about Terraform, you can visit the Terraform website
- To know more about Terraform, you can visit the Terraform website
- To know more about Terraform, you can visit the Terraform website
- To know more about Terraform, you can visit the Terraform website
- To know more about Terraform, you can visit the Terraform website
- To know more about Terraform, you can visit the Terraform
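Why temperature matters here: with `-temp 0` the sampler always picks the single most likely token (greedy decoding), so once the model's distribution favors a repeated phrase it can never escape the loop; any temperature above zero samples from the distribution and gives lower-probability tokens a chance to break the cycle. A self-contained toy sketch of the difference (the logit values are made up for illustration, not taken from the model):

```python
# Toy illustration of greedy decoding (temperature 0) vs. temperature
# sampling over a fixed logit vector. The logits are invented for the
# demo; real decoding applies this per generated token.
import math
import random

def softmax(logits, temperature):
    """Convert temperature-scaled logits to probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample(logits, temperature, rng):
    """Pick a token index; temperature 0 means deterministic argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

logits = [2.0, 1.5, 0.5]                 # token 0 is slightly preferred
rng = random.Random(0)
greedy = {sample(logits, 0, rng) for _ in range(20)}
sampled = {sample(logits, 0.77, rng) for _ in range(20)}
# greedy only ever returns token 0; sampling at 0.77 also visits others
```

This is why bumping `-temp` (as in the command above) reduces, though does not eliminate, the repetition.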

I just read about the fine-tuning process (https://huggingface.co/blog/1_58_llm_extreme_quantization), and this fine-tuned version has increased perplexity compared to the original Llama 3 (i.e., worse output quality).

I don't think you'll get results comparable to Claude or ChatGPT; those are much bigger models :)

paolorechia avatar Oct 18 '24 19:10 paolorechia

Please feel free to try with the new model on HF, https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf

sd983527 avatar Apr 17 '25 07:04 sd983527

Hi @sd983527, regarding https://huggingface.co/microsoft/bitnet-b1.58-2B-4T and https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf: I have a dataset (with input_text and a class label, 0 or 1) and want to fine-tune the BitNet model as a binary text classifier. Is that possible, and is there code available to do this?

Arnold1 avatar Apr 17 '25 18:04 Arnold1
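For anyone landing here with the same question: the standard Hugging Face pattern for this is to load the checkpoint with a sequence-classification head and fine-tune on (text, label) pairs. A minimal, untested sketch follows; whether the BitNet checkpoint actually loads under `AutoModelForSequenceClassification` (and whether it needs `trust_remote_code=True`) is an assumption, not something confirmed in this thread.

```python
# Hedged sketch: fine-tune a causal-LM checkpoint as a binary text
# classifier with Hugging Face transformers. The model name comes from
# the thread; compatibility with the sequence-classification head is
# an UNTESTED assumption for this particular checkpoint.

LABELS = {"negative": 0, "positive": 1}

def encode_label(name: str) -> int:
    """Map a class name to the 0/1 integer label the loss expects."""
    return LABELS[name]

def finetune(texts, labels,
             model_name="microsoft/bitnet-b1.58-2B-4T",
             epochs=3, lr=2e-5):
    """Minimal fine-tuning loop (defined but not run here, since it
    downloads a ~2B-parameter checkpoint)."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    if tokenizer.pad_token is None:
        # Llama-style tokenizers often ship without a pad token.
        tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=2,               # binary classification head
        trust_remote_code=True,     # assumption: custom architecture
    )
    model.config.pad_token_id = tokenizer.pad_token_id

    enc = tokenizer(list(texts), truncation=True, padding=True,
                    return_tensors="pt")
    target = torch.tensor([encode_label(l) for l in labels])

    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        out = model(**enc, labels=target)   # cross-entropy over 2 classes
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    return model
```

In practice you would batch the dataset with a `DataLoader` (or use `Trainer`) rather than one tensor, and a 2B model will likely need parameter-efficient fine-tuning (e.g. LoRA) or a GPU with plenty of memory; this only shows the overall shape.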