cold-compress
add gist model generation utils to library
Other modifications worth mentioning:
- Changed `scripts/convert_hf_checkpoint.py` to support loading finetuned Llama-3 models from a safetensors state dict (see the sketch below)
- Added finetuned configs to `model.py` (finetuned models use a vocab size one token larger than Llama 3's)
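For illustration only, a minimal sketch of what these two changes amount to, assuming sharded `*.safetensors` files and the `safetensors.torch` API; the function name and the config fields below are assumptions, not the exact code in `convert_hf_checkpoint.py` or `model.py`:

```python
# Illustrative sketch only -- not the exact code in this PR.
import glob

import torch
from safetensors.torch import load_file


def load_safetensors_state_dict(checkpoint_dir: str) -> dict[str, torch.Tensor]:
    """Merge all *.safetensors shards of a finetuned Llama-3 checkpoint into one state dict."""
    merged: dict[str, torch.Tensor] = {}
    for shard in sorted(glob.glob(f"{checkpoint_dir}/*.safetensors")):
        merged.update(load_file(shard, device="cpu"))
    return merged


# Hypothetical finetuned config entry: vocab size is one larger than base
# Llama-3 (128256 -> 128257) to make room for the extra gist token.
GIST_FINETUNED_CONFIG = dict(
    block_size=8192,
    vocab_size=128_257,  # base Llama-3 uses 128_256
    n_layer=32,
    n_head=32,
    dim=4096,
)
```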
Thank you for the review! I will try to integrate the changes into this PR soon.
We do have gist models uploaded to Hugging Face with the gist token after the instruction and input, and with it after the instruction only.
- We can train another model with gist tokens after the input as well as the instruction, and then compare these to see which one works best after evaluation (illustrative prompt layouts are sketched below)
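For concreteness, a rough sketch of two gist-token placements being discussed. The `<GIST>` marker and template strings are illustrative assumptions, not the exact format the finetuned models were trained on:

```python
# Illustrative prompt layouts only -- the actual templates used during
# finetuning may differ.
GIST = "<GIST>"


def prompt_gist_after_instruction(instruction: str, inp: str) -> str:
    # Variant 1: gist token placed right after the instruction,
    # so it compresses the instruction only.
    return f"{instruction} {GIST} {inp}"


def prompt_gist_after_instruction_and_input(instruction: str, inp: str) -> str:
    # Variant 2: gist token placed after both instruction and input,
    # so it compresses the full prompt.
    return f"{instruction} {inp} {GIST}"
```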