cold-compress
add gist model generation utils to library
Other modifications worth mentioning:
- Changed `scripts/convert_hf_checkpoint.py` to support loading finetuned Llama-3 models from a safetensors state dict (see the sketch below)
- Added finetuned configs to `model.py` (finetuned models use a vocab size one token larger than Llama 3's)
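For illustration only, a minimal sketch of what these two changes amount to, assuming sharded `*.safetensors` files and the `safetensors.torch` API; the function name and the config fields below are assumptions, not the exact code in `convert_hf_checkpoint.py` or `model.py`:

```python
# Illustrative sketch only -- not the exact code in this PR.
import glob

import torch
from safetensors.torch import load_file


def load_safetensors_state_dict(checkpoint_dir: str) -> dict[str, torch.Tensor]:
    """Merge all *.safetensors shards of a finetuned Llama-3 checkpoint into one state dict."""
    merged: dict[str, torch.Tensor] = {}
    for shard in sorted(glob.glob(f"{checkpoint_dir}/*.safetensors")):
        merged.update(load_file(shard, device="cpu"))
    return merged


# Hypothetical finetuned config entry: vocab size is one larger than base
# Llama-3 (128256 -> 128257) to make room for the extra gist token.
GIST_FINETUNED_CONFIG = dict(
    block_size=8192,
    vocab_size=128_257,  # base Llama-3 uses 128_256
    n_layer=32,
    n_head=32,
    dim=4096,
)
```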
Thank you for the review! I will try to integrate the changes into this PR soon.
We do have gist models uploaded to Hugging Face with the gist token after the instruction and input, and with it after the instruction only.
- We can train another model with gist tokens after the input as well as the instruction, and then compare these to see which one works best after evaluation (illustrative prompt layouts are sketched below)
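For concreteness, a rough sketch of two gist-token placements being discussed. The `<GIST>` marker and template strings are illustrative assumptions, not the exact format the finetuned models were trained on:

```python
# Illustrative prompt layouts only -- the actual templates used during
# finetuning may differ.
GIST = "<GIST>"


def prompt_gist_after_instruction(instruction: str, inp: str) -> str:
    # Variant 1: gist token placed right after the instruction,
    # so it compresses the instruction only.
    return f"{instruction} {GIST} {inp}"


def prompt_gist_after_instruction_and_input(instruction: str, inp: str) -> str:
    # Variant 2: gist token placed after both instruction and input,
    # so it compresses the full prompt.
    return f"{instruction} {inp} {GIST}"
```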