Brian Williams issues


Results: 2 issues by Brian Williams

Support ScalingRotaryEmbedding

Added a scaling_factor to the rotary embedding calculation. This is for use with models like [DeepSeek](https://github.com/deepseek-ai/). DeepSeek uses LlamaLinearScalingRotaryEmbedding. The only difference is that the freqs in precompute_freqs_cis are divided...

CLA Signed
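
The summary above is cut off before it says what the frequencies are divided by, so the following is only a minimal sketch of one plausible reading: linear scaling in the style of Hugging Face's LlamaLinearScalingRotaryEmbedding, where each position index is divided by `scaling_factor` before the rotation angle is computed. The signature is hypothetical, and plain Python lists stand in for the tensors a real `precompute_freqs_cis` would return.

```python
import math


def precompute_freqs_cis(seq_len, n_elem, base=10000.0, scaling_factor=1.0):
    """Return a seq_len x (n_elem // 2) table of (cos, sin) pairs.

    Assumption: linear scaling, i.e. every position index is divided by
    scaling_factor. With scaling_factor=1.0 this is ordinary RoPE.
    """
    # One inverse frequency per pair of embedding dimensions.
    inv_freqs = [1.0 / (base ** (i / n_elem)) for i in range(0, n_elem, 2)]

    table = []
    for pos in range(seq_len):
        # Dividing the position is equivalent to dividing the
        # outer product of positions and inverse frequencies.
        scaled_pos = pos / scaling_factor
        table.append(
            [(math.cos(scaled_pos * f), math.sin(scaled_pos * f)) for f in inv_freqs]
        )
    return table


plain = precompute_freqs_cis(seq_len=8, n_elem=4)
scaled = precompute_freqs_cis(seq_len=8, n_elem=4, scaling_factor=4.0)
# Position 4 under a factor of 4 gets the same rotation as unscaled position 1.
print(scaled[4] == plain[1])
```

Under this assumption, a scaling factor of 4 stretches the usable position range by 4x, because position 4k is rotated exactly as position k was in the unscaled model.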

Allow small models to work with convert_hf_checkpoint. Added TinyLlama to the model list

Small models on Hugging Face don't have pytorch_model.bin.index.json files, since they are unnecessary. I changed convert_hf_checkpoint.py to accept a single pytorch_model.bin file as the model description. I added PY007/TinyLlama-1.1B-intermediate-step-480k-1T to...

CLA Signed
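
The fallback described above can be sketched roughly as follows. This is not the actual change to convert_hf_checkpoint.py: the helper name `find_weight_files` is hypothetical, and the only assumption about the file format is that a sharded Hugging Face checkpoint's index JSON has a `weight_map` object mapping parameter names to shard file names.

```python
import json
from pathlib import Path


def find_weight_files(checkpoint_dir):
    """Return the weight files to load from a Hugging Face checkpoint folder.

    Hypothetical helper: prefers the shard index when present and falls
    back to a single pytorch_model.bin for small, unsharded models.
    """
    checkpoint_dir = Path(checkpoint_dir)
    index_path = checkpoint_dir / "pytorch_model.bin.index.json"

    if index_path.is_file():
        with open(index_path) as f:
            index = json.load(f)
        # Many parameters share a shard, so deduplicate the file names.
        shard_names = sorted(set(index["weight_map"].values()))
        return [checkpoint_dir / name for name in shard_names]

    single_file = checkpoint_dir / "pytorch_model.bin"
    if single_file.is_file():
        return [single_file]

    raise FileNotFoundError(
        f"No pytorch_model.bin.index.json or pytorch_model.bin in {checkpoint_dir}"
    )
```

Checking for the index first keeps the behavior for large sharded models unchanged, while small models take the single-file path.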