Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
# ✨ Description Creates **Evaluator** abstraction so additional evaluators beyond **Loss** can be added. Adds an `evaluate` command that accepts the same training config and enables evaluation on the last...
# ✨ Description This PR provides a converter for Diffusion models based on Llama (and Dream). It complements the mask-diffusion training PR #238 and must be merged after it. ## 🔍...
# 🐞 **Describe the Bug** When converting models, config options that are not part of the architecture config are not imported from the Hugging Face model's `config.json`. This creates an unexpected and...
# ✨ Description This PR makes minor improvements to the SSM/Hybrid classes, adds support for loading and exporting Apriel SSM and hybrid SSM models (with the corresponding modeling.py classes), adds `embeddings_lr_scale`...
# ✨ Description This draft PR addresses #242 by introducing a flexible, modular configuration system for hybrid model architectures. TODOs: - [ ] add more testing to make sure legacy...
# ✨ Description A simplified version of #273, where resources are allocated statically for each worker. This works fine, with some big caveats: * Multi-GPU tests and spawned processes run...
# ✨ Description Adjust the rotary embeddings, PEFT and normalization layers to use the new dynamic classes. Do some cleanup and refactoring for rotary embeddings. Add an option to disable...
# ✨ Description Allows running tests in parallel and using all the available GPUs so we can run lots of tests fast. Pytest-xdist is already relatively good, but puts everything...
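
The idea of statically allocating GPUs to parallel test workers, mentioned in the two descriptions above, can be sketched roughly as follows. This is an illustration only, not Fast-LLM's actual implementation: the helper name `gpus_for_worker` and the even-split policy are assumptions. The only pieces taken from pytest-xdist are the `PYTEST_XDIST_WORKER` and `PYTEST_XDIST_WORKER_COUNT` environment variables it sets in each worker process.

```python
import os


def gpus_for_worker(worker_id: str, num_workers: int, num_gpus: int) -> list[int]:
    """Return the GPU indices statically reserved for one test worker.

    Hypothetical helper: splits the GPUs into equal contiguous slices,
    one slice per worker. Leftover GPUs (if any) stay unused.
    """
    # pytest-xdist names workers "gw0", "gw1", ...; anything else is treated
    # as a single, non-parallel run and mapped to index 0.
    index = int(worker_id[2:]) if worker_id.startswith("gw") else 0
    per_worker = num_gpus // max(num_workers, 1)
    start = index * per_worker
    return list(range(start, start + per_worker))


if __name__ == "__main__":
    # These variables are set by pytest-xdist inside each worker process.
    worker_id = os.environ.get("PYTEST_XDIST_WORKER", "master")
    num_workers = int(os.environ.get("PYTEST_XDIST_WORKER_COUNT", "1"))
    # num_gpus=8 is an assumed machine size for the example.
    print(gpus_for_worker(worker_id, num_workers, num_gpus=8))
```

With more workers than GPUs this policy hands each worker an empty list, which is one reason a static split needs the worker count to be chosen with the hardware in mind.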