Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
# ✨ Description

Fix: #149 LoRA support

* [x] Basic LoRA wrapper
* [x] Basic LoRA Config
* [x] Add LoRA support in attention
* [x] Add LoRA support in...
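The core idea of such a LoRA wrapper can be sketched in plain NumPy (a minimal illustration with hypothetical class and parameter names, not Fast-LLM's actual implementation):

```python
import numpy as np

class LoRALinear:
    """Sketch of a LoRA-wrapped linear layer (hypothetical, for illustration).

    Computes y = x @ W.T + (alpha / r) * (x @ A.T) @ B.T, where the base
    weight W stays frozen and only the low-rank factors A (r x in) and
    B (out x r) would be trained.
    """

    def __init__(self, weight: np.ndarray, r: int = 8, alpha: float = 16.0):
        out_features, in_features = weight.shape
        self.weight = weight  # frozen base weight
        self.lora_a = np.random.randn(r, in_features) * 0.01
        # B is zero-initialized, so the wrapper is a no-op at the start of training.
        self.lora_b = np.zeros((out_features, r))
        self.scaling = alpha / r

    def __call__(self, x: np.ndarray) -> np.ndarray:
        base = x @ self.weight.T
        update = (x @ self.lora_a.T) @ self.lora_b.T
        return base + self.scaling * update
```

Because `lora_b` starts at zero, the wrapped layer initially reproduces the base layer exactly, which is the standard LoRA initialization.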
# ✨ Description

Fixes: #154, #155. This PR proposes a simple way to obtain layer-dependent configuration by leveraging Fast-LLM's existing config update mechanism. It works by providing a "default" layer...
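The default-plus-override scheme described above can be sketched as a plain dict update (the function and field names here are hypothetical illustrations, not the PR's actual code):

```python
def resolve_layer_configs(default: dict, overrides: dict[int, dict], num_layers: int) -> list[dict]:
    """Build one config per layer: start from the shared "default" layer
    config and apply any per-layer override on top of it."""
    configs = []
    for layer_index in range(num_layers):
        config = dict(default)                        # copy the default layer config
        config.update(overrides.get(layer_index, {})) # layer-specific fields win
        configs.append(config)
    return configs
```

With this shape, a user only writes the fields that differ per layer; everything else is inherited from the default.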
# 🐞 Describe the Bug

The following tests fail on the main branch:

```
FAILED tests/data/test_sampling.py::test_gpt_sample[full-0] - AssertionError: 3 != 6
FAILED tests/data/test_sampling.py::test_gpt_sample[full-32] - AssertionError: 1 != 6
FAILED tests/data/test_sampling.py::test_gpt_sample[full-88] -...
```
# ✨ Description

Migrated from #248. This PR allows a dataset with prompt and completion columns (and, in general, any pair of text columns, e.g. question and answer) to be...
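A common way to turn such a column pair into a training sample is to concatenate the tokenized texts and mask the prompt tokens out of the loss; a hedged sketch with hypothetical names, not the PR's actual code:

```python
def build_sample(prompt: str, completion: str, tokenize, eos_id: int):
    """Tokenize a (prompt, completion) pair into one training sequence.

    Prompt positions get label -100 (the usual ignore index) so the loss
    only covers the completion tokens.
    """
    prompt_ids = tokenize(prompt)
    completion_ids = tokenize(completion) + [eos_id]
    input_ids = prompt_ids + completion_ids
    labels = [-100] * len(prompt_ids) + completion_ids
    return input_ids, labels
```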
# ✨ Description

This PR creates a common interface for all `GPTHuggingfaceDatasetConfig` input columns via the new `source_schema` variable. Beyond the variable `field` we require additional keys to preprocess and...
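A schema of this kind might look like the following sketch, mapping logical roles to dataset column names (the key names and helper function are hypothetical illustrations, not the PR's actual interface):

```python
# Hypothetical source schema: which dataset columns play which role.
source_schema = {
    "prompt": "question",
    "completion": "answer",
}

def extract_text(row: dict, schema: dict) -> dict:
    """Pull the configured columns out of a raw dataset row,
    keyed by their logical role rather than their column name."""
    return {role: row[column] for role, column in schema.items()}
```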
# 🎯 **Goal (What & Why)**

Support chat templates during dataset preparation to make it easier to run SFT, DPO, and other instruction-finetuning methods. This takes away from the user...
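A chat template renders a structured message list into the model's expected prompt format; Hugging Face tokenizers expose this as `tokenizer.apply_chat_template`. A minimal hand-rolled ChatML-style renderer (illustrative only; real models ship their own template with the tokenizer) shows the idea:

```python
def apply_chat_template(messages: list[dict], add_generation_prompt: bool = False) -> str:
    """Render a message list in a ChatML-like format (illustration only)."""
    text = ""
    for message in messages:
        # Each turn is wrapped in role-tagged special tokens.
        text += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        text += "<|im_start|>assistant\n"
    return text
```

Doing this once during dataset preparation means the user no longer has to hand-format conversations into raw text.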
# 🧐 Problem Description

FP8 training can significantly improve training throughput by reducing memory requirements and improving computational efficiency. However, challenges remain in integrating FP8 across all components of the...
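For intuition on why FP8 is challenging: the E4M3 format keeps only 3 explicit mantissa bits and saturates at ±448, so values must be carefully scaled into range. Its rounding behavior can be simulated in a few lines (an illustration of the number format only, not of any actual FP8 kernel; subnormals are ignored):

```python
import math

E4M3_MAX = 448.0  # largest finite value in FP8 E4M3

def quantize_e4m3(x: float) -> float:
    """Round x to a nearby FP8 E4M3 value: sign, 4 exponent bits,
    3 explicit mantissa bits. Saturates at +-448; subnormals ignored."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    m, e = math.frexp(abs(x))   # abs(x) = m * 2**e with 0.5 <= m < 1
    m = round(m * 16) / 16      # keep 4 significant bits (1 implicit + 3 explicit)
    return sign * min(math.ldexp(m, e), E4M3_MAX)
```

With so few mantissa bits, nearby values collapse onto the same representable number, which is why FP8 training relies on per-tensor scaling and mixed-precision accumulation.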
# 🎯 **Goal (What & Why)**

Add support for training [Nemotron-H models](https://research.nvidia.com/labs/adlr/nemotronh/). Nemotron-H is a family of hybrid SSM-Transformer models (8B, 47B, 56B) trained by NVIDIA in FP8 on 20T...
# 🧐 Problem Description

#104 introduced a mechanism for selecting config classes dynamically. This can be made useful elsewhere, especially for user-made plugins.

# 💡 Proposed Solution

* Generalize the...
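A registry-based pattern of this kind, where the concrete config class is selected from a `type` key at load time, can be sketched as follows (hypothetical names, not Fast-LLM's actual API):

```python
class Config:
    """Base config with a registry keyed by a 'type' name, so the
    concrete subclass can be selected dynamically from serialized data."""

    _registry: dict[str, type] = {}

    def __init_subclass__(cls, type_name: str = None, **kwargs):
        super().__init_subclass__(**kwargs)
        if type_name is not None:
            # Subclasses (including plugin-defined ones) register themselves.
            Config._registry[type_name] = cls

    @classmethod
    def from_dict(cls, data: dict) -> "Config":
        config_cls = cls._registry[data.pop("type")]
        config = config_cls()
        for key, value in data.items():
            setattr(config, key, value)
        return config

class TransformerConfig(Config, type_name="transformer"):
    num_layers: int = 12

class SSMConfig(Config, type_name="ssm"):
    state_size: int = 16
```

Because registration happens in `__init_subclass__`, a plugin only needs to define a subclass; no central list has to be edited.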
# 🎯 **Goal (What & Why)**

See discussion in #211. The config validation scheme currently makes little distinction between validation, mutation, and derivation, which can make things difficult to follow....
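One way to make that distinction explicit is to split derivation (which mutates the config) from validation proper (read-only invariant checks) into separate phases; a minimal sketch with hypothetical names, not Fast-LLM's actual scheme:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AttentionConfig:
    hidden_size: int = 1024
    num_heads: int = 16
    head_size: Optional[int] = None  # derived if left unset

    def _derive(self) -> None:
        """Derivation phase: fill fields computed from other fields (mutates)."""
        if self.head_size is None:
            self.head_size = self.hidden_size // self.num_heads

    def _check(self) -> None:
        """Validation phase: read-only invariant checks, no mutation."""
        if self.hidden_size != self.num_heads * self.head_size:
            raise ValueError("hidden_size must equal num_heads * head_size")

    def validate(self) -> None:
        self._derive()
        self._check()
```

Keeping mutation confined to one phase makes it clear at which point a config's values become final.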