Denis Kochetkov issues

Results: 21 issues of Denis Kochetkov

Refactoring of Evaluation and adding of evaluate command

# ✨ Description Creates **Evaluator** abstraction so additional evaluators beyond **Loss** can be added. Adds an `evaluate` command that accepts the same training config and enables evaluation on the last...

[bug] Generate test occasionally fails

# 🐞 Describe the Bug The generated tokens from Fast-LLM occasionally differ completely from the Hugging Face (HF) counterpart. HF consistently generates the same output, so the issue likely lies...

bug

need update

Sandbox for Implementation of generate and integration of lm_eval (evaluation harness)

# ✨ Description This draft PR will be split into 3 PRs ## 🔍 Type of change Select all that apply: - [ ] 🐛 **Bug fix** (non-breaking change that...

Add data cleaning in fast-llm prepare, concept

# ✨ Description part of #112 Closes # ## 🔍 Type of change Select all that apply: - [ ] 🐛 **Bug fix** (non-breaking change that addresses a specific issue)...

[bug] 16 unit tests fail on main with custom install

# 🐞 Describe the Bug The following tests fail on the main branch:
```
FAILED tests/data/test_sampling.py::test_gpt_sample[full-0] - AssertionError: 3 != 6
FAILED tests/data/test_sampling.py::test_gpt_sample[full-32] - AssertionError: 1 != 6
FAILED tests/data/test_sampling.py::test_gpt_sample[full-88] -...
```

bug

need update

Run lm-eval-harness benchmarks during validation

# 🎯 **Goal (What & Why)** Enable Fast-LLM to run structured evaluations using [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness). This allows benchmarking Fast-LLM models across many standard tasks using the in-memory model during validation, leveraging...

enhancement

Reset Number of Steps in WandB to the Latest Saved Checkpoint or Implement Distinguishable Experiment Run Logging in WandB

## 🎯 **Goal (What & Why)** Currently, if we restart an experiment (e.g., due to a job being preempted), the iteration count in WandB will be higher than the actual...

enhancement

need update

Discussion about dataset preparation speed

# 🎯 **Description** With 1,250 bin files, dataset preparation takes around 6.5 minutes on 100 cores, 1TB RAM, and 8 GPUs. However, if I simply iterate through these files, read...

enhancement

need update

Move Config Validations (e.g., Dataset Usage vs. Definitions) to `_validate` for Dry Run Checks

# 🎯 **Goal (What & Why)** Most config validations are performed in `_validate`, but not all. Some checks, such as ensuring all used datasets are defined, are currently in `data`...

enhancement

need update

Online dataset mixing based on validation metrics

# 🎯 **Goal (What & Why)** Create a Blended Dataset that can use validation metrics to rearrange the sampling probabilities of its different subsets. This will make it possible to implement on the...

enhancement

need update