Todd Mostak

Results: 13 issues by Todd Mostak

# Merge Checklist
## :wrench: Issue(s) fixed:
- [ ] Author referenced issue(s) fixed by this PR:
- [ ] Fixes #0
## :smoking: Smoke Test
- [ ] Works...

### 🚀 Feature
Support for specification of a minimum learning rate.
### Motivation
Often in the research literature, minimum learning rates are set when fine-tuning a model using a cosine...

type/feature
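The requested schedule can be sketched in plain Python: a cosine decay that bottoms out at a configurable floor instead of zero. The function name and parameters below are illustrative, not part of any existing API; for reference, PyTorch's `CosineAnnealingLR` exposes the same floor as its `eta_min` argument.

```python
import math

def cosine_lr_with_floor(step: int, total_steps: int,
                         max_lr: float, min_lr: float) -> float:
    """Cosine-annealed learning rate that decays from max_lr down to
    min_lr rather than all the way to zero."""
    progress = step / total_steps                 # fraction of training done, in [0, 1]
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes 1 -> 0 over training
    return min_lr + (max_lr - min_lr) * cosine

# Starts at max_lr, ends at the configured floor instead of zero.
print(cosine_lr_with_floor(0, 100, 1e-4, 1e-5))
print(cosine_lr_with_floor(100, 100, 1e-4, 1e-5))
```

At step 0 the schedule yields the peak rate; at the final step it returns the floor, which is the behavior the issue asks for.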

### 🚀 Feature
Add the capability to the UI to kick off a grid search over a set of hyperparameters (with specified search increments for continuous parameters, and specified attributes for...

type/feature
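A minimal sketch of what such a grid search could expand to, in plain Python. The parameter names and search space below are illustrative assumptions, not taken from the actual UI:

```python
from itertools import product

def grid(search_space: dict) -> list:
    """Expand {param: [values]} into one config dict per combination."""
    names = list(search_space)
    return [dict(zip(names, combo)) for combo in product(*search_space.values())]

search_space = {
    "learning_rate": [1e-5, 3e-5, 1e-4],  # continuous param, sampled at fixed increments
    "lora_rank": [8, 16],                 # discrete attribute with listed candidates
}

configs = grid(search_space)
print(len(configs))  # 6 combinations (3 learning rates x 2 ranks)
```

Each resulting config dict would then be handed to a training run; the UI feature amounts to building `search_space` from user input and scheduling one run per combination.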

As a user, I should only pay a performance hit for charts I am actively looking at. In particular, the hashtag query is currently expensive as the Twitter dataset gets...

enhancement

Per the [recent paper from Meta](https://arxiv.org/abs/2404.19737), it appears that models that predict multiple future tokens can exhibit significantly greater sample efficiency than models trained only on next-token prediction, plus the...

### 🐛 Bug
A native bfloat16 model fine-tuned in bfloat16 gets pushed to HuggingFace as float16.
### To Reproduce
1. Choose a HF model like [Llama-3](https://huggingface.co/meta-llama/Meta-Llama-3-8B) with weights natively in bfloat16...

type/bug
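The fix presumably amounts to preserving the source dtype on upload rather than defaulting to float16. A minimal sketch of that selection logic follows; the helper and its parameters are hypothetical, not the project's actual API (in Transformers, the dtype is typically carried via the model config's `torch_dtype` field):

```python
from typing import Optional

def resolve_push_dtype(source_dtype: str, requested: Optional[str] = None) -> str:
    """Pick the dtype for an uploaded checkpoint: honor an explicit request,
    otherwise preserve the dtype the base model ships with, instead of
    silently defaulting to float16."""
    return requested if requested is not None else source_dtype

# A natively bfloat16 model fine-tuned in bfloat16 should round-trip as
# bfloat16, and be converted only when the user explicitly asks for it.
print(resolve_push_dtype("bfloat16"))             # bfloat16
print(resolve_push_dtype("bfloat16", "float16"))  # float16
```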

### 🚀 Feature
Allow epoch to be optionally used as the x-axis of training/eval charts, for easier comparison between runs with different numbers of training pairs.
### Motivation
I'm often...

type/feature
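Plotting by epoch only requires the step-to-epoch conversion sketched below; `batch_size` and `num_train_pairs` are assumed to be known per run:

```python
def steps_to_epochs(step: int, batch_size: int, num_train_pairs: int) -> float:
    """Convert a global step count into (fractional) epochs, so runs trained
    on different dataset sizes share a comparable x-axis."""
    return step * batch_size / num_train_pairs

# Two runs at the same step can sit at very different points in their data:
print(steps_to_epochs(100, 32, 3200))   # 1.0  (one full pass)
print(steps_to_epochs(100, 32, 12800))  # 0.25 (a quarter pass)
```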

Thank you for all your work on this project; it's really great to have a fully OSS Llama backbone. I was excited to see the V2 version of the models...

### 🐛 Bug
Today, when attempting to upload a LoRA-trained Llama 3.1 70B model (the first time I've trained Llama 3.1), I hit the following during the eLoRA merge. Note I...

type/bug

### 🐛 Bug
When uploading a model to HuggingFace using the `cpu_shard` setting (and, I believe, with any available GPUs), allocations are left resident in GPU memory after the upload. This...

type/bug