Jason Krone issues

Results 7 issues of Jason Krone

Modify StreamingDataset to support passing process_group as constructor arg

Description of changes: Modify StreamingDataset to support passing process_group as a constructor argument. Currently, StreamingDataset assumes it should use the default process group; however, for certain use...

Request to Add Prompt Details for Trivia QA Evaluation to Make Scores Reproducible

A few folks, including me, have been [trying and failing](https://github.com/EleutherAI/lm-evaluation-harness/issues/1292) to reproduce the Llama Trivia QA scores using the [Eleuther Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness). Specifically, for Llama 3 8B I'm getting an...

NaN loss issues when I switch to the Transformer Engine TransformerLayer from pytorch layer

**Summary** I'm hitting a NaN loss issue when I use the TransformerLayer in place of a PyTorch transformer layer I wrote. **Details** I'm using the nvcr.io/nvidia/pytorch:24.04-py3 docker container. I train...

How to Load OpenELM Pre-training Checkpoints using Hugging Face AutoModelForCausalLM ?

Hi there, First, really admire the work on OpenELM! Thank you for making your models and code available. Question regarding the [pre-training checkpoints linked here](https://github.com/apple/corenet/blob/main/projects/openelm/README-pretraining.md#pretraining-checkpoints-model-weights-and-logs): how can we convert these...

add how to use EFA to faq.rst

Error in Streaming Dataset Decompression in Distributed Setting

update mean reduction zloss to ignore labels == ignore_index