Mikhail Grankin issues

Results: 6 issues of Mikhail Grankin

Torrent download and Zstd compression

I repackaged the dataset with zstd and uploaded it to academictorrents.com: http://academictorrents.com/details/dba20c45d4d6fa6453a4e99d2f8a4817893cfb94 It is also temporarily available via direct links: http://fma.mine.toys/fma/checksums http://fma.mine.toys/fma/fma_metadata.tar.zst http://fma.mine.toys/fma/fma_small.tar.zst http://fma.mine.toys/fma/fma_medium.tar.zst http://fma.mine.toys/fma/fma_large.tar.zst http://fma.mine.toys/fma/fma_full.tar.zst Zstd is way...
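Since the post lists a `checksums` file next to the archives, a downloaded archive could be checked against it before unpacking. The following is only a minimal sketch: the issue does not say which hash algorithm or file layout `checksums` uses, so the `sha256` default and the `<hex digest>  <file name>` line format are assumptions, and `file_digest` and `parse_checksums` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks.

    The algorithm is an assumption; pass whichever one the checksums file uses.
    """
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksums(text):
    """Parse lines of the assumed form '<hex digest>  <file name>' into a dict."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            # Some tools prefix the name with '*' to mark binary mode.
            name = parts[1].strip().lstrip("*")
            expected[name] = parts[0].lower()
    return expected
```

A caller would compare `file_digest("fma_small.tar.zst")` with the matching entry returned by `parse_checksums` and unpack the archive only if the two values agree.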

fix for bigger models (774M)

training/validation loss difference

I'm doing one-epoch fine-tuning on my dataset. I observe that the training loss is consistently lower than the validation loss. Given that I use the default LoRA dropout of 0.05 and train...

The XL Model and the latest DeepSpeed

Time flies swiftly in the world of ML. Sparse models have lost their popularity, and the code for them is no longer maintained. The older version of Triton isn't compatible...

list index out of range

During launch of `python main_denoiser.py`:
```
Traceback (most recent call last):
  File "main_denoiser.py", line 142, in <module>
    train(epoch)
  File "main_denoiser.py", line 52, in train
    for iteration, batch in enumerate(training_data_loader, 1):
  File...
```

Request to Open "Russian Pile" Dataset for Public Access

Dear Yandex Team, I hope this message finds you well. I am writing to express my admiration for your work on the YaLM-100B model, which has demonstrated exceptional performance in...