Results 6 issues of Mikhail Grankin

I repackaged the dataset with zstd and uploaded it to academictorrents.com: http://academictorrents.com/details/dba20c45d4d6fa6453a4e99d2f8a4817893cfb94 It is also temporarily available via direct links: http://fma.mine.toys/fma/checksums http://fma.mine.toys/fma/fma_metadata.tar.zst http://fma.mine.toys/fma/fma_small.tar.zst http://fma.mine.toys/fma/fma_medium.tar.zst http://fma.mine.toys/fma/fma_large.tar.zst http://fma.mine.toys/fma/fma_full.tar.zst Zstd is way...

I'm doing one-epoch fine-tuning on my dataset. I observe that the training loss is consistently lower than the validation loss. Given that I use the default LoRA dropout of 0.05 and train...

Time flies swiftly in the world of ML. Sparse models have lost their popularity, and the code for them is no longer maintained. The older version of Triton isn't compatible...

During launch of `python main_denoiser.py`:
```
Traceback (most recent call last):
  File "main_denoiser.py", line 142, in <module>
    train(epoch)
  File "main_denoiser.py", line 52, in train
    for iteration, batch in enumerate(training_data_loader, 1):
  File...
```

Dear Yandex Team, I hope this message finds you well. I am writing to express my admiration for your work on the YaLM-100B model, which has demonstrated exceptional performance in...