Shreyas S K
Hi @helpmefindaname, I just exported the TARS NER model to ONNX, and I was later able to quantise the model as well, which brought down the artifact size...
Thanks for that! I was able to achieve a 14x speedup compared to CPU. It now takes 25 ms/sentence (GPU+ONNX) versus 360 ms/sentence (CPU). I also tried to run inference on...
Hello authors, please provide an alternative way to download the YFCC100M data.
Okay, thanks. I trained this model following the instructions in the README. I trained for 20 epochs under XE loss, with test-set metrics {'Bleu_1': 0.5621, 'Bleu_2': 0.359,...
Hi @rasbt, I tried to fine-tune DistilBERT base on the IMDB dataset using LoRA and DoRA. Using various checkpoints, I plotted the magnitude and directional differences. For LoRA it seems...
Hi, thanks for providing the updated code. dora_output doesn't seem to capture dora_modification. Am I missing something here?
Sure. I'll try to incorporate these changes and see if it comes out as expected.
Here is the updated visualization for DoRA. Now it looks as expected. One more observation, though: for any given layer, all checkpoints have very similar magnitude and directional differences (all checkpoints...
Yes, it's for the updated code. I'll run a few more experiments with norm at dim=0 as well. Let me know if there's any update from your end. Will be...
@rasbt I was able to reproduce the magnitude and directional update visualizations for DoRA and LoRA using Llama 2 7B, and I recently wrote a blog post on that. Link attached below....
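For readers following the DoRA/LoRA comments above, here is a minimal sketch of how a magnitude difference and a directional difference between a pretrained and a fine-tuned weight matrix could be computed. It is not the code from the thread. It assumes the usual definitions: magnitude is the per-column L2 norm (the norm over dim=0 when the matrix is stored as rows), and direction is compared with one minus cosine similarity per column, both averaged over columns. The function names `column_norms` and `magnitude_and_direction_diff` are hypothetical, and the matrices are toy values.

```python
import math


def column_norms(w):
    """L2 norm of each column of a matrix given as a list of rows (norm over dim=0)."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]


def magnitude_and_direction_diff(w0, wt):
    """Return (delta_m, delta_d) between pretrained w0 and fine-tuned wt.

    Assumes both matrices have the same shape and no all-zero columns.
    """
    m0, mt = column_norms(w0), column_norms(wt)
    k = len(m0)
    # Magnitude difference: mean absolute change in column norm.
    delta_m = sum(abs(a - b) for a, b in zip(mt, m0)) / k
    # Directional difference: mean of (1 - cosine similarity) per column.
    delta_d = 0.0
    for j in range(k):
        dot = sum(r0[j] * rt[j] for r0, rt in zip(w0, wt))
        delta_d += 1.0 - dot / (m0[j] * mt[j])
    return delta_m, delta_d / k


# Toy example: the first column is scaled, no column changes direction.
w0 = [[1.0, 0.0], [0.0, 1.0]]
wt = [[2.0, 0.0], [0.0, 1.0]]
delta_m, delta_d = magnitude_and_direction_diff(w0, wt)
```

In this toy case only the magnitude changes, so the directional difference is zero. Computing the pair per layer and per checkpoint yields the points for a magnitude-versus-direction scatter plot.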