LiuChaoXD
I am training tsn-pytorch on the Kinetics-400 dataset. However, I get 33% accuracy on the training set but only 5% on the validation set. Did you complete the experiment on Kinetics-400?
> I believe llama.cpp does not support the long RoPE that is used by the 128k variant. Yeah, I tried to convert the 128K version. `python convert.py ....` raises `NotImplementedError: Unknown rope...
Hi, I am trying to implement the method. However, the key operation (SVD) is still not supported by MLX. Do you have any ideas?
Hi, I have already implemented GaLore on Apple Silicon. It reduces memory usage. However, because the SVD cannot run on the GPU, it is slow. I am wondering...