Comments of WONG JH (Results)
How did you train SVD?
The `xformer` attention implementation should behave exactly the same as the original OpenFlamingo one, while potentially reducing memory usage (or improving training/inference speed, depending on your GPU)...
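One way to sanity-check a claim like this is to compare the swapped-in attention against the naive implementation on random inputs. A minimal NumPy sketch (illustrative only, not the OpenFlamingo or xFormers code) showing that a chunked, memory-reduced attention is numerically equivalent to the naive one:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def naive_attention(q, k, v):
    # Materializes the full (n_q, n_k) score matrix at once.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def chunked_attention(q, k, v, chunk=4):
    # Processes queries in blocks, so peak memory is O(chunk * n_k)
    # instead of O(n_q * n_k); the output should match the naive version.
    out = np.empty((q.shape[0], v.shape[1]))
    for i in range(0, q.shape[0], chunk):
        block = q[i:i + chunk] @ k.T / np.sqrt(q.shape[-1])
        out[i:i + chunk] = softmax(block) @ v
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 8)) for _ in range(3))
assert np.allclose(naive_attention(q, k, v), chunked_attention(q, k, v))
```

The same comparison (on a few random batches, within float tolerance) can be run against the actual `xformer` path to confirm the "behaves exactly the same" claim before training with it.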
That is for debugging purposes. We should change it back to 21 when training.