Sam Blouir

6 comments by Sam Blouir

Hi, thanks for the reply. I hope you don't mind, but I found a way to use a rough FFT with Alpa and [posted code with instructions here](https://github.com/samblouir/fft_for_alpa), in case...

Similar issue here. The first few calls to some functions trigger additional compiles, which seems to cause me to OOM sooner than expected. Does yours always compile?

I apologize. I updated to the latest versions of JAX (0.4.18 -> 0.4.21) and Flax (0.7.2? -> 0.8.0), and this seems to have resolved itself. I do not see this...

Same issue here, but it doesn't reproduce every time. Restarting vllm fixes it. It is happening somewhat sporadically. Interestingly, you can keep using vllm after this crash happens. I'm on...

Edit: I spoke too soon about the workaround. There seems to be an issue with the buffer_dict when using bfloat16. There are UUIDs missing from it, even though they...

> Try following this?
>
> ```
> docker run --rm -it --gpus all --network="host" --shm-size=900gb nvcr.io/nvidia/pytorch:23.12-py3
> pip install flash-attn==2.5.1.post1
> ```

This works for me, but only with...