Ryan Swope comments

Results: 15 comments of Ryan Swope

Cuda OOM during generate() call on 4 GPUs

Right, that was my thinking too. I just don't understand why this file is causing the error (say it's 20% larger than the other files I've used). I'm sort...

Cuda OOM during generate() call on 4 GPUs

Thanks, I'll try that. Why would it run out of memory during a `generate()` call, though? The inputs are moved to the GPU before that call.
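One possible contributor, offered as an assumption rather than something this thread confirms: during autoregressive generation the key/value cache grows with every generated token, so memory use can climb well after the inputs are already on the GPU. A rough back-of-the-envelope sketch follows; the function name `kv_cache_bytes` and the model dimensions are hypothetical and chosen only for illustration.

```python
def kv_cache_bytes(batch_size, seq_len, num_layers, num_kv_heads, head_dim,
                   bytes_per_elem=2, num_beams=1):
    """Estimate the key/value cache size in bytes for a decoder-style transformer."""
    # Each layer stores two tensors (keys and values) per token.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
    # Beam search keeps one cache per beam, so it multiplies the effective batch.
    return batch_size * num_beams * seq_len * per_token


# Hypothetical model shape (not taken from the issue), fp16 elements.
cfg = dict(num_layers=32, num_kv_heads=32, head_dim=128)

short = kv_cache_bytes(batch_size=8, seq_len=1024, **cfg)
long = kv_cache_bytes(batch_size=8, seq_len=2048, **cfg)

print(f"{short / 2**30:.1f} GiB -> {long / 2**30:.1f} GiB")
```

Under these assumed dimensions the cache scales linearly with total sequence length (prompt plus generated tokens), so a longer input or a larger `max_new_tokens` can push a run that previously fit over the limit.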

CUDA OOM with DeepSpeed ZeRO Stage 3 Offload

Just confirming that the issue persists even when building Lightning from source.

CUDA OOM with DeepSpeed ZeRO Stage 3 Offload

I've also changed `configure_sharded_model` to `configure_model` as referenced by @awaelchli in other issues, but that didn't change the error.

add lin failure 58

Wait, can you explain how this tests the failure in my issue?

add lin failure 58

I guess I'm confused because I don't see these ops when I run the minimal example I provided with `DEBUG=3`. Is there a different way I should be trying to...

HCQ Read Timeout on NVIDIA, No error on Metal

This is the last kernel produced when running with `DEBUG=4`:

```python
UOp(UOps.SINK, dtypes.void, arg=KernelInfo(local_dims=2, upcasted=3, dont_use_locals=False), src=(
  UOp(UOps.STORE, dtypes.void, arg=None, src=(
    UOp(UOps.DEFINE_GLOBAL, dtypes.float.ptr(), arg=0, src=()),
    UOp(UOps.VIEW, dtypes.void, arg=ShapeTracker(views=(View(shape=(2, 223, 37, 4,...
```