Ryan Swope

15 comments by Ryan Swope

Right, that was my thinking too. I just don't understand why this file is causing the error (say it's 20% larger than the other files I've used). I'm sort...

Thanks, I'll try that. Why would it be running OOM during a `generate` call, though? The inputs are put on the GPU prior to that call.
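One possible explanation, offered only as a sketch: if the model is an autoregressive transformer that caches keys and values during generation (an assumption, the thread does not confirm this), memory keeps growing after the inputs are already on the GPU, because the cache gains one entry per generated token. The helper `kv_cache_bytes` and the model shape below are hypothetical and chosen purely for illustration.

```python
def kv_cache_bytes(batch_size, seq_len, num_layers, num_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate size of a transformer key/value cache in bytes."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * batch_size * seq_len * num_layers * num_kv_heads * head_dim * bytes_per_elem


# Hypothetical model shape (not taken from the thread); fp16 elements by default.
shape = dict(num_layers=32, num_kv_heads=32, head_dim=128)

# Cache size right after the prompt is processed, and after 1024 new tokens.
prompt_only = kv_cache_bytes(batch_size=8, seq_len=512, **shape)
after_generation = kv_cache_bytes(batch_size=8, seq_len=512 + 1024, **shape)

print(f"{prompt_only / 2**30:.2f} GiB -> {after_generation / 2**30:.2f} GiB")
```

Under these assumed numbers the cache triples between the end of the prompt and the end of generation, so a slightly longer input file could tip a run over the memory limit only once generation is under way.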

Just confirming that the issue persists even when building Lightning from source.

I've also changed `configure_sharded_model` to `configure_model` as referenced by @awaelchli in other issues, but that didn't change the error.

Wait, can you explain how this tests the failure in my issue?

I guess I'm confused because I don't see these ops when I run the min example I provided with `DEBUG=3`. Is there a different way I should be trying to...

This is the last kernel produced running with `DEBUG=4`:

```python
UOp(UOps.SINK, dtypes.void, arg=KernelInfo(local_dims=2, upcasted=3, dont_use_locals=False), src=(
  UOp(UOps.STORE, dtypes.void, arg=None, src=(
    UOp(UOps.DEFINE_GLOBAL, dtypes.float.ptr(), arg=0, src=()),
    UOp(UOps.VIEW, dtypes.void, arg=ShapeTracker(views=(View(shape=(2, 223, 37, 4,...
```

Yeah so with `NV=1` this is the error I see:

```python
.venv/lib/python3.12/site-packages/tinygrad/runtime/support/hcq.py", line 233, in wait
    raise RuntimeError(f"Wait timeout: {timeout} ms! (the signal is not set to {value}, but {self.value})")...
```

It's the current version, 0.9.2

A quick update here: running with tinygrad 0.10.0 returns the same error with `CUDA=1` and `NV=1` on my min example above:

```python
tinygrad/runtime/support/compiler_cuda.py", line 16, in nvrtc_check
    raise CompileError(f"Nvrtc Error...
```