Sarthak J. Shetty
> **Describe the bug** Using this loss method with the Trainer from the transformers library (PyTorch) and YOLOv8 (PyTorch) causes training to crash shortly after it starts due to CUDA running out of memory....
> Thank you for bringing up this issue. I took the time to delve deeper into the situation using your provided sample code. From my findings, there doesn't appear to...
> Hi @SarthakJShetty-path, I used the same code you shared; the graph looks like:  > > What are your PyTorch and MONAI versions? Interesting! Here are the versions: ```bash (venv)...
> I've conducted tests using a new 1.3.0 image and unfortunately, I've been unable to reproduce your reported issue. Could I kindly recommend attempting the same process in a fresh...
> @SarthakJShetty-path any updates? Sorry about the delay with this. I haven't gotten around to pulling the Docker image and trying it yet, but several members of our team are reporting this...
> @SarthakJShetty-path I did switch to #4205 ShapeLoss, but I'm not sure it gives the expected results. Can you try running this piece of code and posting the results? ```python...
@KumoLiu It looks like @dimka11 is hitting the same error. The GPU memory seems to increase monotonically even on Google Colab, with values very similar to what I posted above.
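For context, the growth was observed with a loop along these lines. This is only a sketch of the measurement, not the exact script from the issue; the choice of `HausdorffDTLoss` and the tensor shapes are placeholders:

```python
import torch
from monai.losses import HausdorffDTLoss  # placeholder: substitute the loss under discussion

device = torch.device("cuda")
loss_fn = HausdorffDTLoss(sigmoid=True)

# Dummy 3D segmentation batch; shapes are illustrative only.
pred = torch.rand(2, 1, 64, 64, 64, device=device, requires_grad=True)
target = (torch.rand(2, 1, 64, 64, 64, device=device) > 0.5).float()

for step in range(100):
    loss = loss_fn(pred, target)
    loss.backward()
    pred.grad = None
    # On an affected setup this number climbs every iteration instead of plateauing.
    print(f"step {step:03d}: allocated {torch.cuda.memory_allocated() / 1024**2:.1f} MiB")
```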
> @KumoLiu I was able to reproduce the issue using the projectmonai/monai:1.3.0 container. > > If you run `pip uninstall cupy-cuda12x` before executing the script, the memory leak will occur....
> Hi @johnzielke, thanks for the detailed report. Your findings are insightful and indeed point to an interaction between CuPy, garbage collection, and memory deallocation, which could be the root...
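Not part of the original thread, but a minimal sketch of how that hypothesis could be probed, assuming CuPy is importable in the affected environment; the helper name is made up:

```python
import gc
import torch

try:
    import cupy
except ImportError:  # CuPy may be absent, e.g. after `pip uninstall cupy-cuda12x`
    cupy = None

def release_gpu_memory() -> None:
    """Hypothetical helper: force the cleanup that may not be happening automatically."""
    gc.collect()  # break reference cycles that can keep GPU buffers alive
    if cupy is not None:
        cupy.get_default_memory_pool().free_all_blocks()  # return CuPy's cached blocks
    torch.cuda.empty_cache()  # return PyTorch's cached blocks to the driver

# Calling release_gpu_memory() every few training steps and re-reading
# torch.cuda.memory_allocated() / torch.cuda.memory_reserved() (or nvidia-smi)
# should show whether the growth is cached memory or genuinely leaked references.
```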
Thank you! There are no other clarifications at this time. I'd just be happy to see the docs updated to be a bit clearer, that's all. -Sarthak