"Compile" doesn't work on GPU
pytorch version: 2.3.1+cu121
I tried passing compile=True to the load function. Generation is extremely slow, and nvidia-smi shows no GPU utilization, as below (only a little memory is occupied, about 1 GB):
After I commented out the code block below, GPU utilization returned to normal and results were generated much faster:
I can't understand why torch.compile didn't work well here. Does anyone know why?
Compilation only pays off when you run generation repeatedly with the same input shape. If you run it only once, the one-time compilation cost makes that generation slower, not faster.
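One way to check this yourself is to time several consecutive calls and compare the first one against the rest. The snippet below is only a rough sketch: `time_calls` and `demo_torch_compile` are hypothetical helper names, and the `torch.nn.Linear` layer is a stand-in for the real model, which this thread doesn't show. It assumes a PyTorch 2.x install where `torch.compile` is supported.

```python
import time


def time_calls(fn, args, n_calls=5, sync=None):
    """Return the wall-clock latency in seconds of each of n_calls to fn(*args)."""
    latencies = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()  # e.g. torch.cuda.synchronize, so queued GPU work is counted
        latencies.append(time.perf_counter() - start)
    return latencies


def demo_torch_compile(n_calls=5):
    """Time a compiled stand-in model on an input of fixed shape."""
    import torch  # imported here so the helper above works without PyTorch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Linear(256, 256).to(device)  # stand-in, not the real model
    compiled = torch.compile(model)  # compilation happens on the first call
    x = torch.randn(8, 256, device=device)  # same shape on every call
    sync = torch.cuda.synchronize if device == "cuda" else None
    with torch.no_grad():
        return time_calls(compiled, (x,), n_calls=n_calls, sync=sync)
```

Calling `demo_torch_compile()` and printing the returned list should show whether the first entry (which includes compilation) is much larger than the later ones. If you change the input shape between calls, expect the compiler to do extra work again.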
This issue was closed because it has been inactive for 15 days since being marked as stale.