
OptimalGradCheckpointing: 7 issues

As the title says. The program shows the peak memory usage and the cut-offs, but I still need a help/hint on the question in the title.

Hi, I am trying to reproduce the results. It works correctly with PyTorch 1.5, but with PyTorch 1.10 I get `Parsing Computation Graph with torch.jit failed` and the fallback to the manual parse_graph function...

```
...
  File "/home/Foo/miniforge3/envs/pytor/lib/python3.6/site-packages/torch/nn/functional.py", line 1923, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
```

Got this error when trying to run the following command from the README.md:

```
...
```

I use code like this:

```
run_segment = optimal_grad_checkpointing(model, inp)
run_segment, optimizer = apex.amp.initialize(run_segment, optimizer, opt_level="02", verbosity=0)
...
output = run_segment(images)
```

and get the error:

```
output = run_segment(images)
...
```
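One thing worth double-checking in a report like this (not confirmed to be the cause of the error above): apex opt_level strings start with the letter O, e.g. "O2", not the digit zero. Below is a minimal sketch of the same flow with that spelling, assuming the `optimal_grad_checkpointing(model, inp)` entry point quoted above; the import path, model, and shapes are illustrative assumptions.

```python
import torch
import torchvision
from apex import amp
from graph import optimal_grad_checkpointing  # import path is an assumption, adjust to the repo layout

model = torchvision.models.resnet18().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inp = torch.randn(1, 3, 224, 224, device="cuda")  # example input used for graph parsing

# Build the checkpointed run segment first, then let amp patch it together with the optimizer.
run_segment = optimal_grad_checkpointing(model, inp)
run_segment, optimizer = amp.initialize(
    run_segment, optimizer, opt_level="O2", verbosity=0  # letter "O", not the digit zero
)

images = torch.randn(8, 3, 224, 224, device="cuda")
output = run_segment(images)
loss = output.mean()
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```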

I run benchmark.py and get the following warning:

```
python benchmark.py --arch resnet18 --device cuda:0
Parsing Computation Graph with torch.jit failed, revert to manual parse_graph function
```
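Independent of this project, a quick way to check whether torch.jit tracing itself succeeds for a given architecture on your PyTorch version is the standalone sketch below (not the repository's own parsing code):

```python
import torch
import torchvision

model = torchvision.models.resnet18().eval()
example = torch.randn(1, 3, 224, 224)

try:
    # torch.jit.trace records the operations executed for this example input.
    traced = torch.jit.trace(model, example)
    print(traced.graph)  # inspect the traced computation graph
except Exception as exc:
    print(f"torch.jit tracing failed: {exc}")
```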

Hi, thank you so much for providing this code! Using the automatic computation graph parser, I was able to use the optimal gradient checkpoints during model training without writing much...
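For anyone comparing memory, here is a small sketch of how the peak-memory difference could be measured with and without the checkpointed run segment. The `optimal_grad_checkpointing(model, inp)` call and the use of `run_segment` as a drop-in module are taken from the usage quoted above; the import path, model, and shapes are assumptions.

```python
import torch
import torchvision
from graph import optimal_grad_checkpointing  # import path is an assumption, adjust to the repo layout

def peak_mem_mb(module, batch):
    """One forward/backward pass, returning peak CUDA memory in MB."""
    torch.cuda.reset_peak_memory_stats()
    module(batch).mean().backward()
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 1024 ** 2

model = torchvision.models.resnet18().cuda()
inp = torch.randn(1, 3, 224, 224, device="cuda")  # example input used for graph parsing
run_segment = optimal_grad_checkpointing(model, inp)

batch = torch.randn(32, 3, 224, 224, device="cuda")
print("baseline     :", peak_mem_mb(model, batch), "MB")
model.zero_grad(set_to_none=True)  # clear gradients before the second measurement
print("checkpointed :", peak_mem_mb(run_segment, batch), "MB")
```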

I know it's been a while since this repo was uploaded but was thinking of getting this to work with deepspeed. Did you by any chance try that? Internally deepspeed...