
Design of graph support - Phase 1

Open leofang opened this issue 1 year ago • 3 comments

leofang avatar Sep 19 '24 15:09 leofang

A few pointers to consider when we design this:

  • https://github.com/pytorch/pytorch/pull/130386
  • https://github.com/pytorch/pytorch/pull/137318
  • https://github.com/cupy/cupy/pull/8615
  • https://github.com/numba/numba/pull/4182

leofang avatar Nov 12 '24 18:11 leofang

Discussed internally. All things considered, we will take a multi-phase approach to iteratively expand CUDA graph coverage. Below are the phase-1 design considerations:

  • Only cover stream capture (no explicit graph construction)
  • Exclude memory allocation/deallocation steps, and assume that all needed memory has already been allocated by the user when entering the capturing context
  • Exclude host callback operations
  • Basic coverage for conditional nodes
  • The resulting graph should be replay-able, meaning
    • user objects' lifetimes are properly managed
    • ...
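
To make the constraints above concrete, here is a minimal pure-Python toy model of capture-then-replay. It is only a sketch: `ToyStream`, `ToyGraph`, `begin_capture`, `end_capture`, `launch`, `allocate`, and `replay` are hypothetical names invented for illustration, not the cuda-python API or the prototype's design, and no GPU is involved. It assumes that work launched during capture is recorded rather than executed, that allocation inside capture is rejected, and that the graph keeps references to captured arguments so they stay alive across replays.

```python
class ToyGraph:
    """Hypothetical replayable graph (illustration only, not cuda-python)."""

    def __init__(self, nodes):
        # Storing (fn, args) keeps every captured buffer alive for as long
        # as the graph exists, which models the lifetime requirement.
        self._nodes = tuple(nodes)

    def replay(self):
        # Run the recorded work in the order it was captured.
        for fn, args in self._nodes:
            fn(*args)


class ToyStream:
    """Hypothetical stand-in for a stream (illustration only)."""

    def __init__(self):
        self._recording = None  # list of nodes while capturing, else None

    def allocate(self, n):
        # Assumption in this sketch: allocation during capture is an error.
        if self._recording is not None:
            raise RuntimeError("allocate before entering the capturing context")
        return [0] * n

    def launch(self, fn, *args):
        if self._recording is not None:
            self._recording.append((fn, args))  # record, do not execute
        else:
            fn(*args)  # eager execution outside capture

    def begin_capture(self):
        self._recording = []

    def end_capture(self):
        nodes, self._recording = self._recording, None
        return ToyGraph(nodes)


def add_one(buf):
    for i in range(len(buf)):
        buf[i] += 1


stream = ToyStream()
buf = stream.allocate(4)     # memory is allocated by the user before capture
stream.begin_capture()
stream.launch(add_one, buf)  # recorded only; buf is still all zeros here
graph = stream.end_capture()
graph.replay()
graph.replay()               # the same graph can be launched repeatedly
```

After the two replays, `buf` has been incremented twice. A real design would additionally have to deal with device synchronization, conditional nodes, and the other lifetime cases elided in the list above, none of which this toy attempts to model.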

leofang avatar Jan 23 '25 05:01 leofang

Design is being wrapped up with a prototype (#455). Moving this to beta 4.

leofang avatar Mar 24 '25 01:03 leofang