dgl
Graphbolt: Enable separate stream for CUDA memory copy and computation
🔨Work Item
IMPORTANT:
- This template is only for the dev team to track project progress. For feature requests or bug reports, please use the corresponding issue templates.
- DO NOT create a new work item if the purpose is to fix an existing issue or feature request. We will directly use the issue in the project tracker.
Project tracker: https://github.com/orgs/dmlc/projects/2
Description
Depending work items or issues
We will probably handle this automatically when we finalize the design of the pipelining and executor logic for the sampling stage.
This is already implemented in the dataloader with the overlap_feature_fetch switch.
@frozenbugs, is there anything more to be done for this issue? We already support feature copy overlap when the features are pinned.
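For readers new to the idea discussed in this thread, here is a minimal sketch in plain PyTorch of overlapping host-to-device copies with computation by using a separate CUDA stream and pinned memory. It is not Graphbolt's implementation and does not use the `overlap_feature_fetch` switch; the helper name `copy_then_compute` and its arguments are hypothetical and exist only for illustration. On a machine without CUDA it falls back to running the computation on the CPU.

```python
import torch


def copy_then_compute(batches, compute):
    """Hypothetical helper: copy batch i+1 on a side stream while batch i is computed."""
    if not torch.cuda.is_available():
        # CPU-only fallback: there is no copy to overlap.
        return [compute(b) for b in batches]

    device = torch.device("cuda")
    copy_stream = torch.cuda.Stream()            # dedicated stream for H2D copies
    compute_stream = torch.cuda.current_stream()

    def start_copy(cpu_batch):
        # Asynchronous copies need page-locked (pinned) host memory.
        pinned = cpu_batch.pin_memory()
        with torch.cuda.stream(copy_stream):
            gpu = pinned.to(device, non_blocking=True)
            done = torch.cuda.Event()
            done.record(copy_stream)             # marks the end of this copy
        return gpu, done

    results = []
    it = iter(batches)
    first = next(it, None)
    pending = start_copy(first) if first is not None else None
    while pending is not None:
        gpu, done = pending
        nxt = next(it, None)
        # Enqueue the next copy before launching compute on the current batch.
        pending = start_copy(nxt) if nxt is not None else None
        compute_stream.wait_event(done)          # compute waits only for its own copy
        gpu.record_stream(compute_stream)        # tell the allocator about cross-stream use
        results.append(compute(gpu))
    return results


if __name__ == "__main__":
    data = [torch.full((4,), float(i)) for i in range(3)]
    print([float(r) for r in copy_then_compute(data, lambda x: (x * 2).sum())])
```

The compute stream waits on a per-batch event rather than on the whole copy stream, so a copy that has already been queued for the next batch does not delay the current computation.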