Improve the performance of high-level graph iterators of the C++ library
Is your feature request related to a problem? Please describe. An important use case for GraphAr is out-of-core graph processing. With the graph data saved on disk as GAR files, GraphAr provides a set of reading interfaces that load part of the graph data into memory when needed for analytics. Because disk I/O time usually dominates the overall execution time in out-of-core graph processing, it is critically important that the GraphAr C++ library traverse vertices and edges efficiently through its high-level graph iterators.
Currently, creating or copying an iterator (especially the EdgeIter) is time-consuming because each iterator holds all the readers it needs (sharing readers between iterators is not thread-safe). Is there any way to make the iterator more lightweight?
This issue was discussed at the GraphAr Weekly Community Meeting [Tuesday, April 4, 2023]: perhaps we could improve the performance of EdgeIter by moving the readers to EdgesCollection, making the iterator more lightweight. Additionally, we could maintain a pool of readers and associate them with iterators on demand, so that an iterator retrieves an appropriate reader from the pool only when it needs one. As long as each reader is used by only one iterator at a time, this approach would remain thread-safe.
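
To make the reader-pool idea more concrete, here is a minimal sketch of how it might look. It is not the GraphAr implementation: `ChunkReader`, `ReaderPool`, `readers_created()`, and the simplified `EdgeIter` and `EdgesCollection` below are hypothetical stand-ins, and the "edge" an iterator yields is just its offset. The point is only that an iterator stores a pool pointer and an offset, so copying it is cheap, and that it borrows a reader from the collection's pool lazily on first dereference.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a chunk reader; the real GraphAr readers differ.
struct ChunkReader {
  std::size_t position = 0;
  void Seek(std::size_t offset) { position = offset; }
};

// Pool owned by the collection. Each reader is handed to one iterator at a
// time; the mutex protects only the pool's bookkeeping.
class ReaderPool {
 public:
  std::unique_ptr<ChunkReader> Acquire() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (idle_.empty()) {
      ++created_;
      return std::make_unique<ChunkReader>();
    }
    auto reader = std::move(idle_.back());
    idle_.pop_back();
    return reader;
  }
  void Release(std::unique_ptr<ChunkReader> reader) {
    std::lock_guard<std::mutex> lock(mutex_);
    idle_.push_back(std::move(reader));
  }
  std::size_t created() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return created_;
  }

 private:
  mutable std::mutex mutex_;
  std::vector<std::unique_ptr<ChunkReader>> idle_;
  std::size_t created_ = 0;
};

// Lightweight iterator: copies carry only the pool pointer and the offset.
class EdgeIter {
 public:
  EdgeIter(ReaderPool* pool, std::size_t offset)
      : pool_(pool), offset_(offset) {}
  // A copy never shares the source's reader; it borrows its own if needed.
  EdgeIter(const EdgeIter& other)
      : pool_(other.pool_), offset_(other.offset_) {}
  EdgeIter& operator=(const EdgeIter& other) {
    if (this != &other) {
      ReturnReader();
      pool_ = other.pool_;
      offset_ = other.offset_;
    }
    return *this;
  }
  ~EdgeIter() { ReturnReader(); }

  EdgeIter& operator++() {
    ++offset_;
    return *this;
  }
  bool operator!=(const EdgeIter& other) const {
    return offset_ != other.offset_;
  }
  // Borrows a reader on first use, then positions it at the current offset.
  std::size_t operator*() {
    assert(pool_ != nullptr);
    if (!reader_) reader_ = pool_->Acquire();
    reader_->Seek(offset_);
    return reader_->position;
  }

 private:
  void ReturnReader() {
    if (reader_) pool_->Release(std::move(reader_));
  }
  ReaderPool* pool_;
  std::size_t offset_;
  std::unique_ptr<ChunkReader> reader_;  // empty until first dereference
};

// The collection owns the pool and must outlive its iterators.
class EdgesCollection {
 public:
  explicit EdgesCollection(std::size_t edge_num) : edge_num_(edge_num) {}
  EdgeIter begin() { return EdgeIter(&pool_, 0); }
  EdgeIter end() { return EdgeIter(&pool_, edge_num_); }
  std::size_t readers_created() const { return pool_.created(); }

 private:
  ReaderPool pool_;
  std::size_t edge_num_;
};
```

In this sketch, the `end()` iterators created for loop comparisons never touch a reader, and a full traversal with a single iterator creates only one reader, which goes back to the pool when the iterator is destroyed. The trade-off is a mutex acquisition whenever a reader is borrowed or returned; whether that is cheaper than constructing the real readers would need to be measured.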