incubator-graphar Improve the performance of high-level graph iterators of the C++ library

Is your feature request related to a problem? Please describe. An important use case of GraphAr is out-of-core graph processing. With graph data stored on disk as GAR files, GraphAr provides a set of reading interfaces that load parts of the graph into memory when needed for analytics. Because disk I/O time usually dominates the overall execution time in out-of-core graph processing, it is critically important that the GraphAr C++ library traverse vertices and edges efficiently through its high-level graph iterators.

Jan 11 '23 03:01 lixueclaire

Currently, creating or copying an iterator (especially the EdgeIter) is time-consuming, since each iterator holds all the readers it needs (sharing readers between iterators is not thread-safe). Is there any way to make the iterator more lightweight?

Apr 03 '23 09:04 lixueclaire

Currently, creating or copying an iterator (especially the EdgeIter) is time-consuming, since each iterator holds all the readers it needs (sharing readers between iterators is not thread-safe). Is there any way to make the iterator more lightweight?

This issue was discussed at the GraphAr Weekly Community Meeting [Tuesday, April 4, 2023]: Perhaps we could improve the performance of EdgeIter by moving the readers into EdgesCollection, making the iterator more lightweight. Additionally, we could maintain a pool of readers in the collection, so that an iterator retrieves an appropriate reader from the pool only when it needs one. Because each reader would be used by one iterator at a time, this approach would remain thread-safe.
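To make the proposal concrete, here is a minimal sketch of the pool idea, not the actual GraphAr implementation. All names (`ChunkReader`, `ReaderPool`, `EdgeIterSketch`) are hypothetical stand-ins, and the reader is reduced to a single offset; the real readers hold file handles and buffers, which is what makes copying them expensive.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a chunk reader (assumption: not the GraphAr API).
struct ChunkReader {
  std::size_t position = 0;
  void Seek(std::size_t offset) { position = offset; }
};

// Pool that would live in the collection. A mutex guards the idle list, so a
// given reader is only ever handed to one caller at a time.
class ReaderPool {
 public:
  // Hand out an idle reader, or create a new one if none is available.
  std::unique_ptr<ChunkReader> Acquire() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (idle_.empty()) {
      ++created_;
      return std::make_unique<ChunkReader>();
    }
    std::unique_ptr<ChunkReader> reader = std::move(idle_.back());
    idle_.pop_back();
    return reader;
  }

  // Return a reader to the pool so that other iterators can reuse it.
  void Release(std::unique_ptr<ChunkReader> reader) {
    assert(reader != nullptr);
    std::lock_guard<std::mutex> lock(mutex_);
    idle_.push_back(std::move(reader));
  }

  // Number of readers constructed so far.
  std::size_t created() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return created_;
  }

  // Number of readers currently waiting in the pool.
  std::size_t idle() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return idle_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::vector<std::unique_ptr<ChunkReader>> idle_;
  std::size_t created_ = 0;
};

// Lightweight iterator: it stores only a pool pointer and an offset, so
// copying it does not copy any reader.
class EdgeIterSketch {
 public:
  EdgeIterSketch(ReaderPool* pool, std::size_t offset)
      : pool_(pool), offset_(offset) {}

  EdgeIterSketch& operator++() {
    ++offset_;
    return *this;
  }

  // Borrow a reader for the duration of one access, then give it back.
  std::size_t ReadPosition() const {
    std::unique_ptr<ChunkReader> reader = pool_->Acquire();
    reader->Seek(offset_);
    const std::size_t pos = reader->position;
    pool_->Release(std::move(reader));
    return pos;
  }

 private:
  ReaderPool* pool_;
  std::size_t offset_;
};
```

In this sketch a copied iterator shares the pool rather than duplicating readers, so two iterators used one after another need only one reader in total. Taking a lock on every access has its own cost; a real design might let an iterator keep its borrowed reader across several reads and release it only when the iterator is destroyed.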

Apr 04 '23 12:04 lixueclaire