
[Question] Storage costs and path finding

devjmc opened this issue · 1 comment

Thank you for your paper and for publishing the code. Some questions on your paper:

  1. What is the storage overhead of HyperGraphRAG? Entity embeddings increase storage. Can you clarify whether related entities (with their embeddings) extracted during ingestion of several document chunks are actually "fused" into a single entity? Have you evaluated the impact of quantization or reduced dimensionality on the entity embeddings?

  2. The multi-hop question answering capability of HyperGraphRAG seems limited. The generation-augmentation phase only considers neighbors, instead of trying to find a path across hyperedges that may link all related entities of the question. Have you evaluated a path-searching algorithm for generation augmentation?

Thank you!
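For concreteness on the quantization part of question 1, here is a back-of-the-envelope sketch of what symmetric per-vector int8 quantization would save on stored entity embeddings. The entity count, dimensionality, and function names below are illustrative assumptions, not taken from the paper or the repository:

```python
# Sketch: per-vector int8 quantization of entity embeddings.
# Counts and dimensions are illustrative, not from HyperGraphRAG.
import numpy as np

def quantize_int8(vecs: np.ndarray):
    """Symmetric per-vector quantization: int8 codes plus one float32 scale per vector."""
    scales = np.abs(vecs).max(axis=1, keepdims=True) / 127.0
    codes = np.round(vecs / scales).astype(np.int8)
    return codes, scales.astype(np.float32)

def dequantize(codes: np.ndarray, scales: np.ndarray) -> np.ndarray:
    return codes.astype(np.float32) * scales

rng = np.random.default_rng(0)
emb = rng.normal(size=(10_000, 1536)).astype(np.float32)  # e.g. 10k entities, 1536-dim

codes, scales = quantize_int8(emb)
fp32_bytes = emb.nbytes
int8_bytes = codes.nbytes + scales.nbytes  # roughly 4x smaller
print(f"fp32: {fp32_bytes / 1e6:.1f} MB, int8: {int8_bytes / 1e6:.1f} MB, "
      f"ratio {fp32_bytes / int8_bytes:.1f}x")

# Cosine similarity, which retrieval depends on, drifts very little.
cos = lambda x, y: float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
deq = dequantize(codes, scales)
print(f"cosine drift: {abs(cos(emb[0], emb[1]) - cos(deq[0], deq[1])):.4f}")
```

The storage cut comes almost entirely from the codes (1 byte instead of 4 per dimension); the per-vector scales add only 4 bytes each, so the ratio stays near 4x regardless of entity count.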

devjmc · Sep 16 '25 02:09

Thank you very much for your thoughtful questions and interest in our work. Please find my responses below:

  1. On storage overhead and entity embeddings: related entities are not fused together; we only merge exactly identical entities. Our rationale is that entity fusion would incur additional construction overhead without affecting retrieval, since the system is designed to retrieve all related entities regardless. Regarding quantization or dimensionality reduction, we have not systematically explored these directions. Instead, our experiments showed that using higher-dimensional, stronger embedding models consistently led to better performance.

  2. On multi-hop question answering: this is an excellent question. The main focus of the paper is on proposing and investigating n-ary relational knowledge representations and hypergraph structures for building GraphRAG. In this paradigm, a single hyperedge can already capture many facts that would otherwise require multi-hop reasoning over binary relations. Moreover, our design encourages one-hop retrieval followed by reasoning and analysis performed by large language models, which we believe are already capable of deep multi-step thinking. That said, we have been actively working in this direction. In particular, we have introduced Graph-R1 (https://github.com/LHRLAB/Graph-R1), which extends HyperGraphRAG with multi-turn interactive reinforcement learning and is better positioned to further address the multi-hop reasoning challenges you mentioned.
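For readers curious what the path search raised in question 2 could look like, here is a minimal sketch: a BFS over the bipartite entity-hyperedge graph that returns the sequence of hyperedges linking two question entities. The dictionary representation, function name, and toy facts are assumptions for illustration and do not come from the HyperGraphRAG codebase:

```python
# Sketch of hyperedge path search, assuming a hypergraph stored as
# {hyperedge_id: set_of_member_entities}. Names are illustrative only.
from collections import deque

def hyperedge_path(hyperedges: dict, src: str, dst: str):
    """BFS over the bipartite entity<->hyperedge graph; returns the list
    of hyperedge ids linking src to dst, or None if no path exists."""
    # Inverted index: entity -> hyperedges containing it.
    member_of = {}
    for he, ents in hyperedges.items():
        for e in ents:
            member_of.setdefault(e, set()).add(he)

    queue = deque([(src, [])])  # (frontier entity, hyperedge path so far)
    seen_entities = {src}
    seen_edges = set()
    while queue:
        entity, path = queue.popleft()
        for he in member_of.get(entity, ()):
            if he in seen_edges:
                continue
            seen_edges.add(he)
            if dst in hyperedges[he]:
                return path + [he]
            for nxt in hyperedges[he]:
                if nxt not in seen_entities:
                    seen_entities.add(nxt)
                    queue.append((nxt, path + [he]))
    return None

# Toy n-ary facts (hypothetical).
H = {
    "h1": {"aspirin", "COX-1", "inflammation"},
    "h2": {"COX-1", "prostaglandin"},
    "h3": {"prostaglandin", "fever"},
}
print(hyperedge_path(H, "aspirin", "fever"))  # -> ['h1', 'h2', 'h3']
```

Because each hyperedge bundles several entities, the BFS frontier expands in whole n-ary facts, so paths tend to be shorter than in an equivalent binary-relation graph.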

LHRLAB · Sep 16 '25 02:09