How are the hypergraph-related JSON files generated?
Hi, thank you for sharing this great project!
I’m currently studying the codebase and noticed that the script relies on the following JSON files:
kv_store_text_chunks.json
kv_store_entities.json
kv_store_hyperedges.json
Could you kindly clarify:
How are these JSON files generated from the original dataset?
Is there any existing script or code reference for preprocessing the raw data into these formats?
Any guidance or pointers would be greatly appreciated. Thanks in advance!
Hi! This step constructs these files.
Thanks for your reply!
After reading through script_build.py, I believe this script is responsible for reading the following preprocessed files:
kv_store_text_chunks.json
kv_store_entities.json
kv_store_hyperedges.json
It then performs embedding and FAISS indexing on the contents. However, it doesn't seem to contain logic for generating these files from the original dataset.
The script reads these files and inserts newly constructed knowledge into them. However, if these files are not present at the beginning, the script will automatically construct them from the original dataset.
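The load-or-create behavior described above can be sketched roughly as follows. This is only an illustration of the pattern, not the project's actual code: `load_or_build_store`, `upsert`, and the `build_fn` callback are hypothetical names, and the real construction step from the original dataset is stood in for by whatever `build_fn` returns.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def _write_store(path: Path, store: Dict[str, dict]) -> None:
    """Write the whole key-value store to disk as JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        json.dump(store, f, ensure_ascii=False, indent=2)


def load_or_build_store(
    path: Path, build_fn: Callable[[], Dict[str, dict]]
) -> Dict[str, dict]:
    """Return the store at `path`; build and save it first if it is missing.

    `build_fn` is a hypothetical stand-in for the step that constructs
    the store from the original dataset.
    """
    if path.exists():
        with path.open("r", encoding="utf-8") as f:
            return json.load(f)
    store = build_fn()
    _write_store(path, store)
    return store


def upsert(path: Path, store: Dict[str, dict], new_items: Dict[str, dict]) -> None:
    """Insert newly constructed entries into the store and save it again."""
    store.update(new_items)
    _write_store(path, store)
```

Under this sketch, the first run with no `kv_store_text_chunks.json` present would call `build_fn` once and write the file; later runs would load the existing file and only add new entries through `upsert`.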