Implemented optimized Neo4j queries and setup

Open Hardik301002 opened this issue 10 months ago • 5 comments

Title: Improve Map Analysis Performance with Optimized Neo4j Queries

Description:

Summary: This PR makes the Map Analysis feature faster and more resource-efficient. It adds optimized Cypher queries, indexes for faster lookups, caching to avoid redundant computations, and APOC procedures for parallel processing.

Key Changes:

  1. Added Indexes on Standard(name) and Framework(name) to reduce NodeByLabelScan operations.

  2. Refined Cypher Queries using WITH and LIMIT clauses for smaller, more efficient batches.

  3. Implemented Caching for frequently accessed paths, minimizing redundant path calculations.

  4. Utilized APOC for timeboxed queries and parallel batch processing.
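The changes listed above could look roughly like the following. This is a minimal sketch, not the code from this PR: the index names, the query shape, the 6-hop bound, and the helpers `create_indexes` and `make_cached_path_lookup` are all assumptions made for illustration. `run_query` is a hypothetical callable that executes a Cypher statement with parameters and returns rows as dictionaries, so the sketch does not depend on a running database.

```python
from functools import lru_cache
from typing import Any, Callable, Dict, List, Tuple

# Index names are hypothetical; the labels and property come from the description.
INDEX_STATEMENTS: List[str] = [
    "CREATE INDEX standard_name IF NOT EXISTS FOR (n:Standard) ON (n.name)",
    "CREATE INDEX framework_name IF NOT EXISTS FOR (n:Framework) ON (n.name)",
]

# Assumed query shape: WITH ... LIMIT narrows each anchor node before the
# path search starts. The 6-hop bound is an illustrative choice.
PATH_QUERY = """
MATCH (s:Standard {name: $source})
WITH s LIMIT 1
MATCH (t:Standard {name: $target})
WITH s, t LIMIT 1
MATCH p = shortestPath((s)-[*..6]-(t))
RETURN [n IN nodes(p) | n.name] AS names
"""

# apoc.cypher.runTimeboxed stops the inner statement after $timeout_ms milliseconds.
TIMEBOXED_QUERY = (
    "CALL apoc.cypher.runTimeboxed($query, $params, $timeout_ms) "
    "YIELD value RETURN value.names AS names"
)

RunQuery = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]


def create_indexes(run_query: RunQuery) -> None:
    """Issue each index statement once; IF NOT EXISTS makes reruns harmless."""
    for statement in INDEX_STATEMENTS:
        run_query(statement, {})


def make_cached_path_lookup(
    run_query: RunQuery, timeout_ms: int = 5000, maxsize: int = 1024
) -> Callable[[str, str], Tuple[Tuple[str, ...], ...]]:
    """Return a lookup that remembers results per (source, target) pair."""

    @lru_cache(maxsize=maxsize)
    def lookup(source: str, target: str) -> Tuple[Tuple[str, ...], ...]:
        rows = run_query(
            TIMEBOXED_QUERY,
            {
                "query": PATH_QUERY,
                "params": {"source": source, "target": target},
                "timeout_ms": timeout_ms,
            },
        )
        # Tuples are immutable, so callers cannot corrupt the cached value.
        return tuple(tuple(row["names"]) for row in rows)

    return lookup
```

With the official Neo4j Python driver, `run_query` could be something like `lambda q, p: [r.data() for r in session.run(q, p)]`. Note that an in-process cache like this has no invalidation, so it would return stale paths after new standards are imported.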

Why: Addresses Issue #587, ensuring faster response times and lower resource usage.

Testing:

Verified performance gains via PROFILE in Neo4j Browser. Measured query execution time before and after applying indexes and caching.
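The PROFILE check can also be scripted rather than read off the Browser. The sketch below walks a profiled plan and reports any remaining NodeByLabelScan operators and the total database hits. It assumes the plan is a nested dictionary with `operatorType`, `dbHits`, and `children` keys, which is how the Neo4j Python driver is understood to expose `summary.profile`; verify the key names against your driver version. The function names are hypothetical.

```python
from typing import Any, Dict, Iterator, List

Plan = Dict[str, Any]


def walk_plan(plan: Plan) -> Iterator[Plan]:
    """Yield the operator at the root, then every operator below it."""
    yield plan
    for child in plan.get("children", []):
        yield from walk_plan(child)


def label_scans(plan: Plan) -> List[str]:
    """Return the operators that still scan every node of a label."""
    return [
        op["operatorType"]
        for op in walk_plan(plan)
        if op.get("operatorType", "").startswith("NodeByLabelScan")
    ]


def total_db_hits(plan: Plan) -> int:
    """Sum dbHits over the whole plan; missing values count as zero."""
    return sum(op.get("dbHits", 0) for op in walk_plan(plan))
```

Comparing `total_db_hits` and `label_scans` for the same query before and after creating the indexes gives a number that can be quoted in the PR instead of a screenshot.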

Next Steps:

Further refine queries if new performance bottlenecks arise. Update documentation as needed.

How to Test:

  1. Pull these changes and install any required dependencies (e.g., python-dotenv, neo4j).

  2. Start Neo4j and run setup_database.py to create nodes/indexes.

  3. Run query_database.py and observe improved execution times using PROFILE or logs.

Hardik301002 avatar Mar 16 '25 07:03 Hardik301002

Hi @northdpole, could you please take a look at this PR and provide feedback?

Hardik301002 avatar Mar 16 '25 07:03 Hardik301002

@northdpole Could you please review this? I have fixed the linting errors.

Hardik301002 avatar Mar 23 '25 07:03 Hardik301002

Have you tried running gap analysis with the improved queries? The code looks good, but have you managed to benchmark whether there is an improvement?

You can do so by importing only two standards, calculating gap analysis for those two, and measuring performance before and after your changes.
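A before/after measurement of that kind could be scripted along these lines. This is a minimal sketch: `median_seconds` and `compare` are hypothetical helpers, and `before` and `after` stand for whatever callables run the gap analysis for the two imported standards on the old and new code paths.

```python
import statistics
import time
from typing import Callable, Dict


def median_seconds(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Run fn a few times and return the median wall-clock duration in seconds."""
    for _ in range(warmup):
        fn()  # warm caches so the first timed run is not an outlier
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(
    before: Callable[[], object], after: Callable[[], object], repeats: int = 5
) -> Dict[str, float]:
    """Time both variants and report how many times faster 'after' is."""
    before_s = median_seconds(before, repeats)
    after_s = median_seconds(after, repeats)
    speedup = before_s / after_s if after_s > 0 else float("inf")
    return {"before_s": before_s, "after_s": after_s, "speedup": speedup}
```

The median is used rather than the mean so that a single slow run, for example one that hits a cold page cache, does not dominate the result. If the new code caches paths in process, the warm-up run means the timed runs measure cache hits, so it is worth reporting a cold run separately.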

northdpole avatar Mar 26 '25 16:03 northdpole

@northdpole Could you please review this?

Hardik301002 avatar Apr 06 '25 04:04 Hardik301002

> @northdpole Could you please review this?

I am. Can you please answer my question? Have you run this code and measured the performance improvement, or is this based only on the Neo4j UI?

northdpole avatar Apr 26 '25 18:04 northdpole