Neo4j extension with LLM‑powered Cypher querying

Open dshire opened this issue 5 months ago • 0 comments

Summary

  • Introduces a Cognigy.AI extension that turns natural language into schema‑aware Cypher and queries Neo4j.
  • Includes a “Neo4j Query” node, connection schemas for Neo4j and OpenAI, build scripts, and docs.

Motivation

  • Enable non‑experts to explore Neo4j graphs using plain language.
  • Reduce Cypher authoring errors by constraining generation to the live schema.
  • Provide a reusable building block for knowledge‑graph powered assistants.

Features

  • Natural‑language to Cypher via OpenAI Chat Completions.
  • Live schema introspection using db.schema.nodeTypeProperties() and db.schema.visualization().
  • Basic Auth over HTTPS to Neo4j; executes the generated Cypher via the Neo4j HTTP API.
  • Configurable LLM model (default gpt-4.1-mini) and OpenAI API key.
  • Result storage in input or context under a configurable key.
  • Clear error path when no valid Cypher can be produced.
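
As a rough sketch of the Neo4j side of these features, the helper below builds (without sending) the kind of request such a node might issue. The path /db/<database>/tx/commit is the transactional endpoint of Neo4j's HTTP API in 4.x and 5.x; whether the extension uses exactly this path is an assumption, as are the database name "neo4j", the example host, and the names Neo4jConnection, Neo4jRequest, and buildNeo4jRequest.

```typescript
// Hypothetical connection shape mirroring the Host/Username/Password fields.
interface Neo4jConnection {
  host: string; // HTTPS base URL, e.g. "https://graph.example.com:7473"
  username: string;
  password: string;
}

interface Neo4jRequest {
  url: string;
  body: { statements: { statement: string }[] };
  auth: { username: string; password: string };
}

// Builds a request for Neo4j's transactional HTTP endpoint.
// The database name "neo4j" is an assumed default, not taken from the extension.
function buildNeo4jRequest(
  conn: Neo4jConnection,
  cypher: string,
  database: string = "neo4j"
): Neo4jRequest {
  if (conn.host.indexOf("https://") !== 0) {
    throw new Error("Neo4j host must use HTTPS");
  }
  const base = conn.host.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/db/${database}/tx/commit`,
    body: { statements: [{ statement: cypher }] },
    auth: { username: conn.username, password: conn.password },
  };
}

// Schema introspection can go through the same endpoint as generated queries.
const schemaRequest = buildNeo4jRequest(
  { host: "https://graph.example.com:7473/", username: "reader", password: "secret" },
  "CALL db.schema.nodeTypeProperties()"
);
```

With axios, a request like this could be sent as axios.post(req.url, req.body, { auth: req.auth }); the auth option makes axios add the Basic Authorization header.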

Implementation

  • src/module.ts: Registers node and connections with @cognigy/extension-tools.
  • src/nodes/neo4jQuery.ts: Main node; fetches schema, builds system prompt, calls OpenAI, runs Cypher, stores { cypher, result }.
  • src/connections/neo4jConnection.ts: Defines neo4jConnection (Host/Username/Password) and openAiApiKey.
  • Uses axios for both Neo4j and OpenAI HTTP calls.
  • package.json scripts: transpile, lint, build, zip → produces neo4j.tar.gz.
  • README.md: Comprehensive setup, usage, troubleshooting.
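
The schema-to-prompt step in src/nodes/neo4jQuery.ts might look roughly like the sketch below. The row fields nodeLabels, propertyName, and propertyTypes are columns returned by db.schema.nodeTypeProperties(); the RelationshipType summary, the prompt wording, and the function names are assumptions for illustration and are not taken from the extension's source. The sentinel reply reuses the "No valid Cypher possible" wording from the test plan.

```typescript
// Subset of the columns returned by db.schema.nodeTypeProperties().
interface NodeTypeProperty {
  nodeLabels: string[];
  propertyName: string | null;
  propertyTypes: string[] | null;
}

// Hypothetical relationship summary derived from db.schema.visualization().
interface RelationshipType {
  from: string;
  type: string;
  to: string;
}

// Builds a system prompt that restricts the model to the live schema.
function buildSystemPrompt(props: NodeTypeProperty[], rels: RelationshipType[]): string {
  const byLabel = new Map<string, string[]>();
  for (const row of props) {
    const label = row.nodeLabels.join(":");
    const list = byLabel.get(label) ?? [];
    if (row.propertyName !== null) {
      list.push(`${row.propertyName}: ${(row.propertyTypes ?? []).join("|")}`);
    }
    byLabel.set(label, list);
  }
  const nodeLines = Array.from(byLabel.entries()).map(
    ([label, ps]) => `(:${label} {${ps.join(", ")}})`
  );
  const relLines = rels.map((r) => `(:${r.from})-[:${r.type}]->(:${r.to})`);
  return [
    "Translate the user's question into a single Cypher query.",
    "Use only these node labels and properties:",
    ...nodeLines,
    "Use only these relationships:",
    ...relLines,
    "If the question cannot be answered from this schema, reply exactly: No valid Cypher possible",
  ].join("\n");
}

// Request body for POST https://api.openai.com/v1/chat/completions.
function buildChatRequest(model: string, systemPrompt: string, question: string) {
  return {
    model,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: question },
    ],
  };
}

// Illustrative schema rows, not data from a real graph.
const systemPrompt = buildSystemPrompt(
  [
    { nodeLabels: ["Person"], propertyName: "name", propertyTypes: ["String"] },
    { nodeLabels: ["Movie"], propertyName: "title", propertyTypes: ["String"] },
  ],
  [{ from: "Person", type: "ACTED_IN", to: "Movie" }]
);
const chatRequest = buildChatRequest("gpt-4.1-mini", systemPrompt, "Who acted in The Matrix?");
```

Keeping the prompt builder free of I/O makes it easy to unit-test separately from the axios calls.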

Configuration

  • Create Connections in Cognigy:
    • neo4jConnection: Host (HTTPS base), Username, Password.
    • openAiApiKey: openAiKey (must allow Chat Completions for the chosen model).
  • Node settings:
    • input (free‑form query), model, connections, storeLocation, inputKey/contextKey.

Usage

  • Build the artifact with npm run build, then import neo4j.tar.gz into Cognigy → Extensions.
  • Add “Neo4j Query” node; provide a natural‑language question (default uses input.text).
  • Read results at input.<inputKey> or context.<contextKey> as:
    • { cypher: "...", result: [ { field: value, ... }, ... ] }.
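
The storage step can be sketched as follows. ApiLike and InputLike are minimal stand-ins for the api and input objects a Cognigy node function receives; addToContext with the "simple" mode follows the pattern commonly used in Cognigy extensions, but the exact code in this extension is not shown in the issue, so treat storeResult, the key names, and the sample row as illustrative assumptions.

```typescript
// Minimal stand-ins for the objects Cognigy passes to a node function;
// only the members used here are modeled.
interface ApiLike {
  addToContext(key: string, value: unknown, mode: "simple" | "array"): void;
}
type InputLike = Record<string, unknown>;

interface QueryOutput {
  cypher: string;
  result: Record<string, unknown>[];
}

interface StoreConfig {
  storeLocation: "input" | "context";
  inputKey: string;
  contextKey: string;
}

// Writes { cypher, result } to input[inputKey] or to the context under contextKey.
function storeResult(
  api: ApiLike,
  input: InputLike,
  config: StoreConfig,
  output: QueryOutput
): void {
  if (config.storeLocation === "context") {
    api.addToContext(config.contextKey, output, "simple");
  } else {
    input[config.inputKey] = output;
  }
}

// Usage with an in-memory fake in place of the real Cognigy API.
const fakeContext: Record<string, unknown> = {};
const fakeApi: ApiLike = {
  addToContext: (key, value) => {
    fakeContext[key] = value;
  },
};
const fakeInput: InputLike = { text: "Who acted in The Matrix?" };
const queryOutput: QueryOutput = {
  cypher: "MATCH (p:Person)-[:ACTED_IN]->(:Movie {title: 'The Matrix'}) RETURN p.name AS name",
  result: [{ name: "Keanu Reeves" }], // sample row for illustration only
};

storeResult(fakeApi, fakeInput, { storeLocation: "input", inputKey: "neo4j", contextKey: "neo4j" }, queryOutput);
storeResult(fakeApi, fakeInput, { storeLocation: "context", inputKey: "neo4j", contextKey: "graph" }, queryOutput);
```

After these two calls, the same object is readable at input.neo4j and at context.graph.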

Security & Privacy

  • Sends schema metadata (labels, properties, relationship types) and the user query to OpenAI.
  • Requires HTTPS to Neo4j; use least‑privilege credentials.
  • Consider data governance before enabling in sensitive environments.

Limitations

  • Requires Neo4j HTTP API endpoints and listed procedures to be available.
  • Depends on OpenAI availability and model compatibility with Chat Completions.
  • No pagination or streaming; large result sets may need to be narrowed with follow‑up queries.
  • Returns the first message choice; no function/tool calling.

Test Plan

  • Build: npm run build → verify neo4j.tar.gz created.
  • Import into Cognigy; create both connections.
  • Run sample queries against a test graph; verify:
    • Cypher is valid and schema‑conformant.
    • Results are stored at the configured location/key.
    • “No valid Cypher possible” path throws and is catchable.
  • Negative tests: invalid credentials (401), unreachable host, invalid model.

Deployment

  • CI/CD optional: artifact is neo4j.tar.gz from npm run build.
  • Manual import via Cognigy UI for target environments.

Backward Compatibility

  • New extension; no breaking changes.

Risks & Mitigations

  • Incorrect Cypher generation → constrained prompt with explicit schema and rules.
  • Leaking schema externally → document implications; allow environments to disable the extension if needed.
  • Endpoint assumptions → documented requirement for HTTPS and specific procedures.

Future Work

  • Provider abstraction (Azure OpenAI, other LLMs) and base URL overrides.
  • Caching schema to reduce overhead; schema refresh strategy.
  • Result pagination and typed mapping.
  • Additional safety (guardrails, prompt templates, validation).
  • Batch queries and parameterized prompts.

dshire, Sep 10 '25 07:09