FileLoader with Postgres (pgvector) does not remove old embeddings — `documents` table grows indefinitely with each new upload
Describe the bug
When using Flowise in a Docker environment with a Document Store and a FileLoader, connected to Postgres with pgvector, embeddings from old files remain in the database even after new files are uploaded.
Flowise always appends new embeddings to the `documents` table without removing older entries that are no longer needed, so the table grows indefinitely. As a result, Flowise becomes noticeably slower: both the "Process" step and the "Upsert" step take significantly longer, and the system grows sluggish over time.
To Reproduce
Steps to Reproduce:
- Run Flowise in a Docker container with a Postgres (pgvector) database connected as Document Store.
- Create a FileLoader node and upload one or more files.
- Check the `documents` table in the Postgres database: embeddings are stored as expected.
- Remove the old files from the FileLoader input and upload completely new files.
- Check the `documents` table again: the old embeddings are still present, and new embeddings have been appended instead of replacing the outdated ones.
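For anyone trying to confirm the two table checks above, here is a minimal Python sketch of how leftover rows could be counted. It assumes each row read from the `documents` table can be reduced to an `(id, source)` pair, where `source` is the originating file name; that layout, the function name `stale_row_counts`, and the sample file names are all hypothetical and may not match what your Flowise setup actually stores.

```python
from collections import Counter


def stale_row_counts(rows, active_sources):
    """Count rows per source file that is no longer in the active file set.

    rows: iterable of (row_id, source) pairs, e.g. exported from the
          `documents` table (hypothetical layout, adjust to your schema).
    active_sources: file names currently attached to the FileLoader.
    """
    active = set(active_sources)
    return Counter(source for _, source in rows if source not in active)


# Hypothetical snapshot taken after the second upload.
rows = [
    (1, "old_report.pdf"),
    (2, "old_report.pdf"),
    (3, "new_manual.pdf"),
]

# Only new_manual.pdf is still attached, so both old_report.pdf rows are stale.
print(stale_row_counts(rows, ["new_manual.pdf"]))
```

A non-empty result after the second upload would be consistent with the behavior described in this report.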
Expected behavior
Old embeddings should be deleted from the `documents` table, because they are no longer part of the active Document Store.
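One way to read this expected behavior is as a "full" sync: after an upsert, the table should contain exactly the chunks of the currently attached files. The sketch below illustrates that idea in plain Python. It is not Flowise's code and not the approach taken in any PR; `content_key` and `plan_full_sync` are hypothetical helpers, and keying chunks by a SHA-256 hash of source plus text is an assumption made for the example.

```python
import hashlib


def content_key(source: str, text: str) -> str:
    """Stable key for a chunk, derived from its source file and its text."""
    return hashlib.sha256(f"{source}\x00{text}".encode("utf-8")).hexdigest()


def plan_full_sync(existing_keys, new_chunks):
    """Decide what to insert and what to delete for a full sync.

    existing_keys: keys already stored in the vector table.
    new_chunks: (source, text) pairs produced by the current upload.
    Returns (chunks_to_insert, keys_to_delete).
    """
    existing = set(existing_keys)
    new_by_key = {content_key(s, t): (s, t) for s, t in new_chunks}
    # Chunks that are not stored yet need to be embedded and inserted.
    to_insert = [chunk for key, chunk in new_by_key.items() if key not in existing]
    # Stored keys that no longer appear in the upload are outdated.
    to_delete = sorted(existing - new_by_key.keys())
    return to_insert, to_delete


# Hypothetical example: a.pdf was removed, b.pdf is kept, c.pdf is new.
stored = [content_key("a.pdf", "x"), content_key("b.pdf", "y")]
insert, delete = plan_full_sync(stored, [("b.pdf", "y"), ("c.pdf", "z")])
print(insert)
print(len(delete))
```

With a plan like this, unchanged chunks are neither re-embedded nor duplicated, and chunks from removed files are scheduled for deletion instead of staying in the table.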
Screenshots
Flow
No response
Use Method
None
Flowise Version
3.0.5
Operating System
Windows
Browser
Chrome
Additional context
No response
Possible duplicate of #3570 and #4152.
This is an issue when Postgres is used as Vector Store. Not sure if using Postgres as RecordManager affects the issue or not.
There's a PR, #4808, that seems to be a fix for this. I have temporarily "fixed" the issue by setting up and using Supabase as the Vector Store.
@HenryHengZJ This issue has been around since Nov 2024, might be worth a look if the discussion in #3570 and the PR #4808 are of any help/are a good solution.
This matches No. 7 (memory breaks across sessions) from our 16-problem list, except that here it is in the persistence layer: embeddings are never fully removed, so the `documents` table just keeps growing and old vectors still get surfaced.
MIT-licensed project, cold start to ~600 stars in 60 days. If you want the cleanup + retention guard patterns we use to fix this class of problem, let me know and I can share.
Is the Record Manager not cleaning the old data?
@HenryHengZJ Yes, unfortunately, older records are retained when using Postgres DB, regardless of whether you use the Record Manager or not. This leads to incorrect (or outdated) retrieval results and makes Postgres DB unusable for Flowise.
We tested the fix from PR #4808 using the fork here:
https://github.com/Mewyii/Flowise/tree/feature/postgresDatabaseFix
With this branch applied, the issue is resolved for us: duplicates are no longer created.
@HenryHengZJ Is there any update on getting this PR reviewed and merged into the main branch?