[BUG] BullMQ removeOnComplete/removeOnFail count setting not working properly for prediction queue
Describe the bug
When running Flowise in queue mode with Redis, the `REMOVE_ON_COUNT` environment variable does not limit the number of completed jobs in the prediction queue. Despite setting `REMOVE_ON_COUNT=300`, the number of completed jobs (`ZCARD bull:flowise-queue-prediction:completed`) continues to grow well beyond 300, eventually causing excessive Redis memory usage that requires manual intervention.
To Reproduce
- Configure Flowise with queue mode enabled
- Set up Redis as the message broker
- Set `REMOVE_ON_COUNT=300` in the environment
- Run a high volume of prediction jobs through a chatflow
- Check Redis with `ZCARD bull:flowise-queue-prediction:completed` (see the check script below)
- Observe that the completed job count grows significantly beyond 300
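For convenience, here is a small check script equivalent to the `redis-cli` command above. This is a sketch assuming `ioredis` and BullMQ's default `bull` key prefix; adjust `REDIS_URL` and the queue name for your deployment:

```ts
// check-completed.ts: watch the size of the completed sorted set.
import Redis from 'ioredis'

// Assumption: REDIS_URL points at the same instance Flowise uses.
const redis = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379')

async function main(): Promise<void> {
    // With REMOVE_ON_COUNT=300 this value should hover around 300;
    // in the buggy setup it grows without bound.
    const count = await redis.zcard('bull:flowise-queue-prediction:completed')
    console.log(`completed jobs retained: ${count}`)
    await redis.quit()
}

main().catch((err) => {
    console.error(err)
    process.exit(1)
})
```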
Expected behavior
The `bull:flowise-queue-prediction:completed` sorted set should be automatically trimmed to maintain approximately 300 entries, as specified by the `REMOVE_ON_COUNT` environment variable. Job data associated with removed entries should also be cleaned up.
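For reference, this is the BullMQ job-option shape that produces that trimming. It is a minimal sketch, not Flowise's actual code; the queue name, connection details, and job payload are illustrative:

```ts
import { Queue } from 'bullmq'

const removeOnCount = Number(process.env.REMOVE_ON_COUNT ?? 300)

const predictionQueue = new Queue('flowise-queue-prediction', {
    connection: { host: 'localhost', port: 6379 }
})

async function addPredictionJob(): Promise<void> {
    // Keep only the most recent `count` completed/failed jobs. BullMQ
    // trims lazily as new jobs finish, so the set hovers around the
    // limit rather than being hard-capped at exactly 300.
    await predictionQueue.add(
        'prediction',
        { chatflowId: 'example-flow' }, // illustrative payload
        {
            removeOnComplete: { count: removeOnCount },
            removeOnFail: { count: removeOnCount }
        }
    )
}
```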
Flow
Any standard chatflow that generates prediction jobs will demonstrate the issue when run at volume.
Setup
- Installation: Docker containers (both app and workers)
- Flowise Version: 2.2.7-patch.1
- OS: Linux (Azure App Service)
- Redis: Azure Cache for Redis (v6.0.14)
Additional context
- Confirmed the `REMOVE_ON_COUNT` environment variable is correctly set to 300 inside the container.
- The issue specifically affects the `flowise-queue-prediction` queue.
- Code examination shows that in `BaseQueue.ts`, the `addJob` method should correctly read the `REMOVE_ON_COUNT` environment variable and apply it to both `removeOnComplete` and `removeOnFail` options as `{ count: 300 }` (see the sketch after this list).
- The problem persists even after restarting all Flowise containers.
- Currently, the only workaround is to periodically run `FLUSHDB` on the Redis instance, which is not sustainable in production.
- Memory usage in Redis grows continuously without intervention.
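If per-call options were being dropped somewhere in the add path, an equivalent safeguard is to set the limits once at queue construction via `defaultJobOptions`, so every job inherits them. Again, this is a sketch under the same assumptions, not the actual `BaseQueue.ts`:

```ts
import { Queue } from 'bullmq'

const removeOnCount = Number(process.env.REMOVE_ON_COUNT ?? 300)

// defaultJobOptions apply to every job added to this queue unless a
// per-job option overrides them.
const queue = new Queue('flowise-queue-prediction', {
    connection: { host: 'localhost', port: 6379 },
    defaultJobOptions: {
        removeOnComplete: { count: removeOnCount },
        removeOnFail: { count: removeOnCount }
    }
})
```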
Question: Could this be related to the BullMQ version's compatibility with Redis 6.0.14, since BullMQ recommends 6.2.0+? Or is there a bug in how the options are processed specifically for the prediction queue?
UPDATE:
I just realized I was looking at the latest code, not the code of the Flowise version I was actually running! `removeOnComplete` was not part of the last release, but will be part of the next one. The same is true for the environment variables `REMOVE_ON_COUNT` and `REMOVE_ON_AGE`.
Here's the git diff:
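Without reproducing the diff, a sketch of how the two variables map onto BullMQ's `KeepJobs` options (the variable names come from the update above; `age` is in seconds, `count` is a number of jobs):

```ts
import { JobsOptions } from 'bullmq'

// Jobs are removed once they exceed the age limit or once more than
// `count` of them are retained, whichever comes first.
const keep = {
    count: Number(process.env.REMOVE_ON_COUNT ?? 300),
    age: Number(process.env.REMOVE_ON_AGE ?? 24 * 60 * 60) // 1 day
}

const jobOptions: JobsOptions = {
    removeOnComplete: keep,
    removeOnFail: keep
}
```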
So this bug seems to have already been addressed for the next release, which is great news! @HenryHengZJ, would you have an ETA for the next release?
yep, ETA next week
@HenryHengZJ amazing, thanks! Regarding issue #2186, is there a plan to integrate it as well? I think it would make sense to ship both fixes together, as they both relate to queue mode.
that requires some refactoring on Redis, will continue on that thread, closing this for now