multimsg timeout!!
Hi Leo Can you please describe what circumstances produce the multimsg timeout!! messages? I have configured publisher like this and am sure the volume produced is less than 10000/2hrs:
nchan_publisher;
nchan_message_buffer_length 10000;
nchan_message_timeout 2h;
nchan_redis_idle_channel_cache_timeout 0;
but getting these messages:
2017/03/02 05:00:21 [error] 7#7: MEMSTORE:00: multimsg 00007FCE81ACB140 timeout!!
thanks
This was a bug in an older version of Nchan that I had been monitoring with that error message. It's unrelated to any of the above settings. It has been fixed in version 1.1.0.
1.1.0 (Jan. 4 2017)
feature: websocket subscribers now receive "application/octet-stream" messages
in binary frame rather than text
fix: publisher request variables not always passed to nchan_publisher_upstream_request
feature: Redis storage 'backup' mode strictly for data persistence
fix: possible lingering subscriber if connection is closed just before subscribing
--> fix: possible memory leak when using multiplexed channels <--- this is the fix
security fix: subscribing with If-Modified-Since and without If-None-Match headers
crashed the Nginx worker (thanks @supertong)
security fix: sending an empty message to multipart/mixed subscriber crashed Nginx worker
fix: publisher & subscriber response codes were logged incorrectly
fix: websocket subscriber memory leak on upstream authentication failure
fix: possible crash after reloading when using several instances of Nchan with Redis
feature: nchan_access_control_allow_origin can accept nginx variables
feature: Redis key namespaces
feature: all size configurations can now parse decimal values
fix: off-by-one subscriber count when using multiplexed channels
feature: accounting and dynamic limits for channel groups
Hi. I am running the latest release v1.1.1. (I can try 1.1.2)
I see, then this is something different. Can you show me your Nginx config? (edit out whatever you need to keep private)
Hi leo. Here is a slightly simplified version. I call an openresty subscription directive which either includes the last_event_id or not depending on client side business logic. Openresty then routes the request to the correct subscription (which you can see changes which starting point based on the first_message parameter).
location ~ /internal/publish/(?<tenant>[a-z0-9-]*)/(?<collection>\w+)$ {
internal;
nchan_publisher;
nchan_message_buffer_length 10000;
nchan_message_timeout 2h;
nchan_redis_idle_channel_cache_timeout 0;
nchan_channel_id "$tenant/$collection";
nchan_use_redis on;
}
location ~ /internal/subscribe/new/(?<tenant>[a-z0-9-]*)/user$ {
internal;
nchan_subscriber eventsource;
nchan_channel_id "$tenant/collection0" "$tenant/collection1" "$tenant/collection2";
nchan_use_redis on;
nchan_subscriber_first_message 0;
nchan_subscriber_timeout 0;
}
location ~ /internal/subscribe/resume/(?<tenant>[a-z0-9-]*)/user$ {
internal;
nchan_subscriber eventsource;
nchan_channel_id "$tenant/collection0" "$tenant/collection1" "$tenant/collection2";
nchan_use_redis on;
nchan_subscriber_first_message "oldest";
nchan_subscriber_last_message_id $arg_last_event_id;
nchan_subscriber_timeout 0;
}
I am getting memory issues with this config.
Thanks for the config. I'll be looking into this. What do you mean by "memory issues"? Also, how often are you getting the "multimsg timeout!" error?
Sorry, the memory issue being the MEMSTORE issues.
Seems to arise after about 24hrs. ill keep an eye on timings for you.
Hi @slact ,
I'm getting that error as well. I'm running nchan on a sandbox environment, only 3 person accessing it. NCHAN returns 504 for every /sub sometimes, but not always. NCHAN is not delivering the messages and in the log file i just see 2017/03/10 14:35:55 [error] 12#0: MEMSTORE:03: multimsg 00007FEC8ED31A70 timeout!!.
The link to the nchan's error log: https://gist.github.com/michaeltcoelho/4cec53164b7d2118bb2d22a6df1bf4c2
My nginx-nchan configurations files: https://gist.github.com/michaeltcoelho/327257df15413ec41508986ad76a1724
The link to the nginx's error log: https://gist.github.com/michaeltcoelho/b8541d9392ba12ce3fa956ebf1d16839
That's another error i got: 2017/03/10 16:55:23 [error] 6#0: SUB:LONGPOLL:abort! unknown error
I'm running the latest version 1.1.1
Here is my setup: https://github.com/michaeltcoelho/docker-nchan/blob/master/Dockerfile
Hi, anything about that?
@michaeltcoelho : The issue you're experiencing is different, you have a worker crash which triggers those "multimsg timeout!!" errors. The real problem is
nginx: worker process: /nchan-1.1.1/src/util/nchan_output.c:375: msgtag_to_strptr: Assertion `t[i] >= -2' failed.
I'm looking into both, but I've opened a separate ticket for yours: #322.
I have the same problem, seeing:
2018/04/16 19:19:23 [error] 15#15: MEMSTORE:00: multimsg callback #3 for 0000557563B946C0 received after expiring at 1523906363 status 1 msg 0000000000000000
2018/04/16 19:19:23 [error] 15#15: MEMSTORE:00: multimsg 0000557563B946C0 timeout!!
2018/04/16 19:40:39 [error] 13#13: MEMSTORE:00: multimsg 000055BF26783D60 timeout!!
2018/04/16 19:40:39 [error] 13#13: MEMSTORE:00: multimsg callback #3 for 000055BF26783D60 received after expiring at 1523907639 status 1 msg 0000000000000000
....
from time time to time.
We don't have a lot of traffic. Here's current status just after multimsg errors:
total published messages: 3029
stored messages: 136
shared memory used: 108K
shared memory limit: 131072K
channels: 107
subscribers: 12
redis pending commands: 0
redis connected servers: 3
total interprocess alerts received: 628053
interprocess alerts in transit: 0
interprocess queued alerts: 0
total interprocess send delay: 0
total interprocess receive delay: 0
nchan version: 1.1.14
@indrekj these have been fixed in master with d3a65572046562cd0d92af455ff2ec7eaf466cea
I'm still having this or a similar issue. After a few hours, with max a few messages per second and 4-5 subscribers over EventSource. Is there anything I can help with to track this down? Theese MEMSTORE timout messages and then a lot of those " open socket #6 left in connection 35" type ones.
ker process: /build/nginx-KTLRnK/nginx-1.18.0/debian/modules/nchan/src/store/memory/memstore.c:2644: msg_release: Assertion `msg->refcount > 0' failed.
2021/10/07 14:36:26 [alert] 1200#1200: worker process 1201 exited on signal 6 (core dumped)
2021/10/07 14:36:49 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B3179F6C0 timeout!!
2021/10/07 14:36:49 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B3179D6E0 timeout!!
2021/10/07 14:36:50 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B317A4EC0 timeout!!
2021/10/07 14:36:50 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B317B7B90 timeout!!
2021/10/07 14:36:51 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B317B6C80 timeout!!
2021/10/07 14:37:55 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B317679B0 timeout!!
2021/10/07 14:49:03 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B31875260 timeout!!
2021/10/07 15:05:27 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B318F8830 timeout!!
2021/10/07 15:16:04 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B319EB1F0 timeout!!
2021/10/07 15:18:54 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B31A5A290 timeout!!
2021/10/07 15:22:53 [error] 1202#1202: MEMSTORE:01: multimsg 0000556B31A7C080 timeout!!
2021/10/07 15:40:42 [error] 1202#1202: MEMSTORE:01: tried adding WAITING chanhead 0000556B31874B10 /1qHBjWwe7 to chanhead_gc. why?
2021/10/07 15:47:17 [error] 1202#1202: MEMSTORE:01: tried adding WAITING chanhead 0000556B317BAB30 /TIJIjVicf to chanhead_gc. why?
2021/10/07 15:52:24 [alert] 1202#1202: *18705 open socket #6 left in connection 35
2021/10/07 15:52:24 [alert] 1202#1202: *18706 open socket #107 left in connection 36
2021/10/07 15:52:24 [alert] 1202#1202: *18959 open socket #222 left in connection 37