
Subject: libwebsockets v4.3.3 – handling partial writes from lws_write()

Open pppaulpeter opened this issue 7 months ago • 5 comments

Hi Andy,

I’m using libwebsockets v4.3.3 in a simple server that, inside my LWS_CALLBACK_SERVER_WRITEABLE handler, calls

int m = lws_write(wsi, buf, len, LWS_WRITE_BINARY);
if (m < (int)len) {
    lwsl_err("ERROR %d writing to ws socket\n", m);
    return -1;
}

Sometimes lws_write() returns zero or a positive value m < len, indicating only part of my buffer got sent; your examples tend to treat any m < len as an "error" (returning -1) and drop the rest.

Is it expected that lws_write() can return a partial-write count (0 < m < len)?

If I want to guarantee the entire buf[0..len) is delivered, what’s the recommended pattern?

Should I queue the unsent tail (buf + m, len – m) and retry in the next LWS_CALLBACK_SERVER_WRITEABLE?

Or is there a flag I’m missing to force a blocking/complete write?

Thanks for any guidance!

pppaulpeter avatar Jun 13 '25 07:06 pppaulpeter

You don't have to do anything. lws will allocate a side buffer with the unsent part copied to it automatically, and swallow WRITEABLE callbacks internally to use them to work through the side buffer. When it's done, it will free the side buffer and your callback will start seeing WRITEABLE again so it can go on.

Partial sends are a POSIX thing... you get informed you can write more via POLLOUT type indication on the socket, but you are not told how much until after you did the write() / send(), so you can't always "write the right amount". It seems if you get a POLLOUT on a tcp socket, there is always at least one MTU of space, so for applications where you're not sending much, you can more or less eliminate them by only trying to write ~ 1 MTU (typ 1400 bytes).
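
To sketch the simple pattern (illustrative only, not a definitive implementation: pss->buf / pss->len / pss->sent are hypothetical per-session state, and the payload is assumed to have LWS_PRE bytes of headroom before it):

case LWS_CALLBACK_SERVER_WRITEABLE: {
	size_t remaining = pss->len - pss->sent;
	size_t chunk = remaining > 1400 ? 1400 : remaining;
	int flags, m;

	if (!chunk)
		break; /* nothing pending; just return to the event loop */

	flags = lws_write_ws_flags(LWS_WRITE_BINARY,
				   pss->sent == 0,                 /* first fragment? */
				   pss->sent + chunk == pss->len); /* final fragment? */

	m = lws_write(wsi, pss->buf + LWS_PRE + pss->sent, chunk, flags);
	if (m < 0)
		return -1; /* fatal; the connection will be closed */

	/* lws copies any unsent tail to its own side buffer, so the
	 * whole chunk counts as consumed from our point of view */
	pss->sent += chunk;
	if (pss->sent < pss->len)
		lws_callback_on_writable(wsi); /* come back for the next chunk */
	break;
}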

lws-team avatar Jun 13 '25 07:06 lws-team

You don't have to do anything. lws will allocate a side buffer with the unsent part copied to it automatically, and swallow WRITEABLE callbacks internally to use them to work through the side buffer. When it's done, it will free the side buffer and your callback will start seeing WRITEABLE again so it can go on.

Partial sends are a POSIX thing... you get informed you can write more via POLLOUT type indication on the socket, but you are not told how much until after you did the write() / send(), so you can't always "write the right amount". It seems if you get a POLLOUT on a tcp socket, there is always at least one MTU of space, so for applications where you're not sending much, you can more or less eliminate them by only trying to write ~ 1 MTU (typ 1400 bytes).

Thanks for your quick reply. Is the following demo code OK? Is it acceptable to call lws_write() in a for loop in the LWS_CALLBACK_SERVER_WRITEABLE case, as shown in the demo code below?

case LWS_CALLBACK_SERVER_WRITEABLE:
		if (pthread_mutex_trylock(&user_data->playback_data_mutex) == 0)
		{
			//schedule the playback data sending 
			if((!user_data->bCloseConn)&& (user_data->playback_data_ready)&&(user_data->send_len>0) && (user_data->playback_buff))
			{
				UINT32 total_len = user_data->send_len; // Total length of the data to send
				user_data->sent_len = 0;
				UINT32 offset = 0;
				int flags = LWS_WRITE_BINARY;

				// For loop to send all data in chunks of 1400 bytes
				for (offset = 0; offset < total_len; ) 
				{	
					// at most one MTU of bytes is sent out each time
					UINT32 len_to_send = DEFAULT_MTU_LEN;
				
					if (total_len - offset < DEFAULT_MTU_LEN) 
					{
						len_to_send = total_len - offset;
					}
					flags = lws_write_ws_flags(LWS_WRITE_BINARY, offset == 0, len_to_send < DEFAULT_MTU_LEN);
	
					ret = lws_write(wsi, user_data->playback_buff + LWS_PRE + offset, len_to_send, flags);
					if (ret < 0)
					{
						WS_DBG(RT_ERROR, "why libwebsocket_write fail[%s], %d, sock=%d\n", strerror(errno), ret, user_data->current_fd);
						if (errno == EAGAIN) 
						{
							
							struct timespec t_time;
							t_time.tv_sec = 0;
							t_time.tv_nsec = 100000; // back off ~100 us before retrying
							nanosleep(&t_time,0);
							continue;
						}
						else
						{
							user_data->bCloseConn = TRUE;
							break;
						}
					}
					else 
					{
						if (ret < len_to_send)
						{
							WS_DBG(RT_ERROR, "why libwebsocket_write partially sent data[%s], %d, sock=%d\n", strerror(errno), ret, user_data->current_fd);
							// lws buffers the unsent tail internally, so the whole chunk is counted as consumed below
							lwsl_err("ERROR %d / %d writing ws\n", ret, len_to_send);
						}
						offset += len_to_send;
						user_data->sent_len = offset;

					}
				}
				WS_DBG(DEBUG_INFO, "libwebsocket_write len [%d], send len=%d, sock=%d\n",offset, user_data->send_len, user_data->current_fd);
				user_data->playback_data_ready = 0;
				user_data->playback_buff = NULL;
				user_data->send_len = 0;
				pthread_cond_signal(&user_data->playback_data_ready_cond);// Signal video thread to read more data
				pthread_mutex_unlock(&user_data->playback_data_mutex);
				lws_callback_on_writable(wsi);
			}
			else
			{
				pthread_mutex_unlock(&user_data->playback_data_mutex);
				if(user_data->speed > 1)
				{
					lws_set_timer_usecs(wsi, 5 * LWS_MS_PER_SEC);
				}
				else 
				{
					lws_set_timer_usecs(wsi, 10 * LWS_MS_PER_SEC);
				}
			}
		}
		else
		{
			if(user_data->speed > 1)
			{
				lws_set_timer_usecs(wsi, 5 * LWS_MS_PER_SEC);
			}
			else 
			{
				lws_set_timer_usecs(wsi, 10 * LWS_MS_PER_SEC);
			}
		}

pppaulpeter avatar Jun 13 '25 07:06 pppaulpeter

  1. You should not sleep on the event loop, it just makes everything worse. If there is more to send, you can just call lws_callback_on_writable() and you will come back when you can write something (because it hears from poll() or whatever that it can POLLOUT on that socket). If you get a partial send it's telling you to "come back later", not to sleep. The async event loop will cheaply and cleanly come back when you can write something on the wsi just by calling lws_callback_on_writable().

  2. "Lws" is not partially sending, it is fully sending whatever you asked for to the kernel side. The kernel is saying, "I can't deal with all that" and telling you how much it actually used. So this is not an "lws thing", it's a POSIX thing. Typically the reason the kernel can't take more is the tcp window is currently full, every time you receive tcp ACKs from the peer, the kernel makes some more space in the tcp window that you can use to send stuff. So it will be able to send things later as the ACKs come.

  3. It's expensive to copy large things to kernel side every time only to have most of it ignored.

  4. It's not a fatal problem if occasionally there are partial sends, lws will conceal that it has happened

  5. The timer api calculations are all wrong, it's in usec. So you would use LWS_US_PER_MS or LWS_US_PER_SEC (see the sketch after this list)

  6. There's normally no need for sleeps or timers. If there is stuff to write, write some, if there is nothing to write, just return. If there is more to write, call lws_callback_on_writable() and return, it will come back when you can write some more. If the window is full, or the connection slow, it may be some time passes before the next WRITEABLE, it's fine since the wait (poll() or whatever) was sleeping for the duration or getting on with processing other events on the event loop.
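
As a concrete illustration of point 5 (again a sketch, using the LWS_US_PER_MS / LWS_US_PER_SEC scaling constants):

/* lws_set_timer_usecs() takes an interval in microseconds */
lws_set_timer_usecs(wsi, 10 * LWS_US_PER_MS); /* fire in 10 ms */
lws_set_timer_usecs(wsi, 5 * LWS_US_PER_SEC); /* fire in 5 s */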

lws-team avatar Jun 13 '25 08:06 lws-team

  1. You should not sleep on the event loop, it just makes everything worse. If there is more to send, you can just call lws_callback_on_writable() and you will come back when you can write something (because it hears from poll() or whatever that it can POLLOUT on that socket). If you get a partial send it's telling you to "come back later", not to sleep. The async event loop will cheaply and cleanly come back when you can write something on the wsi just by calling lws_callback_on_writable().
  2. "Lws" is not partially sending, it is fully sending whatever you asked for to the kernel side. The kernel is saying, "I can't deal with all that" and telling you how much it actually used. So this is not an "lws thing", it's a POSIX thing. Typically the reason the kernel can't take more is the tcp window is currently full, every time you receive tcp ACKs from the peer, the kernel makes some more space in the tcp window that you can use to send stuff. So it will be able to send things later as the ACKs come.
  3. It's expensive to copy large things to kernel side every time only to have most of it ignored.
  4. It's not a fatal problem if occasionally there are partial sends, lws will conceal that it has happened
  5. The timer api calculations are all wrong, it's in usec. So you would use LWS_US_PER_MS or LWS_US_PER_SEC
  6. There's normally no need for sleeps or timers. If there is stuff to write, write some, if there is nothing to write, just return. If there is more to write, call lws_callback_on_writable() and return, it will come back when you can write some more. If the window is full, or the connection slow, it may be some time passes before the next WRITEABLE, it's fine since the wait (poll() or whatever) was sleeping for the duration or getting on with processing other events on the event loop.

Thanks for your guidance! Just one more question, about point 6: if I keep calling the lws_callback_on_writable() API in the ws_callback case LWS_CALLBACK_SERVER_WRITEABLE, will that make the CPU occupancy very high? And if so, how can I prevent it?

pppaulpeter avatar Jun 16 '25 08:06 pppaulpeter

If I keep calling the lws_callback_on_writable() API in the ws_callback case LWS_CALLBACK_SERVER_WRITEABLE, will that make the CPU occupancy very high?

Are you seeing high CPU usage? Just say that then... if you look with tcpdump or so, what size data is being sent out? Some of the examples restrict the max send size; if you copied that, then you will never be able to saturate the connection to the peer.

lws-team avatar Jun 16 '25 09:06 lws-team

I just tested the CPU usage; this is the demo code @lws-team

            if (playbackCtx.isPlaying && playbackCtx.file_fd > 0) 
            {
              
                // Determine the file size
                if(pthread_mutex_trylock(&playbackCtx.mutex)==0)
                {
                    unsigned int total_len = playbackCtx.data_len;
                    unsigned int offset = playbackCtx.sent_len;
                    unsigned int len_to_send = 1400;
                    if(total_len - offset < 1400)
                    {
                        len_to_send = total_len - offset;
                    }

                    if (playbackCtx.data_ready != 0 && playbackCtx.data_len > 0)
                    {   
                        int written = lws_write(wsi, playbackCtx.frame_buffer+LWS_PRE+offset, len_to_send, LWS_WRITE_BINARY);
                        if (written < 0) 
                        {
                            printf("Error: lws_write wrote only %d bytes out of %d\n", written, playbackCtx.data_len);
                            playbackCtx.isPlaying = 0;
                            pthread_mutex_unlock(&playbackCtx.mutex);
                            return -1;
                        } 
                        else 
                        {
                            
                            offset += len_to_send;
                            playbackCtx.sent_len = offset;
                        }
                        //printf("Successfully wrote %d bytes of video data to WebSocket,offset=%d\n", written,offset);
                        if( playbackCtx.sent_len == total_len)
                        {
                            playbackCtx.data_ready = 0;
                            playbackCtx.sent_len = 0;
                            playbackCtx.data_len = 0;
                            pthread_cond_signal(&playbackCtx.cond); // Signal that a packet has been added
                        }
                        pthread_mutex_unlock(&playbackCtx.mutex);
                        lws_callback_on_writable(wsi);  // Request another writable callback
                    
                        
                    }
                    else
                    {
                        pthread_mutex_unlock(&playbackCtx.mutex);
                        //usleep(5000);
                        lws_callback_on_writable(wsi);  // Request another writable callback
                    
                    }
                
                    
                }
                else
                {
                    printf("3------can't get the lock playback,playbackCtx.data_ready=%d\n",playbackCtx.data_ready);
                    lws_set_timer_usecs(wsi,10*LWS_US_PER_MS);
                }
                
            }

The CPU occupancy was very high, as shown in the screenshot below, if I directly call lws_callback_on_writable() when there is no data in the shared buffer. Is there a good way to wait when there is no data yet? (There will be data later; the data is prepared in another thread, so I can't just call lws_callback_on_writable() from there as you suggested before.) If I uncomment the usleep(5000), the CPU occupancy decreases a lot. I know I should not sleep in the non-blocking callback, but I am not sure what the better way to do it is.

[screenshot: CPU usage]

pppaulpeter avatar Jun 18 '25 06:06 pppaulpeter

What size packets are actually being sent? Please have a look with wireshark / tcpdump.

Use openssl rather than, eg, mbedtls (which is very slow)

Try building with Release mode at cmake.

lws-team avatar Jun 18 '25 07:06 lws-team

What size packets are actually being sent? Please have a look with wireshark / tcpdump.

Use openssl rather than, eg, mbedtls (which is very slow)

Try building with Release mode at cmake.

Just plain TCP; this is the Wireshark data:

[screenshot: Wireshark capture]

pppaulpeter avatar Jun 18 '25 07:06 pppaulpeter

Yeah but what is the size of the ws packets? If it is around 1400 you can try larger sends, eg 14000 to see what happens.

lws-team avatar Jun 18 '25 07:06 lws-team

What size packets are actually being sent? Please have a look with wireshark / tcpdump.

Use openssl rather than, eg, mbedtls (which is very slow)

Try building with Release mode at cmake.

Is Release a must? So the default mode is debug? These are the conanfile default options I plan to add:

[screenshots: conanfile options]

pppaulpeter avatar Jun 18 '25 07:06 pppaulpeter

Release mode is not a "must", but it removes various things (like verbose logs) from the build and is more efficient.

I have no idea what your python is about. CMake lets you control the build mode with -DCMAKE_BUILD_TYPE=Release or similar.

lws-team avatar Jun 18 '25 07:06 lws-team

Release mode is not a "must", but it removes various things (like verbose logs) from the build and is more efficient.

I have no idea what your python is about. CMake lets you control the build mode with -DCMAKE_BUILD_TYPE=Release or similar.

It is conanfile.py; this is the link: https://github.com/conan-io/conan-center-index/blob/master/recipes/libwebsockets/all/conanfile.py

pppaulpeter avatar Jun 18 '25 07:06 pppaulpeter

Yeah but what is the size of the ws packets? If it is around 1400 you can try larger sends, eg 14000 to see what happens.

It is the same for CPU usage. Probably I should try this for synchronization and data sending instead of using usleep() in the callback: https://github.com/warmcat/libwebsockets/tree/main/lib/misc/threadpool

pppaulpeter avatar Jun 18 '25 07:06 pppaulpeter

I have no idea where your data comes from. Generally for live video, people are using UDP, not tcp / ws. If the problem is you're spinning with a lot of data to send, doing it in threadpool won't help that. It would help if you were blocking the event loop with some compute-heavy action on the same thread, but it doesn't sound like you have that problem.

Could you perhaps answer the question about the size of packets you are emitting + try larger sends?

lws-team avatar Jun 18 '25 07:06 lws-team

If you are doing something cpu-intensive in another thread, that will all be accounted for in your lws process. Perhaps your sleep "improves" cpu usage by throttling the cpu intensive thing that is nothing to do with lws, because you're not clearing the ringbuffer any more.

lws-team avatar Jun 18 '25 07:06 lws-team

I have no idea where your data comes from. Generally for live video, people are using UDP, not tcp / ws. If the problem is you're spinning with a lot of data to send, doing it in threadpool won't help that. It would help if you were blocking the event loop with some compute-heavy action on the same thread, but it doesn't sound like you have that problem.

Could you perhaps answer the question about the size of packets you are emitting + try larger sends?

It is 1400, as shown here: "unsigned int len_to_send = 1400;". Now I have updated it to 14000 bytes, and the CPU usage is the same.

                    else
                    {
                        pthread_mutex_unlock(&playbackCtx.mutex);
                        //usleep(5000);
                        lws_callback_on_writable(wsi);  // Request another writable callback
                    
                    }

There is not too much data to send; the high CPU usage is because of the code above (a snippet from the full code that follows). I don't know what to do when there is no data to send.

            if (playbackCtx.isPlaying && playbackCtx.file_fd > 0) 
            {
              
                // Determine the file size
                if(pthread_mutex_trylock(&playbackCtx.mutex)==0)
                {
                    unsigned int total_len = playbackCtx.data_len;
                    unsigned int offset = playbackCtx.sent_len;
                    unsigned int len_to_send = 1400;
                    if(total_len - offset < 1400)
                    {
                        len_to_send = total_len - offset;
                    }

                    if (playbackCtx.data_ready != 0 && playbackCtx.data_len > 0)
                    {   
                        int written = lws_write(wsi, playbackCtx.frame_buffer+LWS_PRE+offset, len_to_send, LWS_WRITE_BINARY);
                        if (written < 0) 
                        {
                            printf("Error: lws_write wrote only %d bytes out of %d\n", written, playbackCtx.data_len);
                            playbackCtx.isPlaying = 0;
                            pthread_mutex_unlock(&playbackCtx.mutex);
                            return -1;
                        } 
                        else 
                        {
                            
                            offset += len_to_send;
                            playbackCtx.sent_len = offset;
                        }
                        //printf("Successfully wrote %d bytes of video data to WebSocket,offset=%d\n", written,offset);
                        if( playbackCtx.sent_len == total_len)
                        {
                            playbackCtx.data_ready = 0;
                            playbackCtx.sent_len = 0;
                            playbackCtx.data_len = 0;
                            pthread_cond_signal(&playbackCtx.cond); // Signal that a packet has been added
                        }
                        pthread_mutex_unlock(&playbackCtx.mutex);
                        lws_callback_on_writable(wsi);  // Request another writable callback
                    
                        
                    }
                    else
                    {
                        pthread_mutex_unlock(&playbackCtx.mutex);
                        //usleep(5000);
                        lws_callback_on_writable(wsi);  // Request another writable callback
                    
                    }
                
                    
                }
                else
                {
                    printf("3------can't get the lock playback,playbackCtx.data_ready=%d\n",playbackCtx.data_ready);
                    lws_set_timer_usecs(wsi,10*LWS_US_PER_MS);
                }
                
            }

pppaulpeter avatar Jun 18 '25 07:06 pppaulpeter

If you run out of data in the send ringbuffer, you just return with 0 without calling lws_callback_on_writable(). You will just sleep on the event wait at 0% CPU from lws.

If some other thread produces new data afterwards, it should call lws_cancel_service() from the other thread. This will cause a callback on your protocol for LWS_CALLBACK_EVENT_WAIT_CANCELLED IN THE LWS EVENT LOOP THREAD CONTEXT. In your handler for this, if you see you have data to send in the ringbuffer, you can call lws_callback_on_writable() then to resume sending stuff.
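
A minimal sketch of that handshake (the shared-state names are hypothetical, and protocols[0] is assumed to be your ws protocol entry):

/* producer thread: publish data under the lock, then wake the event loop */
pthread_mutex_lock(&shared.lock);
shared.have_data = 1;
pthread_mutex_unlock(&shared.lock);
lws_cancel_service(context); /* context saved from lws_create_context() */

/* in the protocol callback, running on the lws event loop thread: */
case LWS_CALLBACK_EVENT_WAIT_CANCELLED:
	pthread_mutex_lock(&shared.lock);
	if (shared.have_data)
		/* resume sending on every live connection of this protocol */
		lws_callback_on_writable_all_protocol(lws_get_context(wsi),
						      &protocols[0]);
	pthread_mutex_unlock(&shared.lock);
	break;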

lws-team avatar Jun 18 '25 08:06 lws-team

If you run out of data in the send ringbuffer, you just return with 0 without calling lws_callback_on_writable(). You will just sleep on the event wait at 0% CPU from lws.

If some other thread produces new data afterwards, it should call lws_cancel_service() from the other thread. This will cause a callback on your protocol for LWS_CALLBACK_EVENT_WAIT_CANCELLED IN THE LWS EVENT LOOP THREAD CONTEXT. In your handler for this, if you see you have data to send in the ringbuffer, you can call lws_callback_on_writable() then to resume sending stuff.

It is not easy to use lws_cancel_service() to do the synchronization. It can't be known which thread or connection called the lws_cancel_service() API, right? Only the thread calling lws_cancel_service() can broadcast the message to the ws callback thread; how could the ws callback thread notify the other thread that the sending is done?

pppaulpeter avatar Jun 19 '25 12:06 pppaulpeter

"not easy"... it's very easy isn't it? The other threads can signal to the lws service loop thread that something happened. And you don't have to get involved with how that works. You just "wake up" in the lws service loop thread context, you can take the mutex if you need to worry about which thread wants to do what and look at the shared data.

Or you can call lws_callback_on_writable() from EVENT_WAIT_CANCELLED, and take the mutex and study what exactly needs doing in the WRITEABLE callback.

how could the ws callback thread notify the other thread that the sending is done?

If your other threads want events when shared things change, you can use sync objects like semaphores (along with changes to shared memory areas) to let them know immediately things have happened. On the lws service loop thread context, you will be in WRITEABLE callback calling lws_write to get things sent, it's from there you can send other sync object signals or lock and changed shared memory areas.

lws-team avatar Jun 19 '25 12:06 lws-team

"not easy"... it's very easy isn't it? The other threads can signal to the lws service loop thread that something happened. And you don't have to get involved with how that works. You just "wake up" in the lws service loop thread context, you can take the mutex if you need to worry about which thread wants to do what and look at the shared data.

Or you can call lws_callback_on_writable() from EVENT_WAIT_CANCELLED, and take the mutex and study what exactly needs doing in the WRITEABLE callback.

how could the ws callback thread notify the other thread that the sending is done?

If your other threads want events when shared things change, you can use sync objects like semaphores (along with changes to shared memory areas) to let them know immediately things have happened. On the lws service loop thread context, you will be in WRITEABLE callback calling lws_write to get things sent, it's from there you can send other sync object signals or lock and changed shared memory areas.

Thanks for your patience. Sorry, I am a little confused. This is the code and print log:

[screenshot: code and log output]

May I know why there are two LWS_CALLBACK_EVENT_WAIT_CANCELLED prints for one lws_cancel_service() call? And how do I get the connection user data in the LWS_CALLBACK_EVENT_WAIT_CANCELLED case? According to the log, user is NULL each time.

pppaulpeter avatar Jun 19 '25 13:06 pppaulpeter

The callback for cancel service operates at the event loop level, it basically forcibly wakes the event loop thread. There is no wsi associated with that event.

If you get one or two or 100 LWS_CALLBACK_EVENT_WAIT_CANCELLED, you need to lock the shared memory area in the callback for it, and find out what the other threads want. If nothing to do, just return 0.

If you want to have wsi pointers available at the callback for cancel service, you need to keep your own list of them, add them at the ESTABLISHED callback (assuming it's ws) and remove them at the CLOSED callback. Then (depending on what you find left by the other thread in the shared memory area) you can walk through the live wsi list and ask for the writable() on them.
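
A sketch of such a list (the fixed-size table and mutex are hypothetical; ESTABLISHED and CLOSED both run on the lws event loop thread):

#define MAX_CONNS 16

static struct lws      *live_wsi[MAX_CONNS];
static pthread_mutex_t  live_lock = PTHREAD_MUTEX_INITIALIZER;

/* in the protocol callback: */
case LWS_CALLBACK_ESTABLISHED:
	pthread_mutex_lock(&live_lock);
	for (int i = 0; i < MAX_CONNS; i++)
		if (!live_wsi[i]) {
			live_wsi[i] = wsi;
			break;
		}
	pthread_mutex_unlock(&live_lock);
	break;

case LWS_CALLBACK_CLOSED:
	pthread_mutex_lock(&live_lock);
	for (int i = 0; i < MAX_CONNS; i++)
		if (live_wsi[i] == wsi)
			live_wsi[i] = NULL;
	pthread_mutex_unlock(&live_lock);
	break;

case LWS_CALLBACK_EVENT_WAIT_CANCELLED:
	/* walk the live list and ask for WRITEABLE where needed */
	pthread_mutex_lock(&live_lock);
	for (int i = 0; i < MAX_CONNS; i++)
		if (live_wsi[i])
			lws_callback_on_writable(live_wsi[i]);
	pthread_mutex_unlock(&live_lock);
	break;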

lws-team avatar Jun 19 '25 13:06 lws-team

The callback for cancel service operates at the event loop level, it basically forcibly wakes the event loop thread. There is no wsi associated with that event.

If you get one or two or 100 LWS_CALLBACK_EVENT_WAIT_CANCELLED, you need to lock the shared memory area in the callback for it, and find out what the other threads want. If nothing to do, just return 0.

If you want to have wsi pointers available at the callback for cancel service, you need to keep your own list of them, add them at the ESTABLISHED callback (assuming it's ws) and remove them at the CLOSED callback. Then (depending on what you find left by the other thread in the shared memory area) you can walk through the live wsi list and ask for the writable() on them.

Hi @Andy,

If the playback thread (the data-preparing thread) also uses the following user data (per_session_data_t):

demo code

typedef struct {
    char *some_buf;
} per_session_data_t;

static int
ws_callback(struct lws *wsi, enum lws_callback_reasons reason,
            void *user, void *in, size_t len)
{
    per_session_data_t *pss = (per_session_data_t *)user;

    /* ... use pss according to reason ... */
    return 0;
}

May I know when the ws thread frees per_session_data_t after the websocket connection is closed? Probably I should not use per_session_data_t in any thread other than the ws_callback thread, correct? Since from the ws thread I can't easily notify the playback thread to exit.

pppaulpeter avatar Jul 07 '25 08:07 pppaulpeter

You can share data between the lws event loop thread and other threads, but you must use a mutex or similar to lock the data while it is being accessed by any thread, or is being deleted (in the _CLOSE callback). And at any time you acquire the mutex, you must check first if the wsi whose data you want to access is still listed as existing, since it may be destroyed at any time around the lws event loop (the _CLOSE callback for the wsi will be called first).

since in ws thread i can't notify playback thread exit easily.

You can use a semaphore or similar to alert other threads to events from the lws thread.
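
For example, the playback thread might only touch per-session data while holding the mutex and after confirming the wsi is still on a live list (wsi_is_live(), queue_next_frame() and stop_playback() are hypothetical helpers over such a list):

/* playback thread: my_wsi / my_pss were recorded at ESTABLISHED time */
pthread_mutex_lock(&live_lock);
if (wsi_is_live(my_wsi))          /* e.g. scan the live_wsi[] table */
	queue_next_frame(my_pss); /* safe: the _CLOSE handler also takes live_lock */
else
	stop_playback();          /* connection already gone; stop producing */
pthread_mutex_unlock(&live_lock);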

lws-team avatar Jul 07 '25 08:07 lws-team

pthread_mutex_lock(&user_data->playback_data_mutex);
user_data->playback_data_ready = 0;
pthread_cond_signal(&user_data->playback_data_ready_cond); // signal video thread to exit
WS_DBG(RT_ERROR, "playback stop----\n");
pthread_mutex_unlock(&user_data->playback_data_mutex);

Hi @Andy,

Is this kind of code in the _CLOSE callback safe enough to notify the playback thread to exit?

pppaulpeter avatar Jul 07 '25 08:07 pppaulpeter

Mostly but:

  • if you can have multiple connections you will have to keep a list of live ones, which you can access under the mutex. This is because at any time the connection can be closed by an intermediary, or the peer, without caring about whatever you might have associated with it on another thread. You can ensure, in the _CLOSE handling, that you will have taken the mutex and removed this connection from your live list, before another thread can take the mutex and try to touch it again.

  • the other threads who want to touch this also have to acquire the mutex and confirm that the wsi they were associated with is still listed as alive, before touching any data that is / used to be associated with the wsi.

You are having to deal with the fact that - however you do it - the lifecycle of the socket + wsi + pss is different than the lifecycle of any processes running in another thread you may have associated with it.

lws-team avatar Jul 07 '25 08:07 lws-team

Mostly but:

  • if you can have multiple connections you will have to keep a list of live ones, which you can access under the mutex. This is because at any time the connection can be closed by an intermediary, or the peer, without caring about whatever you might have associated with it on another thread. You can ensure, in the _CLOSE handling, that you will have taken the mutex and removed this connection from your live list, before another thread can take the mutex and try to touch it again.
  • the other threads who want to touch this also have to acquire the mutex and confirm that the wsi they were associated with is still listed as alive, before touching any data that is / used to be associated with the wsi.

You are having to deal with the fact that - however you do it - the lifecycle of the socket + wsi + pss is different than the lifecycle of any processes running in another thread you may have associated with it.

Hi Andy @lws-team, OK. I think the reason my program hasn't crashed is probably not because lws frees it too slowly, but rather because the peer never actively closes the connection. :)

pppaulpeter avatar Jul 07 '25 09:07 pppaulpeter

Sure, until... it does. Eg, the box goes down to reboot or crashes, an intermediate router reboots, internet goes out.

lws-team avatar Jul 07 '25 09:07 lws-team

Sure, until... it does. Eg, the box goes down to reboot or crashes, an intermediate router reboots, internet goes out.

Hi Andy,@lws-team

Thanks for your prompt response. I could reference this example, minimal-ws-server-threads, correct?

pppaulpeter avatar Jul 07 '25 12:07 pppaulpeter

Generally, it's useful to understand how the lws_cancel_service() is working and how you can lock things. But in that example, every client that joins sees a "shared world", since the send ringbuffer is stored in the vhd.

If you have a different goal where things are stored by the other threads in the pss, your code will look different as described above, the threads which in that example are spamming to the vhd (which lives as long as the vhost) will instead need to lock the pss (which lives as long as the client socket / wsi). And then they need to take care as described above that the wsi the pss belongs to is still connected.

lws-team avatar Jul 07 '25 12:07 lws-team