Disable sending Observe de-register on session clear

Open fun-works opened this issue 3 years ago • 14 comments

Environment

  • libcoap version (run git describe --tags to find it): 4.3.0

    // v4.3.0-rc3-41-g25fe796

  • Build System: CMake

  • Operating System: [Windows|Linux|macOS|FreeBSD|Cygwin|Solaris|RIOT|Other (which?)] Linux

  • Operating System Version: [ ]

  • Hosted Environment: [None|Contiki|LwIP|ESP-IDF|Other (which?)] None

Problem Description

I am developing a client that communicates with many Thread devices installed on a site. The client can temporarily drop off the network at any time. We rely on Observe relationships that are persistent: the devices store the client information and keep sending notifications to it. When the client goes offline and comes back, it should receive notifications again without having to re-register.

The problem is that when the client closes, all active sessions are cleared, which results in de-register messages being sent. These messages cause the devices to clear the observation on their end, which makes the above use case impossible.

Do you have any suggestions for achieving this? How can I disable the de-register messages in libcoap?

fun-works avatar Jul 29 '22 04:07 fun-works

Thanks for bringing this up. May I kindly ask you to clean up the issue description a bit by removing all the issue template text that is not needed, especially the part that is clearly marked as "Delete Below ... Delete Above".

obgm avatar Jul 29 '22 07:07 obgm

You have a design challenge here. RFC 7641, Section 4.5, states:

   A server that transmits notifications mostly in non-confirmable
   messages MUST send a notification in a confirmable message instead of
   a non-confirmable message at least every 24 hours.  This prevents a
   client that went away or is no longer interested from remaining in
   the list of observers indefinitely.

A Confirmable (CON) notification sent while the client is away will go unacknowledged, which will cause the server to de-register the Observe. When the client starts up again, it will not receive notifications, as the server is no longer sending them. This will happen even if the client does not de-register on shutdown.
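The 24-hour rule quoted above can be sketched as a small decision function. This is only an illustration, not libcoap code: the names `notify_state_t` and `next_notification_type` are hypothetical, and time is assumed to be a plain seconds counter supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 7641 Section 4.5: at least one confirmable notification per 24 hours. */
#define CON_INTERVAL_SECS (24u * 60u * 60u)

typedef enum { MSG_CON, MSG_NON } msg_type_t;

/* Hypothetical per-observer state kept by a server. */
typedef struct {
  uint64_t last_con_secs; /* time the last confirmable notification was sent */
} notify_state_t;

/* Choose the message type for the next notification and update the state. */
msg_type_t next_notification_type(notify_state_t *st, uint64_t now_secs) {
  assert(st != NULL);
  if (now_secs - st->last_con_secs >= CON_INTERVAL_SECS) {
    st->last_con_secs = now_secs; /* restart the 24-hour window */
    return MSG_CON;
  }
  return MSG_NON;
}
```

If the client is offline when the `MSG_CON` notification falls due, that notification is never acknowledged, and this is the point at which the server drops the observer.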

A secondary issue arises if the client uses a different source port after restarting: the Observe responses will then go to the wrong port.

The safe way to do this is to re-register when the client starts up. Any duplicate registration should be handled correctly by the server, which replaces the previous, now stale, registration.

mrdeep1 avatar Jul 29 '22 08:07 mrdeep1

I agree with the principle. Regarding the port number, we require the client to always use the same address and port to maintain its identity. Otherwise the devices treat it as an entirely new client, and the whole handshake and bootstrap has to be done from scratch (which is not desirable in the deployment).

As per your comment, if I use NON for my Observe, will libcoap not send a de-register on session clear?

fun-works avatar Jul 29 '22 09:07 fun-works

Getting the server to always send NON (by setting the appropriate resource flags) will stop the server de-registering the Observe.

To stop the client de-registering on session close, there will need to be a code change in libcoap.

mrdeep1 avatar Jul 29 '22 09:07 mrdeep1

So, could libcoap provide a mechanism to stop the client sending a de-register on session close? Do you think the use case described above is a valid one?

At the moment we are using 4.3.0 and can make a local change to disable de-register on session close. Could that be accepted as a patch to the same version in the libcoap repo?

fun-works avatar Jul 29 '22 11:07 fun-works

We addressed the design challenge you mentioned above as follows: each Observe request sent from client to server carries a lifetime. The server maintains the observer only until the lifetime expires; it then clears the observe session and waits for a re-registration from the client. This way it avoids maintaining an observer on the server indefinitely.

The silent period between expiry and re-registration is between 3 and 15 seconds, because we are not able to send a second Observe request from the client to the same server with the same token while another Observe is still alive.
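A minimal sketch of the lifetime scheme described above, seen from the server side. It is not taken from our code or from libcoap; `timed_observer_t` and both function names are made up for illustration, and the lifetime is assumed to be given in seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical server-side record for one observer with a lifetime. */
typedef struct {
  bool active;
  uint64_t expires_at_secs;
} timed_observer_t;

/* (Re-)register the observer; the lifetime comes from the Observe request. */
void timed_observer_register(timed_observer_t *o, uint64_t now_secs,
                             uint32_t lifetime_secs) {
  assert(o != NULL);
  o->active = true;
  o->expires_at_secs = now_secs + lifetime_secs;
}

/* Return true if notifications may still be sent; expire the entry if not. */
bool timed_observer_is_live(timed_observer_t *o, uint64_t now_secs) {
  assert(o != NULL);
  if (o->active && now_secs >= o->expires_at_secs) {
    o->active = false; /* lifetime elapsed: wait for a re-registration */
  }
  return o->active;
}
```

Between the moment `timed_observer_is_live()` first returns false and the next `timed_observer_register()` call, no notifications are sent, which corresponds to the silent period mentioned above.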

fun-works avatar Jul 29 '22 11:07 fun-works

I don’t have access to the code at present, but I thought the server logic on receiving a request for a new Observe on the same resource with a different Token dropped the previous Observe and continued the Observe with the new Token.
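To illustrate the behaviour described here, the following is a rough model of a server-side observer table in which a new registration from the same endpoint for the same resource replaces the stored token. It is not the libcoap implementation; the table layout, the IPv4-only key, and the name `observer_register` are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_OBSERVERS 8
#define MAX_TOKEN_LEN 8
#define MAX_PATH_LEN 32

/* Hypothetical observer entry keyed by client endpoint and resource path. */
typedef struct {
  int in_use;
  uint32_t addr; /* client IPv4 address */
  uint16_t port; /* client source port */
  char path[MAX_PATH_LEN];
  uint8_t token[MAX_TOKEN_LEN];
  size_t token_len;
} observer_entry_t;

typedef struct {
  observer_entry_t slots[MAX_OBSERVERS];
} observer_table_t;

/* Returns 1 if an existing observer was replaced, 0 if a new one was added,
 * and -1 if the table is full. */
int observer_register(observer_table_t *t, uint32_t addr, uint16_t port,
                      const char *path, const uint8_t *token,
                      size_t token_len) {
  observer_entry_t *free_slot = NULL;

  assert(t != NULL && path != NULL && token != NULL);
  assert(token_len <= MAX_TOKEN_LEN);

  for (size_t i = 0; i < MAX_OBSERVERS; i++) {
    observer_entry_t *o = &t->slots[i];
    if (!o->in_use) {
      if (free_slot == NULL) free_slot = o;
      continue;
    }
    if (o->addr == addr && o->port == port && strcmp(o->path, path) == 0) {
      /* Same client and resource: continue the Observe with the new token. */
      memcpy(o->token, token, token_len);
      o->token_len = token_len;
      return 1;
    }
  }
  if (free_slot == NULL) return -1;

  free_slot->in_use = 1;
  free_slot->addr = addr;
  free_slot->port = port;
  strncpy(free_slot->path, path, MAX_PATH_LEN - 1);
  free_slot->path[MAX_PATH_LEN - 1] = '\0';
  memcpy(free_slot->token, token, token_len);
  free_slot->token_len = token_len;
  return 0;
}
```

Because the key includes the source port, a client that restarts on a different port would be stored as a second observer rather than replacing the first, which is the port issue raised earlier in the thread.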

mrdeep1 avatar Jul 29 '22 12:07 mrdeep1

> So, could libcoap provide a mechanism to stop the client sending a de-register on session close? Do you think the use case described above is a valid one?
>
> At the moment we are using 4.3.0 and can make a local change to disable de-register on session close. Could that be accepted as a patch to the same version in the libcoap repo?

Do you have any suggestion for this?

fun-works avatar Jul 29 '22 16:07 fun-works

What you want is a special case: suppressing the de-register that the client logic sends. In the general case, assuming the client code does not crash, it should send a de-register request so that the server can remove a no longer required observer registration in a reasonable time frame, rather than waiting for an unsolicited Observe response to time out.

To support your change in libcoap, it needs to either be a compile time option (where all users of that built library operate the same), or a runtime option instructing the session on closure not to send out a de-register.

The function you need to prevent being called is coap_cancel_observe() in src/coap_session.c.
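The runtime-option variant could look roughly like the model below. It does not use libcoap types or functions: `session_model_t`, `session_model_set_no_observe_cancel` and `session_model_close` are hypothetical names, and sending a cancel is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a client session. */
typedef struct {
  bool no_observe_cancel; /* runtime option: skip de-register on close */
  int active_observes;    /* Observe relationships currently tracked */
  int cancels_sent;       /* de-register requests sent so far */
} session_model_t;

/* Runtime switch: instruct the session not to de-register on closure. */
void session_model_set_no_observe_cancel(session_model_t *s) {
  assert(s != NULL);
  s->no_observe_cancel = true;
}

/* Close the session. Local Observe state is always released; the
 * de-register request is only sent when the option is not set. */
void session_model_close(session_model_t *s) {
  assert(s != NULL);
  while (s->active_observes > 0) {
    if (!s->no_observe_cancel) {
      s->cancels_sent++; /* stands in for sending the cancel request */
    }
    s->active_observes--;
  }
}
```

A per-session flag keeps the default behaviour (de-register on close) for every other user of the library, which a compile-time option would not.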

mrdeep1 avatar Jul 29 '22 18:07 mrdeep1

If you are not using Block2 transfers (Observe data < 1024 bytes), then you may be able to get away with commenting out coap_context_set_block_mode(ctx, block_mode); and coap_cancel_observe(session, &the_token, msgtype); in the code of examples/coap-client.c which I assume you are modelling your code on. No need to change the server code.

The downside of this is that the client's libcoap is not instructed to do any block handling and it is the responsibility of the application to understand that only some of the data has been transferred and has to request the next block of data etc. See coap_block(3) for more information.

mrdeep1 avatar Jul 29 '22 20:07 mrdeep1

> What you want to do is a specific case as to the cancelling of the de-register by the client logic. In the general case, assuming the client code does not crash, it should be sending a de-register request so that the server is able to remove a no longer required observer registration in a reasonable time frame rather than waiting for an unsolicited observe response to time out.
>
> To support your change in libcoap, it needs to either be a compile time option (where all users of that built library operate the same), or a runtime option instructing the session on closure not to send out a de-register.
>
> The function you need to prevent being called is coap_cancel_observe() in src/coap_session.c.

We prevented the call to coap_cancel_observe() in src/coap_session.c. Below is the code snippet from the function void coap_session_mfree(coap_session_t *session):

    /* Need to do this before (D)TLS and socket is closed down */
    LL_FOREACH_SAFE(session->lg_crcv, cq, etmp) {
      if (cq->observe_set) {
        /* Need to close down observe */
        // if (coap_cancel_observe(session, cq->app_token, COAP_MESSAGE_NON)) {
          /* Need to delete node we set up for NON */
          coap_queue_t *queue = session->context->sendqueue;
          while (queue) {
            if (queue->session == session) {
              coap_delete_node(queue);
              break;
            }
            queue = queue->next;
          }
        }
      }
      LL_DELETE(session->lg_crcv, cq);
      coap_block_delete_lg_crcv(session, cq);
    }

But we are getting a **4-byte memory leak** for a single Observe request when closing the session.

Please tell us how to free that memory.

Note: We have enabled the block option by calling `coap_context_set_block_mode()`, but we are not using block transfer with Observe. We are using it for other use cases.

ganeshkhadre avatar Aug 03 '22 09:08 ganeshkhadre

If I make the following change in my code

diff --git a/examples/coap-client.c b/examples/coap-client.c
index 525c8e2..8698539 100644
--- a/examples/coap-client.c
+++ b/examples/coap-client.c
@@ -1866,7 +1866,7 @@ main(int argc, char **argv) {
           coap_log(LOG_DEBUG, "clear observation relationship\n" );
           for (i = 0; i < tracked_tokens_count; i++) {
             if (tracked_tokens[i].observe) {
-              coap_cancel_observe(session, tracked_tokens[i].token, msgtype);
+//              coap_cancel_observe(session, tracked_tokens[i].token, msgtype);
             }
           }
           doing_observe = 0;
diff --git a/src/coap_session.c b/src/coap_session.c
index e21f331..dbe88e4 100644
--- a/src/coap_session.c
+++ b/src/coap_session.c
@@ -230,7 +230,7 @@ void coap_session_mfree(coap_session_t *session) {
   LL_FOREACH_SAFE(session->lg_crcv, cq, etmp) {
     if (cq->observe_set) {
       /* Need to close down observe */
-      if (coap_cancel_observe(session, cq->app_token, COAP_MESSAGE_NON)) {
+//      if (coap_cancel_observe(session, cq->app_token, COAP_MESSAGE_NON)) {
         /* Need to delete node we set up for NON */
         coap_queue_t *queue = session->context->sendqueue;
 
@@ -241,7 +241,7 @@ void coap_session_mfree(coap_session_t *session) {
           }
           queue = queue->next;
         }
-      }
+//      }
     }
     LL_DELETE(session->lg_crcv, cq);
     coap_block_delete_lg_crcv(session, cq);

and I then run the otherwise standard coap-client against the standard coap-server, I get

$  valgrind examples/coap-client -s 1 coap://127.0.0.1/time -v7 -w
==30426== Memcheck, a memory error detector
==30426== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==30426== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==30426== Command: examples/coap-client -s 1 coap://127.0.0.1/time -v7 -w
==30426== 
Aug 03 14:12:06.177 DEBG ***127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : session 0x5cb8380: created outgoing session
Aug 03 14:12:06.358 DEBG timeout is set to 90 seconds
Aug 03 14:12:06.361 DEBG sending CoAP request:
Aug 03 14:12:06.406 DEBG PDU presented by app
v:1 t:CON c:GET i:69bc {01} [ Observe:0, Uri-Path:time ]
Aug 03 14:12:06.436 DEBG ** 127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : lg_crcv 0x5cb9b10 initialized - stateless token xxxx000000000002
Aug 03 14:12:06.457 DEBG *  127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : sent 11 bytes
v:1 t:CON c:GET i:69bc {01} [ Observe:0, Uri-Path:time ]
Aug 03 14:12:06.470 DEBG ** 127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : mid=0x69bc: added to retransmit queue (2531ms)
Aug 03 14:12:06.493 DEBG *  127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : received 25 bytes
v:1 t:ACK c:2.05 i:69bc {01} [ Observe:2, Max-Age:1 ] :: 'Aug 03 13:12:06'
Aug 03 14:12:06.523 DEBG ** 127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : mid=0x69bc: removed 1
Aug 03 14:12:06.547 DEBG ** process incoming 2.05 response:
Aug 03 14:12:06.552 DEBG observation relationship established, set timeout to 1
Aug 03 13:12:06
Aug 03 14:12:07.001 DEBG *  127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : received 25 bytes
v:1 t:CON c:2.05 i:585a {01} [ Observe:3, Max-Age:1 ] :: 'Aug 03 13:12:07'
Aug 03 14:12:07.006 DEBG ** process incoming 2.05 response:
Aug 03 13:12:07
Aug 03 14:12:07.008 DEBG *  127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : sent 4 bytes
v:1 t:ACK c:0.00 i:585a {} [ ]
Aug 03 14:12:07.574 DEBG clear observation relationship
Aug 03 14:12:07.586 DEBG ** 127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : lg_crcv 0x5cb9b10 released
Aug 03 14:12:07.595 DEBG ***127.0.0.1:60694 <-> 127.0.0.1:5683 UDP : session 0x5cb8380: closed

==30426== 
==30426== HEAP SUMMARY:
==30426==     in use at exit: 0 bytes in 0 blocks
==30426==   total heap usage: 55 allocs, 55 frees, 6,794 bytes allocated
==30426== 
==30426== All heap blocks were freed -- no leaks are possible
==30426== 
==30426== For counts of detected and suppressed errors, rerun with: -v
==30426== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 6)

with no loss of memory. 4 bytes is a small amount to lose, but I do not know where you are seeing this.

mrdeep1 avatar Aug 03 '22 13:08 mrdeep1

Thanks for the feedback! You are right, there is no memory leak.

The memory leak is in our client application.

ganeshkhadre avatar Aug 04 '22 11:08 ganeshkhadre

@fun-works @ganeshkhadre Now that #897 has been merged into the develop branch, please confirm that this additional functionality of disabling observe cancellation on session close works for you.

mrdeep1 avatar Aug 08 '22 11:08 mrdeep1

Closed as #897 is now part of the develop branch.

mrdeep1 avatar Aug 18 '22 11:08 mrdeep1