After deconnection, variable delays to reconnect via Call Home observed on v2
I switched recently to the following software version from the devel branch(v2) : Netopeer2 v2.0.35
Scenario : In CLI, alternative listen/disconnect cmd leads to the following logs at Server level. Observation : listen cmd is stuck for an randomize time linked to the time xyz displayed in following Server logs : ==> [INF]: LN: Call Home client "default-client" reconnecting in xyz seconds. This behavior was not observed in Netopeer2 v1
Full logs below with several trials of sequence listen/disconnect at CLI level along with the Call Hom configuration in place in the datastore. Time to time, we can see that xyz is different.
[INF]: SR: Successful processing of "rpc" event with ID 2 priority 0 (remaining 0 subscribers). [INF]: SR: Event "rpc" with ID 2 priority 0 succeeded.
===> Disconnect Call Home from CLI
[INF]: NP: Session 3: thread 0 event new RPC. [INF]: NP: Session 2: thread 3 event new RPC. [INF]: NP: Session 2: thread 3 event session terminated. [INF]: SR: Published event "notif" "ietf-netconf-notifications" with ID 10 priority 0 for 2 subscribers. [INF]: SR: Processing "notif" "ietf-netconf-notifications" event with ID 10. [INF]: SR: Successful processing of "notif" event with ID 10 priority 0 (remaining 0 subscribers). [INF]: NP: Generated new event (netconf-session-end). [INF]: LN: Call Home client "default-client" session terminated. [INF]: LN: Call Home client "default-client" reconnecting in 117 seconds. [ERR]: LN: Session 2: Waiting for Call Home thread failed (Connection timed out). [2021/11/28 07:25:58.583001, 1] ssh_socket_exception_callback: Socket exception callback: 1 (0) [2021/11/28 07:25:58.583032, 1] ssh_socket_exception_callback: Socket error: disconnected
===> reconnection with Call Home frozen during the 117 seconds timeout ....
[INF]: SR: Processing "notif" "ietf-netconf-notifications" event with ID 11. [INF]: SR: Successful processing of "notif" event with ID 11 priority 0 (remaining 1 subscribers). [INF]: SR: Processing "notif" "ietf-netconf-notifications" event with ID 12. [INF]: SR: Successful processing of "notif" event with ID 12 priority 0 (remaining 1 subscribers). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4334. [INF]: LN: Successfully connected to 172.1.1.5:4334 over IPv4. [INF]: SR: Session 177 (user "root", CID 3) created.
===> Disconnect Call Home from CLI
[INF]: NP: Session 4: thread 3 event new RPC. [INF]: NP: Session 4: thread 3 event session terminated. [INF]: SR: Published event "notif" "ietf-netconf-notifications" with ID 16 priority 0 for 2 subscribers. [INF]: SR: Processing "notif" "ietf-netconf-notifications" event with ID 16. [INF]: SR: Successful processing of "notif" event with ID 16 priority 0 (remaining 0 subscribers). [INF]: NP: Generated new event (netconf-session-end). [INF]: LN: Call Home client "default-client" session terminated. [INF]: LN: Call Home client "default-client" reconnecting in 26 seconds. [ERR]: LN: Session 4: Waiting for Call Home thread failed (Connection timed out). [2021/11/28 07:32:27.152690, 1] ssh_socket_exception_callback: Socket exception callback: 1 (0) [2021/11/28 07:32:27.152716, 1] ssh_socket_exception_callback: Socket error: disconnected
====> reconnection with Call Home frozen during the 26 seconds timeout ....
[ERR]: SR: Internal error (/opt/dev/sysrepo/src/common.c:3886). [ERR]: SR: Internal error (/opt/dev/sysrepo/src/common.c:3886). [INF]: SR: Processing "notif" "ietf-netconf-notifications" event with ID 17. [INF]: SR: Successful processing of "notif" event with ID 17 priority 0 (remaining 0 subscribers). [INF]: SR: Processing "notif" "ietf-netconf-notifications" event with ID 18. [INF]: SR: Successful processing of "notif" event with ID 18 priority 0 (remaining 1 subscribers). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4334. [INF]: LN: Successfully connected to 172.1.1.5:4334 over IPv4. [INF]: SR: Session 210 (user "root", CID 3) created.
====> Disconnect Call Home from CLI
[INF]: NP: Session 5: thread 1 event new RPC. [INF]: NP: Session 5: thread 1 event session terminated. [INF]: SR: Published event "notif" "ietf-netconf-notifications" with ID 20 priority 0 for 2 subscribers. [INF]: SR: Processing "notif" "ietf-netconf-notifications" event with ID 20. [INF]: SR: Successful processing of "notif" event with ID 20 priority 0 (remaining 1 subscribers). [INF]: NP: Generated new event (netconf-session-end). [INF]: LN: Call Home client "default-client" session terminated. [INF]: LN: Call Home client "default-client" reconnecting in 3 seconds. [ERR]: LN: Session 5: Waiting for Call Home thread failed (Connection timed out). [2021/11/28 07:34:04.743739, 1] ssh_socket_exception_callback: Socket exception callback: 1 (0) [2021/11/28 07:34:04.743766, 1] ssh_socket_exception_callback: Socket error: disconnected
====> reconnection with Call Home frozen during the 3 seconds timeout ....
[INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4335. [INF]: LN: getsockopt() error (Connection refused). [INF]: LN: Trying to connect via IPv4 to 172.1.1.5:4334. [INF]: LN: getsockopt() error (Connection refused).
=====> Disconnect Call Home from CLI
[INF]: NP: Session 6: thread 1 event new RPC. [INF]: NP: Session 6: thread 1 event session terminated. [INF]: SR: Published event "notif" "ietf-netconf-notifications" with ID 24 priority 0 for 2 subscribers. [INF]: SR: Processing "notif" "ietf-netconf-notifications" event with ID 24. [INF]: SR: Successful processing of "notif" event with ID 24 priority 0 (remaining 0 subscribers). [INF]: NP: Generated new event (netconf-session-end). [INF]: LN: Call Home client "default-client" session terminated. [INF]: LN: Call Home client "default-client" reconnecting in 115 seconds. [ERR]: LN: Session 6: Waiting for Call Home thread failed (Connection timed out). [2021/11/28 07:35:56.927858, 1] ssh_socket_exception_callback: Socket exception callback: 1 (0) [2021/11/28 07:35:56.927885, 1] ssh_socket_exception_callback: Socket error: disconnected
======> reconnection with Call Home frozen during the 115 seconds timeout ....
> get --defaults report-all --filter-xpath /netconf-server/call-home
DATA
<data xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<netconf-server xmlns="urn:ietf:params:xml:ns:yang:ietf-netconf-server">
<call-home>
<netconf-client>
<name>default-client</name>
<endpoints>
<endpoint>
<name>default-tls</name>
<tls>
<tcp-client-parameters>
<remote-address>172.1.1.5</remote-address>
<remote-port>4335</remote-port>
<keepalives>
<idle-time>1</idle-time>
<max-probes>10</max-probes>
<probe-interval>5</probe-interval>
</keepalives>
</tcp-client-parameters>
<tls-server-parameters>
<server-identity>
<keystore-reference>
<asymmetric-key>serverkey</asymmetric-key>
<certificate>servercert</certificate>
</keystore-reference>
</server-identity>
<client-authentication>
<cert-maps>
<cert-to-name>
<id>1</id>
<fingerprint>02:E9:38:1F:F6:8B:62:DE:0A:0B:C5:03:81:A8:03:49:A0:00:7F:8B:F3</fingerprint>
<map-type xmlns:x509c2n="urn:ietf:params:xml:ns:yang:ietf-x509-cert-to-name">x509c2n:specified</map-type>
<name>tls-test</name>
</cert-to-name>
</cert-maps>
<required/>
<ca-certs>cacerts</ca-certs>
<client-certs>clientcerts</client-certs>
</client-authentication>
</tls-server-parameters>
</tls>
</endpoint>
<endpoint>
<name>default-ssh</name>
<ssh>
<tcp-client-parameters>
<remote-address>172.1.1.5</remote-address>
<remote-port>4334</remote-port>
<keepalives>
<idle-time>1</idle-time>
<max-probes>10</max-probes>
<probe-interval>5</probe-interval>
</keepalives>
</tcp-client-parameters>
<ssh-server-parameters>
<server-identity>
<host-key>
<name>default-key</name>
<public-key>
<keystore-reference>genkey</keystore-reference>
</public-key>
</host-key>
</server-identity>
<client-authentication>
<supported-authentication-methods>
<publickey/>
<passsword/>
<other>interactive</other>
</supported-authentication-methods>
</client-authentication>
</ssh-server-parameters>
</ssh>
</endpoint>
</endpoints>
<connection-type>
<periodic>
<period>2</period>
<idle-timeout>0</idle-timeout>
</periodic>
</connection-type>
<reconnect-strategy>
<start-with>first-listed</start-with>
<max-attempts>3</max-attempts>
</reconnect-strategy>
</netconf-client>
</call-home>
</netconf-server>
</data>
Okay, the timeout was being anchor when it should not have been, fixed in the latest libnetconf2. But I am more interested in the errors so please include the exact steps you took when connecting to/disconnecting from the server because from logs it seems you have also subscribed to notifications by some client, which I think may be the root of the problem.