session::send_solicit spuriously fails to find an interface in auto mode
I recently rebooted the gateway hosting a few qemu instances that ndppd proxies (e.g. via virbr0). It's possible that this also caused ndppd to be updated, although I'm not sure about that. These problems did start recently though.
What I'm seeing is spurious failures to reach individual /128 addresses on the qemu guests: sometimes several of them, sometimes just a single one.
E.g. here I just tried to ping 2001:41d0:8:1580::3 from an external host (via eth0).
ndppd -v -v -v (Debian's 0.2.5-6 on kernel 5.9.0-3-amd64) logs:
(debug) checking 2001:41d0:8:1500::/56 against 2001:41d0:8:1580::3
(debug) session::create() pr=4b555b70, saddr=fe80::218:74ff:fec3:3c00, daddr=ff02::1:ff00:3, taddr=2001:41d0:8:1580::3 =4b55de00
(debug) session::send_solicit() (_ifaces.size() = 0)
(debug) session is now invalid
I think _ifaces.size() = 0 means that no interface was found for this route.
Config:
proxy eth0 {
    rule 2001:41d0:8:1500::/56 {
        auto
    }
}
At the same time on the host:
$ ip ro get 2001:41d0:8:1580::3
2001:41d0:8:1580::3 from :: dev virbr0 proto kernel src 2001:41d0:8:1580:ffff:ffff:ffff:ffff metric 256 pref medium
$ ip neigh show | grep 2001:41d0:8:1580::3
2001:41d0:8:1580::3 dev virbr0 lladdr 52:54:00:07:9f:e1 STALE
$ ping 2001:41d0:8:1580::3
PING 2001:41d0:8:1580::3(2001:41d0:8:1580::3) 56 data bytes
64 bytes from 2001:41d0:8:1580::3: icmp_seq=1 ttl=64 time=0.405 ms
^C
--- 2001:41d0:8:1580::3 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.405/0.405/0.405/0.000 ms
$ ip neigh show | grep 2001:41d0:8:1580::3
2001:41d0:8:1580::3 dev virbr0 lladdr 52:54:00:07:9f:e1 DELAY
$ sudo brctl showmacs virbr0
port no  mac addr           is local?  ageing timer
  1      52:54:00:07:9f:e1  no            0.88
  1      fe:54:00:07:9f:e1  yes           0.00
  1      fe:54:00:07:9f:e1  yes           0.00
$ ip -6 ro
::1 dev lo proto kernel metric 256 pref medium
2001:41d0:8:1501::/64 dev virbr1 proto kernel metric 256 linkdown pref medium
2001:41d0:8:1580::/64 dev virbr0 proto kernel metric 256 pref medium
2001:41d0:8:1500::/56 dev eth0 proto kernel metric 256 pref medium
2001:41d0:8:1500::/56 dev eth0 proto ra metric 1024 expires 2576879sec pref medium
fe80::/64 dev vpnbr0 proto kernel metric 256 pref medium
fe80::/64 dev virbr0 proto kernel metric 256 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev vnet0 proto kernel metric 256 pref medium
fe80::/64 dev vnet1 proto kernel metric 256 pref medium
default via 2001:41d0:8:15ff:ff:ff:ff:ff dev eth0 proto static metric 1024 onlink pref medium
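For reference, here is a minimal sketch of the longest-prefix match that the routing table above implies for this target. It is not ndppd's code; the function name lookup_device and the ROUTES list are made up for illustration, the routes are copied from the `ip -6 ro` output, and the linkdown flag on virbr1 is ignored.

```python
import ipaddress

# Routes copied from the `ip -6 ro` output above as (prefix, device) pairs.
# Illustrative only: the linkdown state of virbr1 is not modelled.
ROUTES = [
    ("2001:41d0:8:1501::/64", "virbr1"),
    ("2001:41d0:8:1580::/64", "virbr0"),
    ("2001:41d0:8:1500::/56", "eth0"),
]


def lookup_device(target, routes):
    """Return the device of the most specific route containing target, or None."""
    addr = ipaddress.IPv6Address(target)
    best = None
    for prefix, dev in routes:
        net = ipaddress.IPv6Network(prefix)
        # Keep the matching route with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return best[1] if best else None


print(lookup_device("2001:41d0:8:1580::3", ROUTES))  # virbr0
print(lookup_device("2001:41d0:8:1580::3", []))      # None
```

With the full table the /64 on virbr0 wins over the /56 on eth0, which matches what `ip ro get` reports. With an empty table the lookup returns nothing, which would be one way to end up with no interface for the session; whether that is what happens inside ndppd here is not established.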
Logs from when it works (no config changes, restart or anything in between):
(debug) proxy::handle_solicit() saddr=fe80::218:74ff:fec3:3c00, taddr=2001:41d0:8:1580::3
(debug) checking 2001:41d0:8:1500::/56 against 2001:41d0:8:1580::3
(debug) session::create() pr=1e1fdb70, saddr=fe80::218:74ff:fec3:3c00, daddr=ff02::1:ff00:3, taddr=2001:41d0:8:1580::3 =1e1ff330
(debug) router::ifa() opening interface 'virbr0'
(debug) fd=2, hwaddr=fe:54:0:7:9f:e1
(debug) session::send_solicit() (_ifaces.size() = 1)
(debug) - virbr0
(debug) iface::write_solicit() taddr=2001:41d0:8:1580::3, daddr=ff02::1:ff00:3
I guess I could try to work around this by not using auto, but this was working fine for months/years with auto.
I tried adding an auto rule for 2001:41d0:8:1580::/64, but it had no effect.
Did some more testing, and things seem to work fine after adding:
rule 2001:41d0:8:1580::/64 {
    iface virbr0
}
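For anyone wanting to try the same workaround, the complete config after adding that rule would presumably look like the sketch below. This assumes the new rule goes inside the existing proxy eth0 block; the order of the two rules is a guess, and whether order matters to ndppd is not confirmed in this thread.

```
proxy eth0 {
    # Static rule for the virbr0 /64, added as a workaround.
    rule 2001:41d0:8:1580::/64 {
        iface virbr0
    }
    # Original rule, unchanged.
    rule 2001:41d0:8:1500::/56 {
        auto
    }
}
```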
Did an earlier version work? That is, do you reckon this is a regression in a later 0.x release? 0.x parses routes from /proc, and it does so at intervals.
Yes, it was working fine for at least two years.
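To make the /proc route parsing mentioned above more concrete, here is a rough sketch of reading one route entry, assuming the file in question is /proc/net/ipv6_route. It is not ndppd's implementation; parse_ipv6_route_line is a hypothetical helper, and the sample line is hand-built from the virbr0 route in this report rather than captured from the affected host.

```python
import ipaddress


def parse_ipv6_route_line(line):
    """Parse one line in the /proc/net/ipv6_route layout into (network, device).

    Whitespace-separated fields, all hex except the last: destination,
    destination prefix length, source, source prefix length, next hop,
    metric, reference count, use count, flags, device name.
    """
    fields = line.split()
    dest_hex, prefix_hex, dev = fields[0], fields[1], fields[-1]
    addr = ipaddress.IPv6Address(bytes.fromhex(dest_hex))
    return ipaddress.IPv6Network((addr, int(prefix_hex, 16))), dev


# Hand-built sample for illustration only (counters and flags are placeholders).
SAMPLE = (
    "200141d0000815800000000000000000 40 "
    "00000000000000000000000000000000 00 "
    "00000000000000000000000000000000 "
    "00000100 00000001 00000000 00000001 virbr0"
)

net, dev = parse_ipv6_route_line(SAMPLE)
print(net, dev)  # 2001:41d0:8:1580::/64 virbr0
```

If the routes are only re-read at intervals, a lookup made between reads works from a snapshot, so comparing the contents of that file at the moment of a failure with the `ip -6 ro` output might help narrow this down.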