shadowsocks-rust

Can a UDP MTU configuration option be added?

Open ziojacky opened this issue 2 years ago • 17 comments

Please add an option to specify the UDP MTU for sslocal and ssserver. In some network environments, the MTU is not the same on every link. If an intermediate route does not have PMTUD enabled, fragmentation occurs and causes various problems.

ziojacky avatar Nov 30 '23 06:11 ziojacky

MTU should be changed through network interfaces. What do you expect shadowsocks as an application could do with the interfaces’ MTU size?

zonyitoo avatar Dec 01 '23 07:12 zonyitoo

MTU should be changed through network interfaces. What do you expect shadowsocks as an application could do with the interfaces’ MTU size?

My local MTU is 1400 and the server MTU is 1500, which leads to data fragmentation, so I hope the application can offer a way to set the size of the packets it sends (MTU - IP header size - UDP header size).

ziojacky avatar Dec 01 '23 10:12 ziojacky

This would indeed be something really nice (and important IMO) to have in shadowsocks-rust. My project shadowsocks-go was written with this in mind and has a mandatory MTU option for both client and server configurations. Off the top of my head I can think of the following benefits:

  • Hint for padding size. For Shadowsocks 2022 UDP clients and servers, when calculating the padding size, the MTU is taken into account to ensure that padding won't ever cause IP fragmentation.
  • Hint for buffer size. Especially if you are going to implement sendmmsg/recvmmsg and UDP GSO/GRO, knowing the maximum packet size beforehand saves you from allocating unnecessary buffer space.

database64128 avatar Dec 01 '23 12:12 database64128
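The two hints listed in the comment above can be illustrated with a little arithmetic. The following is only a minimal sketch, not code from shadowsocks-go or shadowsocks-rust: the function names, the `proto_overhead` parameter (per-packet protocol headers plus AEAD tag), and the `policy_cap` parameter are hypothetical, and the 28-byte figure assumes IPv4 without options.

```rust
/// IPv4 header without options (20 bytes) plus UDP header (8 bytes).
const IPV4_UDP_OVERHEAD: usize = 20 + 8;

/// Largest padding length that keeps the whole IP packet within `mtu`.
///
/// `proto_overhead` is a hypothetical per-packet cost of the proxy protocol
/// (headers plus AEAD tag); `policy_cap` is whatever upper bound the padding
/// policy would otherwise apply.
fn max_padding_len(
    mtu: usize,
    proto_overhead: usize,
    payload_len: usize,
    policy_cap: usize,
) -> usize {
    let used = IPV4_UDP_OVERHEAD + proto_overhead + payload_len;
    // saturating_sub yields 0 when the packet is already at or over the MTU.
    mtu.saturating_sub(used).min(policy_cap)
}

/// Total bytes needed for `batch` receive buffers (for example, for a
/// recvmmsg-style batch) when no datagram may exceed the MTU.
fn batch_buffer_bytes(mtu: usize, batch: usize) -> usize {
    mtu.saturating_sub(IPV4_UDP_OVERHEAD) * batch
}
```

Under these assumptions, a batch of 1024 buffers sized for a 1500-byte MTU needs about 1.4 MiB, whereas 1024 buffers of 65,535 bytes each would need about 64 MiB.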

The configured MTU value is just for hinting?

zonyitoo avatar Dec 01 '23 15:12 zonyitoo

The configured MTU value is just for hinting?

It is not just a hint; it is an actual function. You could set the maximum size of the packets that are sent, so that packet sizing is handled in the application. Otherwise, unnecessary fragmentation occurs at the MTU boundary. This can be very practical, especially when you know that the MTUs of your client and server differ and what the minimum MTU is.

ziojacky avatar Dec 02 '23 02:12 ziojacky

So in practice, the mtu value in the configuration will limit the size of the buffer used to recv() from the UDP socket, which is mtu - tag_size - header_size in sslocal's recv(), so that the UDP packets sent from sslocal to ssserver never exceed the mtu.

zonyitoo avatar Dec 03 '23 01:12 zonyitoo

So in practice, the mtu value in the configuration will limit the size of the buffer used to recv() from the UDP socket, which is mtu - tag_size - header_size in sslocal's recv(), so that the UDP packets sent from sslocal to ssserver never exceed the mtu.

Yes, you are right. Is it possible to add such functionality?

ziojacky avatar Dec 03 '23 01:12 ziojacky

Maybe. I haven’t looked deeply into it yet.

Theoretically, the mtu configuration is only required in sslocal.

zonyitoo avatar Dec 03 '23 04:12 zonyitoo

So in practice, the mtu value in the configuration will limit the size of the buffer used to recv() from the UDP socket, which is mtu - tag_size - header_size in sslocal's recv()

Don't forget to subtract the IPv4 header length (20 bytes) and UDP header length (8 bytes). On recvfrom(2) we don't yet know whether the server chosen by the load balancer has an IPv4 or IPv6 address, so we just subtract the smaller one.

so that the UDP packets sent from sslocal to ssserver never exceed the mtu.

On Unix systems, when the message does not fit in the supplied buffer, the default behavior is to truncate the message and indicate this in the flags, which is up to the application to check.

The most portable way to implement the check is to always use recvmsg(2) and check the returned flag for MSG_TRUNC. This is how I implemented it.

There are also other ways on some common platforms. On Linux you can specify MSG_TRUNC in recv{from,msg}(2)'s flags argument to make it always return the real size before truncation. On Windows you do not need to check for MSG_PARTIAL, as it always returns the error -WSAEMSGSIZE.

database64128 avatar Dec 03 '23 10:12 database64128
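The buffer-size arithmetic from the comments above (mtu - tag_size - header_size, minus the IP and UDP headers) can be sketched in Rust as follows. This is only an illustration: `IpFamily`, `max_plain_payload`, and the parameter names are hypothetical and not taken from shadowsocks-rust or shadowsocks-go, and the header sizes assume no IPv4 options or IPv6 extension headers.

```rust
/// Address family of the chosen server, if it is already known.
#[derive(Clone, Copy, Debug, PartialEq)]
enum IpFamily {
    V4,
    V6,
    /// Not known yet, e.g. before a load balancer has picked a server.
    Unknown,
}

const UDP_HEADER_LEN: usize = 8;

impl IpFamily {
    /// IP header length to budget for. When the family is unknown, the
    /// smaller (IPv4) header is subtracted, as suggested in the thread.
    fn ip_header_len(self) -> usize {
        match self {
            IpFamily::V4 | IpFamily::Unknown => 20,
            IpFamily::V6 => 40,
        }
    }
}

/// Largest plain payload that still fits in one unfragmented packet, or
/// `None` if the MTU is too small to carry even the headers.
fn max_plain_payload(
    mtu: usize,
    family: IpFamily,
    tag_size: usize,
    header_size: usize,
) -> Option<usize> {
    mtu.checked_sub(family.ip_header_len() + UDP_HEADER_LEN + tag_size + header_size)
}
```

For example, with an MTU of 1400 and made-up values of 16 bytes for the tag and 39 bytes for the protocol header, the receive buffer for plain data would be 1317 bytes while the address family is unknown and 1297 bytes for an IPv6 server.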

Ok. But what should the application do to handle these oversized packets? Just ignore them?

zonyitoo avatar Dec 03 '23 10:12 zonyitoo

Ok. But what should the application do to handle these oversized packets? Just ignore them?

My take is to print a warning and drop the packet.

Theoretically, the mtu configuration is only required in sslocal.

It'd be nice to have it on the server as well. When you know the exact MTU and that your applications do not rely on IP fragmentation, you can use smaller buffers for server -> client relay to reduce memory usage.

database64128 avatar Dec 03 '23 10:12 database64128
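The "print a warning and drop the packet" policy can be sketched with the Rust standard library alone. Note that this swaps in a different detection technique from the one described in the thread: `std::net::UdpSocket` does not expose the recvmsg(2) flags, so instead of checking MSG_TRUNC the sketch receives into a buffer one byte larger than the limit and treats any datagram that fills the extra byte as oversized. `Received` and `recv_within_limit` are hypothetical names, not part of shadowsocks-rust.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Outcome of one receive attempt.
#[derive(Debug, PartialEq)]
enum Received {
    /// The datagram fits; its payload is in `buf[..len]`.
    Packet { len: usize, from: SocketAddr },
    /// The datagram was longer than the limit and was dropped.
    Oversized { from: SocketAddr },
}

/// Receives one datagram and drops it with a warning if it is longer than
/// `limit` bytes. `buf` must hold at least `limit + 1` bytes so that an
/// oversized datagram can be told apart from one of exactly `limit` bytes.
fn recv_within_limit(
    socket: &UdpSocket,
    buf: &mut [u8],
    limit: usize,
) -> io::Result<Received> {
    assert!(buf.len() > limit, "buffer must hold limit + 1 bytes");
    // On Unix the kernel silently truncates to the buffer length. Some
    // platforms may report an error instead, which `?` passes to the caller.
    let (len, from) = socket.recv_from(&mut buf[..limit + 1])?;
    if len > limit {
        eprintln!(
            "warning: dropping UDP packet from {}: longer than {} bytes",
            from, limit
        );
        return Ok(Received::Oversized { from });
    }
    Ok(Received::Packet { len, from })
}
```

A relay loop would call this in place of a plain `recv_from` and simply continue on `Received::Oversized`. The extra byte costs almost nothing compared with sizing every buffer for the largest possible datagram.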

It’s true about lowering the memory consumption, but that’s quite small compared to the other parts of the application.

BTW, the mtu could also be used to limit the buffer size for receiving from remote targets. In summary, the mtu should be applied to the receive buffers on both sides, all of which contain “plain” data.

zonyitoo avatar Dec 03 '23 10:12 zonyitoo

Another reason to implement the MTU option: QUIC. Many QUIC clients use DPLPMTUD (RFC 8899) to probe the PMTU and determine the best packet size. If the proxy program allows IP fragmentation and uses a large buffer size, it might confuse the QUIC client and cause it to select a packet size that's too large for the actual path.

database64128 avatar Dec 03 '23 10:12 database64128

but that’s quite small compared to the other parts of the application.

It's actually quite significant in my experience, especially after I implemented recvmmsg(2) and sendmmsg(2). I had to significantly reduce the number of message buffers to avoid getting OOM killed on my VPS.

database64128 avatar Dec 03 '23 11:12 database64128

BTW, I just double-checked the code: we already set IP_PMTUDISC_DO on the IP_MTU_DISCOVER sockopt, which will:

For non-SOCK_STREAM sockets, IP_PMTUDISC_DO forces the don't-fragment flag to be set on all outgoing packets. It is the user's responsibility to packetize the data in MTU-sized chunks and to do the retransmits if necessary. The kernel will reject (with EMSGSIZE) datagrams that are bigger than the known path MTU.

So the problem here is: the packet was bigger than the path MTU, but the system didn't reject it with EMSGSIZE?

zonyitoo avatar Dec 25 '23 17:12 zonyitoo

Linux and Windows seem to be the only platforms where socket options are properly implemented for dual-stack (IPV6_V6ONLY off) IPv6 sockets. On these platforms you can call setsockopt(IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO) on a dual-stack IPv6 socket and it'll apply to incoming IPv4 packets. And a separate setsockopt(IPPROTO_IPV6, IPV6_MTU_DISCOVER, IP_PMTUDISC_DO) call will apply the same thing to incoming IPv6 packets.

Unfortunately, on common BSD-derived platforms like FreeBSD and macOS, calling setsockopt(IPPROTO_IP, IP_DONTFRAG) will fail on dual-stack IPv6 sockets, and setsockopt(IPPROTO_IPV6, IPV6_DONTFRAG) does not take effect on incoming IPv4 packets. We might want to cease the use of dual-stack IPv6 sockets on these platforms, if we want to make sure IP fragmentation is properly disabled.

database64128 avatar Dec 26 '23 02:12 database64128

I am also experiencing this problem, but I don't know what I should change. The maximum MTU my router allows me to configure is 1492. After starting sslocal, I run curl --socks 127.0.0.1 google.com, and at that point the sslocal log shows:

2024-01-24T17:08:48.970760769+08:00 TRACE [1945:140361984079616] [shadowsocks::relay::udprelay::proxy_socket] UDP server client send to 3.36.68.17:36000, control: UdpSocketControlData { client_session_id: 18340853489643199601, server_session_id: 0, packet_id: 7, user: None }, payload length 1458 bytes, packet length 1513 bytes
2024-01-24T17:08:48.970924051+08:00 DEBUG [1945:140361984079616] [shadowsocks_service::local::net::udp::association] 192.168.31.2:35515 -> 3.36.68.17:36000 (proxied) sending 1458 bytes failed, error: Message too long (os error 90)
2024-01-24T17:08:48.971078161+08:00 TRACE [1945:140361984079616] [shadowsocks_service::local::net::udp::association] udp relay 192.168.31.2:35515 -> 3.36.68.17:36000 

theowenyoung avatar Jan 24 '24 09:01 theowenyoung
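The numbers in the log above can be checked with a short calculation. This is a sketch under stated assumptions: that the logged "packet length" is the UDP payload handed to the kernel, that the path is IPv4 without options, and that "os error 90" is Linux's EMSGSIZE. The function and constant names are hypothetical and do not come from shadowsocks-rust.

```rust
use std::io;

/// Value of EMSGSIZE on Linux, matching "os error 90" in the log.
/// Other platforms use different numbers, so this check is Linux-only.
const EMSGSIZE_LINUX: i32 = 90;

/// Returns true if a send failed because the datagram exceeded the known
/// path MTU while fragmentation was disallowed.
fn is_message_too_long(err: &io::Error) -> bool {
    err.raw_os_error() == Some(EMSGSIZE_LINUX)
}

/// Number of bytes by which a UDP datagram sent over IPv4 exceeds `mtu`,
/// or 0 if it fits. Adds the 20-byte IPv4 header and 8-byte UDP header.
fn ipv4_overshoot(mtu: usize, udp_payload_len: usize) -> usize {
    (udp_payload_len + 20 + 8).saturating_sub(mtu)
}
```

Under these assumptions, the 1513-byte packet becomes a 1541-byte IP packet, which is 49 bytes over an MTU of 1492 and would still be 41 bytes over an MTU of 1500. The logged lengths differ by 55 bytes (1513 - 1458), so for this particular packet the plain payload would have had to be at most 1492 - 28 - 55 = 1409 bytes to be sent without fragmentation.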