Expose TuneD API to the Unix Domain Socket.
TuneD is listening on paths specified in config in option unix_socket_paths and send signals to paths in option unix_socket_signal_paths. Example call:
ssock_file = "/tmp/tuned_socket" csock_file = "/tmp/tuned_socket_client" if os.path.exists(csock_file): os.remove(csock_file) s=socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) s.bind(csock_file) try: s.sendto(pickle.dumps(("switch_profile", "balanced")), ssock_file) print(pickle.loads(s.recvfrom(4096)[0])) except Exception as e: print("E: %s" % e) s.close()
This PR also introduce possibility to disable dbus API in main TuneD config.
Resolves: rhbz#2113900
Signed-off-by: Jan Zerdik [email protected]
This pull request introduces 3 alerts when merging 9e71402032e4c262602db7d8a44c77aced56fed7 into 420267fa3463ef4a06885847c7ca4cd24c781dc8 - view on LGTM.com
new alerts:
- 3 for Unused import
This pull request introduces 3 alerts when merging 1fa7493c6db7135d251ac61cc9111884debda753 into 420267fa3463ef4a06885847c7ca4cd24c781dc8 - view on LGTM.com
new alerts:
- 3 for Unused import
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. Please enable GitHub code scanning, which uses the same CodeQL engine :gear: that powers LGTM.com. For more information, please check out our post on the GitHub blog.
During testing, I noticed I can DoS TuneD over the socket like this:
printf '\x00\x00\x00\x03' | nc -u -U /run/tuned/tuned.sock
printf 'DoS' | nc -u -U /run/tuned/tuned.sock
2022-11-22 11:07:55,987 ERROR tuned.exports.unix_socket_exporter: Failed to load json data 'b'DoS'': Expecting value: line 1 column 1 (char 0)
Exception in thread Thread-1 (_thread_code):
Traceback (most recent call last):
File "/usr/lib/python3.10/site-packages/tuned/exports/unix_socket_exporter.py", line 185, in period_check
data = json.loads(data.decode("utf-8"))
File "/usr/lib64/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib64/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib64/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/lib64/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.10/site-packages/tuned/daemon/daemon.py", line 221, in _thread_code
exports.period_check()
File "/usr/lib/python3.10/site-packages/tuned/exports/__init__.py", line 46, in period_check
return ctl.period_check()
File "/usr/lib/python3.10/site-packages/tuned/exports/controller.py", line 54, in period_check
exporter.period_check()
File "/usr/lib/python3.10/site-packages/tuned/exports/unix_socket_exporter.py", line 188, in period_check
self._send_data(self._socket_object, address, "err", "Failed to load json data '%s': %s" % (data, e))
File "/usr/lib/python3.10/site-packages/tuned/exports/unix_socket_exporter.py", line 131, in _send_data
s.sendto(struct.pack("!sL", type.encode("utf-8"), len(j)), address)
FileNotFoundError: [Errno 2] No such file or directory
While the client should be trusted and send valid input, it would be nice if the server (TuneD) wouldn't fall over that easily.
...
While the client should be trusted and send valid input, it would be nice if the server (TuneD) wouldn't fall over that easily.
@jmencak Did you pull the newest version? I added some tries around json loading yesterday, so it should handle invalid requests better.
Did you pull the newest version? I added some tries around json loading yesterday, so it should handle invalid requests better.
Yes, I did, you cannot reproduce this?
It would also help documenting the API a bit more. It looks like I cannot send both the payload length header and JSON payload in one go (over one sendto call).
You can, I'll write better example. I found out what's problem with the exception. ncat does not wait for response, deletes it's socket before we process the request and try to send response to non existing file, so it fails.
You can, I'll write better example.
Great, in the end though, we'll probably need a golang client. I managed to write one.
I found out what's problem with the exception. ncat does not wait for response, deletes it's socket before we process the request and try to send response to non existing file, so it fails
Yeah, I can see that now. However, that's no reason for the server to fail like that. Imagine the client doesn't care about the response like the netcat example or just crashes before getting a response back. That's a valid use case for me in my point of view and the server should not fail.
@jmencak Definitely, I added warning log when sending msg fails with reason. Also changed default path and permissions and added ownership option.
Definitely, I added warning log when sending msg fails with reason. Also changed default path and permissions and added ownership option.
@CZerta Have you tried testing with socat? There are basically two native tools we can use in our environment with this. nc and socat. nc seems to work for sending, but I do not believe it is possible to receive data through it at the same time. socat seems a lot more complex tool that might be able to do this, but at the moment I'm only able to do the same thing as with nc -- just send data:
socat -ddd - UNIX-CONNECT:/run/tuned/tuned.sock
Edit: additional info. With HAProxy socket, for example, the interaction is as simple as:
echo show env | socat -ddd - UNIX-CONNECT:/var/lib/haproxy/run/haproxy.sock
Good news, sort of. I was able to make this work with socat:
send() {
local data="$1"
local len=${#data}
local b0=$(printf '%02x' ${len}) # TODO: this is limited to the payload length of 255 B
local b1=00
local b2=00
local b3=00
local socket=/run/tuned/tuned.sock
printf "\x$b3\x$b2\x$b1\x$b0" | socat - UNIX-SENDTO:$socket,bind=/tmp/client.sock
printf "$data" | socat - UNIX-SENDTO:$socket,bind=/tmp/client.sock
}
send '["profiles"]'
However, it seems not to always succeed. One example of a failure from TuneD:
2022-12-14 16:55:00,507 ERROR tuned.exports.unix_socket_exporter: Failed to load data of message: [Errno 11] Resource temporarily unavailable
This needs to be 100% reliable and working with netcat or socat. I wonder if we should switch from SOCK_DGRAM to SOCK_STREAM as HAProxy does, for example, for their stats socket.
/cc @MarSik
Thank you for the latest commit, Honzo! This is great progress not having to use the payload length and just:
printf '["active_profile"]' | nc -U /run/tuned/tuned.sock # or
printf '["active_profile"]' | socat - UNIX-CONNECT:/run/tuned/tuned.sock
The default socat call often fails to output any result (with WARNING tuned.exports.unix_socket_exporter: Failed send data '<profile-name>' type 'res': [Errno 32] Broken pipe in TuneD logs). This is due to client disconnects, where nc seems to wait (longer?) for a response, whereas the default 0.5s timeout for socat is just not sufficient for this server-side implementation.
printf '["active_profile"]' | socat -t5 - UNIX-CONNECT:/run/tuned/tuned.sock
seems to work every time.
Testing against HAProxy implementation I had no such issues. So at this point I'm wondering whether:
- it is worth/possible/practical improving the server-side (TuneD) implementation to improve the client (user) experience
- we should just document this
I think I'm probably fine with the latter. @MarSik , thoughts?
Reporting a traceback during testing:
Traceback (most recent call last):
File "/usr/sbin/tuned", line 98, in <module>
app.run(args.daemon)
File "/usr/lib/python3.6/site-packages/tuned/daemon/application.py", line 214, in run
result = self._controller.run()
File "/usr/lib/python3.6/site-packages/tuned/daemon/controller.py", line 65, in run
exports.period_check()
File "/usr/lib/python3.6/site-packages/tuned/exports/__init__.py", line 46, in period_check
return ctl.period_check()
File "/usr/lib/python3.6/site-packages/tuned/exports/controller.py", line 54, in period_check
exporter.period_check()
File "/usr/lib/python3.6/site-packages/tuned/exports/unix_socket_exporter.py", line 162, in period_check
r, _, _ = select.select([self._socket_object], (), (), 0)
TypeError: argument must be an int, or have a fileno() method.
So far I've seen it only once
Thank you for the latest commit, Honzo! This is great progress not having to use the payload length and just:
printf '["active_profile"]' | nc -U /run/tuned/tuned.sock # or printf '["active_profile"]' | socat - UNIX-CONNECT:/run/tuned/tuned.sockThe default
socatcall often fails to output any result (withWARNING tuned.exports.unix_socket_exporter: Failed send data '<profile-name>' type 'res': [Errno 32] Broken pipein TuneD logs). This is due to client disconnects, wherencseems to wait (longer?) for a response, whereas the default 0.5s timeout forsocatis just not sufficient for this server-side implementation.printf '["active_profile"]' | socat -t5 - UNIX-CONNECT:/run/tuned/tuned.sockseems to work every time.
Testing against
HAProxyimplementation I had no such issues. So at this point I'm wondering whether:* it is worth/possible/practical improving the server-side (TuneD) implementation to improve the client (user) experience * we should just document thisI think I'm probably fine with the latter. @MarSik , thoughts?
Can add different logging, but I don't think socket allow to check if other side is still connected.
Reporting a traceback during testing:
Traceback (most recent call last): File "/usr/sbin/tuned", line 98, in <module> app.run(args.daemon) File "/usr/lib/python3.6/site-packages/tuned/daemon/application.py", line 214, in run result = self._controller.run() File "/usr/lib/python3.6/site-packages/tuned/daemon/controller.py", line 65, in run exports.period_check() File "/usr/lib/python3.6/site-packages/tuned/exports/__init__.py", line 46, in period_check return ctl.period_check() File "/usr/lib/python3.6/site-packages/tuned/exports/controller.py", line 54, in period_check exporter.period_check() File "/usr/lib/python3.6/site-packages/tuned/exports/unix_socket_exporter.py", line 162, in period_check r, _, _ = select.select([self._socket_object], (), (), 0) TypeError: argument must be an int, or have a fileno() method.So far I've seen it only once
Could you replicate it or dis it happen again? I couldn't replicate it
So the good news is that after the latest commit I can no longer see the trace above and that the requests are being served reliably. However, one per second. Is this expected?
So the good news is that after the latest commit I can no longer see the trace above and that the requests are being served reliably. However, one per second. Is this expected?
You're right, we can do better.
... we can do better.
To give you some sense of what is expected. On a 112 CPU r750, with the default 50 QPS I created 100 pods in parallel. They were all PodScheduled within 1s delta and the average "startup" time between PodScheduled and Ready was 5.19s . I may try tweaking the QPS, but I doubt it will have much significance in reducing the startup times. I used CPUManager static policy and all containers were assigned to its own CPU.
Thank you for the changes, Honzo! I was able to test them with success, however, it needs a bit more work:
printf "DoS" | nc --send-only -U /run/tuned/tuned.sockTraceback (most recent call last): File "/usr/lib/python3.10/site-packages/tuned/exports/unix_socket_exporter.py", line 220, in period_check data = json.loads(data) File "/usr/lib64/python3.10/json/__init__.py", line 346, in loads return _default_decoder.decode(s) File "/usr/lib64/python3.10/json/decoder.py", line 337, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/usr/lib64/python3.10/json/decoder.py", line 355, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.10/site-packages/tuned/exports/unix_socket_exporter.py", line 140, in _send_data s.send(json.dumps(data).encode("utf-8")) BrokenPipeError: [Errno 32] Broken pipe During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/sbin/tuned", line 98, in <module> app.run(args.daemon) File "/usr/lib/python3.10/site-packages/tuned/daemon/application.py", line 215, in run result = self._controller.run() File "/usr/lib/python3.10/site-packages/tuned/daemon/controller.py", line 65, in run exports.period_check() File "/usr/lib/python3.10/site-packages/tuned/exports/__init__.py", line 46, in period_check return ctl.period_check() File "/usr/lib/python3.10/site-packages/tuned/exports/controller.py", line 54, in period_check exporter.period_check() File "/usr/lib/python3.10/site-packages/tuned/exports/unix_socket_exporter.py", line 223, in period_check self._send_data(conn, self._create_error_responce(-32700, "Parse error", str(e))) File "/usr/lib/python3.10/site-packages/tuned/exports/unix_socket_exporter.py", line 142, in _send_data log.warning("Failed to send data '%s' type '%s': %s" % (data, e)) TypeError: not enough arguments for format string
Good catch, should be corrected