reuse address on server
start_http_server should reuse the port if it is already in use, most likely because an earlier instance went down and we are restarting.
  File "lib64/python3.4/site-packages/prometheus_client/exposition.py", line 103, in run
    httpd = HTTPServer((addr, port), MetricsHandler)
  File "/usr/lib64/python3.4/socketserver.py", line 430, in __init__
    self.server_bind()
  File "/usr/lib64/python3.4/http/server.py", line 136, in server_bind
    socketserver.TCPServer.server_bind(self)
  File "/usr/lib64/python3.4/socketserver.py", line 444, in server_bind
    self.socket.bind(self.server_address)
OSError: [Errno 98] Address already in use
Would you like to send a PR for this?
To reproduce: run a simple server, stop it with Ctrl-C, and quickly launch it again. Repeat over and over. On Linux the socket's address stays unavailable for a while after the server stops, to prevent stale messages from being sent to the wrong server. In this case we want to serve new requests immediately.
See https://stackoverflow.com/questions/10705680/how-can-i-restart-a-basehttpserver-instance/10706603#10706603
The fix is below. I also made start_http_server return the HTTP server so that the main loop can close it cleanly. Something like:
if __name__ == '__main__':
    port = sys.argv[1]
    # Start up the server to expose the metrics.
    server = start_http_server(int(port))
    try:
        process_commands()
    finally:
        server.shutdown()
        server.server_close()
exposition.py
--- exposition.py original
+++ exposition.py (working copy)
@@ -96,17 +96,27 @@
         return
 
+class HttpServerReuseSocket(HTTPServer):
+    """ HTTP server that will allow for the socket to be reused
+    so the server can start up again, even if there was a previous
+    instance that ended abruptly.
+    """
+    def server_bind(self):
+        HTTPServer.server_bind(self)
+        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+
 def start_http_server(port, addr=''):
     """Starts a HTTP server for prometheus metrics as a daemon thread."""
+    httpd = HttpServerReuseSocket((addr, port), MetricsHandler)
     class PrometheusMetricsServer(threading.Thread):
         def run(self):
-            httpd = HTTPServer((addr, port), MetricsHandler)
             httpd.serve_forever()
     t = PrometheusMetricsServer()
     t.daemon = True
     t.start()
+    return httpd
Are you still open for a pull request for this issue?
I'm encountering this problem running a client under supervisord, when it tries to restart a process or tries to start a failed process repeatedly.
I would be happy to apply and test @marcpawl's changes.
-tim
Sure.
It turns out that my particular problem is caused by an orphaned subprocess which has inherited the HTTPServer's listen file descriptor. Calling subprocess.Popen with close_fds=True avoids my problem.
https://www.python.org/dev/peps/pep-0446/#id11 indicates that Python 3.2 changed the default close_fds value from False to True. On older versions, unless close_fds=True is passed explicitly, any child processes that linger, even briefly (e.g. 5 to 10 seconds), can interfere with a restart.
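For anyone who wants to see the effect, here is a minimal sketch, assuming a POSIX system and Python 3.4+. The helper name child_sees_listener and the child snippet are made up for illustration. The listening socket stands in for the HTTPServer's listen fd, and it is deliberately marked inheritable to mimic the older default behaviour.

```python
import socket
import subprocess
import sys

# Child snippet: report whether the fd number passed in argv is open in the child.
CHILD_CODE = """
import os, sys
try:
    os.fstat(int(sys.argv[1]))
    print("inherited")
except OSError:
    print("closed")
"""

def child_sees_listener(close_fds):
    """Hypothetical helper: spawn a child and report what it sees of our listen fd."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        # Force the fd to be inheritable, mimicking pre-PEP 446 behaviour.
        listener.set_inheritable(True)
        out = subprocess.check_output(
            [sys.executable, "-c", CHILD_CODE, str(listener.fileno())],
            close_fds=close_fds,
        )
        return out.decode().strip()
    finally:
        listener.close()

print("close_fds=False:", child_sees_listener(False))
print("close_fds=True: ", child_sees_listener(True))
```

A child that holds the inherited fd keeps the port bound for as long as it lives, which is what blocks the restart.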
I'm attempting to reproduce the problem as reported originally in a process without forks.
-tim
I have not been able to reproduce the problem beyond having an orphan child process keeping the HTTPServer's listen fd open.
I confirmed that Python's HTTPServer already sets the SO_REUSEADDR option on its listen socket, and has done so since at least 2.6.0 (and as far back as 2000). This is mentioned as a comment in https://stackoverflow.com/questions/10705680/how-can-i-restart-a-basehttpserver-instance/10706603#10706603. I checked the source and printed the socket option's value to confirm.
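A check along these lines can be repeated with a few lines of code. This is a sketch using the Python 3 module names (http.server rather than BaseHTTPServer), with BaseHTTPRequestHandler standing in for MetricsHandler:

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

# Bind to an ephemeral port (port 0) so the check does not collide with anything.
httpd = HTTPServer(("127.0.0.1", 0), BaseHTTPRequestHandler)
try:
    # Class attribute that makes server_bind() set SO_REUSEADDR before bind().
    reuse_flag = HTTPServer.allow_reuse_address
    # Value read back from the live listening socket; non-zero means enabled.
    reuse_opt = httpd.socket.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
finally:
    httpd.server_close()

print("allow_reuse_address:", reuse_flag)
print("SO_REUSEADDR on socket:", reuse_opt)
```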
Based on your desire to avoid exposing the HTTP server implementation, I am not inclined to return the HTTPServer from start_http_server.
-tim
Instead of setting SO_REUSEADDR (which is already set), I think start_http_server should set FD_CLOEXEC, to avoid exec'd subprocesses inheriting the fd and reducing the chance of an "Address already in use" problem. I suspect that, like me, the original poster's problem was related to an inherited fd and not a TIME_WAIT.
PEP 446 - Make newly created file descriptors non-inheritable is already in Python 3.4, so this change would make client_python behavior on all versions of Python consistent.
The following code would be applied, subject to the availability of the fcntl module.
flags = fcntl.fcntl(httpd.socket, fcntl.F_GETFD)
if flags != -1:
    fcntl.fcntl(httpd.socket, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
It applies to exec's only and not simply forks, so it does not address child processes created by the multiprocessing module.
But it covers os.system calls and subprocess calls that do not specify the close_fds parameter, and reduces the risk of introducing client_python into an application.
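To make the "subject to the availability of the fcntl module" part concrete, here is a rough sketch. The helper names set_cloexec and is_cloexec are hypothetical and not part of client_python; on platforms without fcntl the helper does nothing.

```python
import socket

try:
    import fcntl  # POSIX only; not available on Windows
except ImportError:
    fcntl = None

def set_cloexec(sock):
    """Hypothetical helper: mark sock's fd close-on-exec. Returns True if applied."""
    if fcntl is None:
        return False
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
    fcntl.fcntl(sock.fileno(), fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
    return True

def is_cloexec(sock):
    """Hypothetical helper: report whether FD_CLOEXEC is set on sock's fd."""
    return bool(fcntl.fcntl(sock.fileno(), fcntl.F_GETFD) & fcntl.FD_CLOEXEC)

if fcntl is not None:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.set_inheritable(True)  # clear the flag first to mimic pre-3.4 sockets
    print("before:", is_cloexec(s))
    set_cloexec(s)
    print("after: ", is_cloexec(s))
    s.close()
```

In start_http_server the same call would be made on httpd.socket right after the HTTPServer is constructed.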
As an aside, another approach is to ensure that all child processes die with their parent,
rather than becoming orphaned.
Running a program under supervisord with the stopasgroup option can help with that.
-tim
Hi all, Is there any update on shutting down the httpd server?
Thank you, Yarden
Any update on this issue?
Any workarounds?
I've been living with my supervisord workaround described in my last post. I had made my proposed change in my own build, which appeared also to work, however I never took the time to submit it, as my workaround was meeting my needs. I need to consider doing that. -tim
I'm encountering this as well, and in my case, it's a process being started in a docker container - when the container is restarted, it inherits the same FD, and fails to start even though it's the only process using that port. It will continue to do this until the container is deleted or the host is rebooted. This fix is sorely needed.
For now, I'm using https://stackoverflow.com/a/61591302/743188 to work around this issue.
A simple fix for this issue might be to have httpd as a global that can be accessed - we are doing something similar. That way the caller can call httpd.shutdown() in a signal handler. Something like:
import logging
import signal
import threading
from http.server import HTTPServer
from typing import Any, Optional

logger = logging.getLogger(__name__)

httpd: Optional[HTTPServer] = None

def handler(signum: int, _frame: Any) -> None:
    """gracefully shutdown servers on signal"""
    signame = signal.Signals(signum).name
    logger.warning(f"Signal handler called with signal {signame} ({signum})")
    if httpd:
        httpd.shutdown()
        logger.info("Shutdown metrics server")

def start_http_server():
    global httpd
    # SomeHandler is a placeholder for the request handler class.
    httpd = HTTPServer(("", 8080), SomeHandler)
    server = threading.Thread(name="metrics", target=httpd.serve_forever)
    server.daemon = True
    server.start()
    logger.debug("Servers started")
Client program:
# Set the signal handler
def __main__():
    ...
    signal.signal(signal.SIGINT, handler)   # ctrl-c
    signal.signal(signal.SIGTERM, handler)  # what kubernetes sends for graceful shutdown
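As a self-contained sketch of the same idea: start_server and stop_server below are hypothetical stand-ins, not client_python functions, and BaseHTTPRequestHandler replaces the real handler. Note that shutdown() only stops serve_forever(); calling server_close() afterwards is what closes the listening socket, so the same port can be bound again in the same process.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def start_server(port, addr="127.0.0.1"):
    """Hypothetical stand-in for start_http_server that returns the server."""
    httpd = HTTPServer((addr, port), BaseHTTPRequestHandler)
    thread = threading.Thread(name="metrics", target=httpd.serve_forever)
    thread.daemon = True
    thread.start()
    return httpd, thread

def stop_server(httpd, thread):
    httpd.shutdown()      # stop serve_forever(); blocks until the loop exits
    httpd.server_close()  # close the listening socket, releasing the port
    thread.join()

first, t1 = start_server(0)        # port 0: let the OS pick a free port
port = first.server_address[1]
stop_server(first, t1)

second, t2 = start_server(port)    # bind the same port again right away
rebound_port = second.server_address[1]
stop_server(second, t2)

print("restarted on the same port:", rebound_port == port)
```

This only covers a clean stop inside one process; it does not help when an orphaned child still holds the inherited fd.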