vert.x icon indicating copy to clipboard operation
vert.x copied to clipboard

Noisy logging from TCPServerBase#listen during startup

Open MikeEdgar opened this issue 3 years ago • 9 comments

Version

  • 4.1.7 (via Quarkus 2.2.5)
  • 4.2.4 (via Quarkus 2.7.5)

Context

In a resource-constrained environment (e.g. Kubernetes with limited a CPU request/limit), the start-up of the HTTP server results in multiple blocked thread stack traces being logged. When deploying both TLS and plain text hosts this increases as well.

2022-07-15 11:41:54,804 WARN  [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 2695 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked
	at io.vertx.core.net.impl.TCPServerBase.listen(TCPServerBase.java:125)
	at io.vertx.core.net.impl.TCPServerBase.bind(TCPServerBase.java:100)
	at io.vertx.core.http.impl.HttpServerImpl.listen(HttpServerImpl.java:217)
	at io.vertx.core.http.impl.HttpServerImpl.listen(HttpServerImpl.java:149)
	at io.vertx.core.http.impl.HttpServerImpl.listen(HttpServerImpl.java:154)
	at io.quarkus.vertx.http.runtime.VertxHttpRecorder$WebDeploymentVerticle.setupTcpHttpServer(VertxHttpRecorder.java:1075)
	at io.quarkus.vertx.http.runtime.VertxHttpRecorder$WebDeploymentVerticle.start(VertxHttpRecorder.java:1029)

Scenario is also referenced in #1379

Extra

Given the likelihood of this situation occurring and producing log noise, it seems it may be desirable to execute listen using internal worker threads. I've verified this works with my application [1], but would like to discuss here if this approach is correct more broadly within Vert.x.

[1] https://github.com/eclipse-vertx/vert.x/compare/master...MikeEdgar:vert.x:tcpserverbase-listen-blocking?expand=1

MikeEdgar avatar Jul 15 '22 19:07 MikeEdgar

binding a server is a non blocking interaction with the system, can you elaborate what do you think make it blocking ?

vietj avatar Jul 17 '22 18:07 vietj

Hi @vietj , what led me to believe it was blocking were stack traces like the following. This is from version 4.1.7 in Quarkus 2.2.5, but I don't think it's fundamentally changed in more recent versions. I'll be able to get an updated stack trace tomorrow if necessary.

2022-07-16 23:32:09,249 WARN  [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-0,5,main] has been blocked for 3399 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked
	at [email protected]/sun.security.util.DerInputStream.getByte(DerInputStream.java:551)
	at [email protected]/sun.security.util.ObjectIdentifier.<init>(ObjectIdentifier.java:284)
	at [email protected]/sun.security.util.DerInputStream.getOID(DerInputStream.java:323)
	at [email protected]/sun.security.x509.AVA.<init>(AVA.java:602)
	at [email protected]/sun.security.x509.RDN.<init>(RDN.java:245)
	at [email protected]/sun.security.x509.X500Name.parseDER(X500Name.java:797)
	at [email protected]/sun.security.x509.X500Name.<init>(X500Name.java:307)
	at [email protected]/sun.security.x509.X509CertInfo.parse(X509CertInfo.java:666)
	at [email protected]/sun.security.x509.X509CertInfo.<init>(X509CertInfo.java:169)
	at [email protected]/sun.security.x509.X509CertImpl.parse(X509CertImpl.java:1842)
	at [email protected]/sun.security.x509.X509CertImpl.<init>(X509CertImpl.java:194)
	at [email protected]/sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:105)
	at [email protected]/java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:355)
	at [email protected]/sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:767)
	at [email protected]/sun.security.util.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:243)
	at [email protected]/java.security.KeyStore.load(KeyStore.java:1479)
	at [email protected]/sun.security.ssl.TrustStoreManager$TrustAnchorManager.loadKeyStore(TrustStoreManager.java:365)
	at [email protected]/sun.security.ssl.TrustStoreManager$TrustAnchorManager.getTrustedCerts(TrustStoreManager.java:313)
	at [email protected]/sun.security.ssl.TrustStoreManager.getTrustedCerts(TrustStoreManager.java:55)
	at [email protected]/sun.security.ssl.TrustManagerFactoryImpl.engineInit(TrustManagerFactoryImpl.java:49)
	at [email protected]/javax.net.ssl.TrustManagerFactory.init(TrustManagerFactory.java:278)
	at [email protected]/sun.security.ssl.SSLContextImpl.engineInit(SSLContextImpl.java:89)
	at [email protected]/javax.net.ssl.SSLContext.init(SSLContext.java:297)
	at app//io.vertx.core.net.impl.DefaultJDKCipherSuite.<clinit>(DefaultJDKCipherSuite.java:30)
	at app//io.vertx.core.net.impl.SSLHelper.createContext(SSLHelper.java:263)
	at app//io.vertx.core.net.impl.SSLHelper.getContext(SSLHelper.java:469)
	at app//io.vertx.core.net.impl.SSLHelper.getContext(SSLHelper.java:456)
	at app//io.vertx.core.net.impl.SSLHelper.validate(SSLHelper.java:494)
	at app//io.vertx.core.net.impl.TCPServerBase.listen(TCPServerBase.java:130)
	at app//io.vertx.core.http.impl.HttpServerImpl.listen(HttpServerImpl.java:220)
	at app//io.vertx.core.http.impl.HttpServerImpl.listen(HttpServerImpl.java:150)
	at app//io.vertx.core.http.impl.HttpServerImpl.listen(HttpServerImpl.java:155)
	at app//io.quarkus.vertx.http.runtime.VertxHttpRecorder$WebDeploymentVerticle.setupTcpHttpServer(VertxHttpRecorder.java:990)
	at app//io.quarkus.vertx.http.runtime.VertxHttpRecorder$WebDeploymentVerticle.start(VertxHttpRecorder.java:944)
	at app//io.vertx.core.impl.DeploymentManager.lambda$doDeploy$5(DeploymentManager.java:196)
	at app//io.vertx.core.impl.DeploymentManager$$Lambda$377/0x00000008402fd440.handle(Unknown Source)
	at app//io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:100)
	at app//io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:63)
	at app//io.vertx.core.impl.EventLoopContext.lambda$runOnContext$0(EventLoopContext.java:38)
	at app//io.vertx.core.impl.EventLoopContext$$Lambda$378/0x0000000840328040.run(Unknown Source)
	at app//io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
	at app//io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
	at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
	at app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at [email protected]/java.lang.Thread.run(Thread.java:829)

MikeEdgar avatar Jul 17 '22 19:07 MikeEdgar

I see, I think we should modify SSLHelper to have the validate operation to be executed on a worker thread then.

vietj avatar Jul 18 '22 07:07 vietj

Thanks, I'll test that out to confirm it.

MikeEdgar avatar Jul 18 '22 09:07 MikeEdgar

@vietj , updating listen to run on a worker thread (from bind) was more obvious to me since it's already returning a Future. In order to alter SSLHelper#validate to run on a worker thread, can you advise how best to accomplish that? Since the remaining code in TCPServerBase#listen presumably depends on the outcome of validate, I would think a Future needs to be returned, which would likely change the synchronization approach in listen, correct?

MikeEdgar avatar Jul 18 '22 15:07 MikeEdgar

that's correct, the SSLHelper should validation should return a future and the listen would use it to continue the internal execution of the server bind.

vietj avatar Jul 18 '22 17:07 vietj

I've had some success with the changes to SSLHelper as they relate to TCPServerBase. It's take some re-factoring of the TCPServerBase#listen method to account for handling of shared servers, I'll post a draft PR tomorrow for feedback.

I've also found that SSLHelper#validate is called from the NetClientImpl constructor, which is proving to be a bit more challenging.

MikeEdgar avatar Jul 20 '22 21:07 MikeEdgar

I think we can move validation to another method than the constructor

On Wed, Jul 20, 2022 at 11:57 PM Michael Edgar @.***> wrote:

I've had some success with the changes to SSLHelper as they relate to TCPServerBase. It's take some re-factoring of the TCPServerBase#listen method to account for handling of shared servers, I'll post a draft PR tomorrow for feedback.

I've also found that SSLHelper#validate is called from the NetClientImpl constructor, which is proving to be a bit more challenging.

— Reply to this email directly, view it on GitHub https://github.com/eclipse-vertx/vert.x/issues/4439#issuecomment-1190802643, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABXDCTDXGLZ2HBYGLHMW33VVBY47ANCNFSM53WQA53Q . You are receiving this because you were mentioned.Message ID: @.***>

vietj avatar Jul 21 '22 06:07 vietj

I opened https://github.com/eclipse-vertx/vert.x/pull/4441 for this.

MikeEdgar avatar Jul 21 '22 13:07 MikeEdgar