java-driver Fix uncaught exception during graceful channel shutdown after exceeding max orphan ids

Hi, this PR fixes the warning log spam that occurs when queries are cancelled after the maximum number of orphaned stream ids has been exceeded. An example of the warning:

2024-06-04 05:36:38 UTC WARNING [<server,0xe>] [<<com.datastax.oss.driver.internal.core.util>>, UncaughtExceptions] Uncaught exception in scheduled task
java.lang.IllegalStateException: complete already: DefaultChannelPromise@750fc995(success)
	at io.netty.util.concurrent.DefaultPromise.setSuccess(DefaultPromise.java:100)
	at io.netty.channel.DefaultChannelPromise.setSuccess(DefaultChannelPromise.java:78)
	at io.netty.channel.DefaultChannelPromise.setSuccess(DefaultChannelPromise.java:73)
	at com.datastax.oss.driver.internal.core.channel.InFlightHandler.startGracefulShutdown(InFlightHandler.java:207)
	at com.datastax.oss.driver.internal.core.channel.InFlightHandler.cancel(InFlightHandler.java:186)
	at com.datastax.oss.driver.internal.core.channel.InFlightHandler.write(InFlightHandler.java:110)
	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:879)
	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:863)
	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:968)
	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:856)
	at io.netty.channel.DefaultChannelPipeline.write(DefaultChannelPipeline.java:1015)
	at io.netty.channel.AbstractChannel.write(AbstractChannel.java:301)
	at com.datastax.oss.driver.internal.core.channel.DefaultWriteCoalescer$Flusher.runOnEventLoop(DefaultWriteCoalescer.java:100)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Unknown Source)
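
To illustrate the failure mode in the trace above, here is a minimal, dependency-free sketch. It is not the driver's code and not necessarily the change made in this PR: `OneShotPromise` and `Handler` are hypothetical stand-ins for Netty's promise and for `InFlightHandler`, and the orphan counting is simplified. The only behaviour borrowed from Netty is that `setSuccess()` throws `IllegalStateException` on an already completed promise, while `trySuccess()` returns `false` instead.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class GracefulShutdownSketch {

  /** Hypothetical stand-in for a Netty promise: it can complete at most once. */
  static final class OneShotPromise {
    private final AtomicBoolean done = new AtomicBoolean(false);

    /** Strict completion: throws if the promise is already complete. */
    void setSuccess() {
      if (!done.compareAndSet(false, true)) {
        throw new IllegalStateException("complete already");
      }
    }

    /** Lenient completion: returns false on a repeat call instead of throwing. */
    boolean trySuccess() {
      return done.compareAndSet(false, true);
    }
  }

  /** Hypothetical, heavily simplified handler that tracks orphaned requests. */
  static final class Handler {
    private final OneShotPromise closeStarted = new OneShotPromise();
    private final int maxOrphans;
    private final boolean lenient;
    private int orphans;

    Handler(int maxOrphans, boolean lenient) {
      this.maxOrphans = maxOrphans;
      this.lenient = lenient;
    }

    /** Every cancellation past the limit triggers the shutdown path again. */
    void cancel() {
      orphans++;
      if (orphans > maxOrphans) {
        startGracefulShutdown();
      }
    }

    private void startGracefulShutdown() {
      if (lenient) {
        closeStarted.trySuccess(); // repeat calls are harmless no-ops
      } else {
        closeStarted.setSuccess(); // the second and later calls throw
      }
    }
  }

  /** Returns how many of the given number of cancellations threw. */
  static int countFailures(int maxOrphans, int cancels, boolean lenient) {
    Handler handler = new Handler(maxOrphans, lenient);
    int failures = 0;
    for (int i = 0; i < cancels; i++) {
      try {
        handler.cancel();
      } catch (IllegalStateException e) {
        failures++;
      }
    }
    return failures;
  }

  public static void main(String[] args) {
    System.out.println("strict failures:  " + countFailures(2, 5, false));
    System.out.println("lenient failures: " + countFailures(2, 5, true));
  }
}
```

With a limit of 2 and 5 cancellations, the strict variant fails on the fourth and fifth calls, while the lenient variant never throws. Guarding the shutdown path with a "shutdown already started" flag would be another way to get the same idempotent behaviour.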

Jun 06 '24 11:06 christianaistleitner

Is there anything else I need to do to get this PR merged? 🙂

Jul 11 '24 08:07 christianaistleitner

@tolbertam Apologies for the ping, but can we please merge this PR? It would be really helpful to have it included in the next release.

Although we significantly increased advanced.connection.max-orphan-requests as a temporary workaround, we still frequently encounter the warning, and as a result I keep receiving bug tickets asking me to investigate this issue again.

Sep 18 '24 08:09 christianaistleitner

Apologies for not following up on this. Let me see if I can get another +1 today so we can get this merged!

Sep 18 '24 13:09 tolbertam

Agree with @tolbertam, nice find @christianaistleitner! It took me a minute to work out exactly which scenario you were aiming to fix here, but now that it's clear, I agree that this is an elegant way to address this behaviour.

I've got a Jenkins run going now to make sure there aren't any unexpected regressions (which I very much do not expect). Once that's complete this will get a +1 from me as well.

Sep 18 '24 21:09 absurdfarce

Jenkins run looks good: only a few test failures, all caused by either environmental issues or known flaky tests. We're all set here.

Sep 18 '24 23:09 absurdfarce

Amended the commit to include the review information; proceeding with the merge. Thank you @christianaistleitner!

Sep 19 '24 01:09 tolbertam