java-client icon indicating copy to clipboard operation
java-client copied to clipboard

Client creates a thread that stops graceful application shutdown

Open checketts opened this issue 11 months ago • 12 comments

When restarting my application (for local development) adding the SplitFactoryBuilder adds a thread or something that blocks graceful shotdown.

    final val splitFactory: SplitFactory = SplitFactoryBuilder.build(apiKey, config)

This impact local development since the JVM doesn't completely stop, making restarting require a kill command.

I'm using Java Client 4.14.0 and I have not set any sort of 'disable shutdown' setting. Here is my complete code that I can comment out and return to the correct graceful shutdown:

    final val config: SplitClientConfig = SplitClientConfig.builder()
        .setBlockUntilReadyTimeout(10000)
        .build()
    final val splitFactory: SplitFactory = SplitFactoryBuilder.build(apiKey, config)

I suspect the call Thread.interrupted() in SyncManagerImpl is the problem (it should likely be Thread.currentThread().isInterrupted())

Logs:

21:31:11.620Z  INFO 23824 --- [ionShutdownHook] io.split.client.SplitFactoryImpl         : Shutdown called for split
2025-01-31T21:31:11.620Z  INFO 23824 --- [ionShutdownHook] io.split.engine.common.SyncManagerImp    : Shutting down SyncManagerImp
2025-01-31T21:31:11.620Z DEBUG 23824 --- [ionShutdownHook] io.split.engine.common.Synchronizer      : Stop Periodic Fetching ...
2025-01-31T21:31:11.620Z DEBUG 23824 --- [ionShutdownHook] io.split.engine.common.PushManager       : Stopping PushManagerImp
2025-01-31T21:31:11.621Z  INFO 23824 --- [ionShutdownHook] io.split.engine.sse.EventSourceClient    : Stopping EventSourceClientImp
2025-01-31T21:31:11.622Z DEBUG 23824 --- [StatusMonitor-0] io.split.engine.common.SyncManagerImp    : Streaming status received: STREAMING_DOWN
2025-01-31T21:31:11.622Z  INFO 23824 --- [StatusMonitor-0] io.split.engine.common.SyncManagerImp    : Streaming service temporarily unavailable, working in polling mode.
2025-01-31T21:31:11.622Z DEBUG 23824 --- [StatusMonitor-0] io.split.engine.common.Synchronizer      : Starting Periodic Fetching ...

checketts avatar Jan 31 '25 21:01 checketts

It appears that something is hanging in the destroy() shutdown logic. So disableDestroyOnShutDown() solved it for me, though I would like destroy to work correctly..

checketts avatar Feb 01 '25 17:02 checketts

Hi @checketts, If you are running the application in local mode, you need to set the splitFile configuration. Here’s a code example:

SplitClientConfig config = SplitClientConfig.builder() .splitFile("parentRoot/featureFlags.json") .segmentDirectory("parentRoot/segments") .setBlockUntilReadyTimeout(10000) .build();

Let me know if this resolves your issue.

Thanks,

Nadia

nmayorsplit avatar Feb 04 '25 14:02 nmayorsplit

I'm not using local mode. And the client is starting fine and serving up flags correctly. Just the destroy method seems to hang and make shutdown not work correctly.

checketts avatar Feb 04 '25 14:02 checketts

Hi @checketts, thanks for the info, I cannot repro on my end, I do see periodic fetching starts after shutting down streaming, but it gets stopped immediately, see below. Could you please provide the complete debug log that shows activities after the shutdown is completed?

 Streaming service temporarily unavailable, working in polling mode.
2025-02-11 12:05:56,313 DEBUG - The thread was stopped.
2025-02-11 12:05:56,313 DEBUG - The thread was stopped.
2025-02-11 12:05:56,313 DEBUG - Feature flags Worker not running.
2025-02-11 12:05:56,313 DEBUG - Segments Worked stopped.
2025-02-11 12:05:56,314 DEBUG - Cancel nextTokenRefreshTask
2025-02-11 12:05:56,314 DEBUG - SSEClient finished.
2025-02-11 12:05:56,314 DEBUG - Segments Worker not running.
2025-02-11 12:05:56,314 DEBUG - Starting Periodic Fetching ...
2025-02-11 12:05:56,314 DEBUG - Starting PeriodicFetching Feature flags ...
2025-02-11 12:05:56,314 INFO  - Successful shutdown of segment fetchers
2025-02-11 12:05:56,315 DEBUG - Splits PeriodicFetching not running...

chillaq avatar Feb 11 '25 20:02 chillaq

OK I'll try to reproduce it in a minimal Spring application to ensure there isn't something else in my project causing trouble

checketts avatar Feb 11 '25 20:02 checketts

Hi @checketts, no need to send us logs, I will simply add a check before starting the periodic sync if a shutdown is in progress, that should fix the isssue.

Will provide the rc build once it is ready.

Thanks Bilal

chillaq avatar Feb 11 '25 21:02 chillaq

Hi @checketts, i released version 4.14.1-rc1, could you please test it and let us know if it fixes the issue?

Thanks Bilal

chillaq avatar Feb 12 '25 20:02 chillaq

I can't seem to pull down the rc. I also don't see it in maven central yet: https://mvnrepository.com/artifact/io.split.client/java-client

Should I wait a little and try again?

checketts avatar Feb 12 '25 20:02 checketts

maven central will show it after few hours, but your maven should still be able to fetch it, only the java-client module is released though, here is the pom entry

<dependency>
        <groupId>io.split.client</groupId>
        <artifactId>java-client</artifactId>
        <version>4.14.1-rc1</version>
    </dependency>

chillaq avatar Feb 12 '25 20:02 chillaq

Can confirm that our organization is also seeing this issue and that I am unable to download 4.14.1-rc1 to verify if the issue is resolved.

chris407etr avatar Feb 21 '25 21:02 chris407etr

Hi @checketts and @chris407etr

The rc version is: 4.14.0-rc2

Sorry about the confusion

Thanks Bilal

chillaq avatar Mar 07 '25 20:03 chillaq

Had the same problem after updating to 4.14

When will the fix get a release? This caused lots of problems and also costs, please add an IT for proper shutdown :D

Dustinhoefer avatar Apr 03 '25 07:04 Dustinhoefer

Same issue, we will roll back to an earlier SDK version. Relevant thread dump of a hung application that is blocked from shutdown, using 4.16.0:

"split-destroy-worker" #122 [171787] prio=5 os_prio=31 cpu=2.90ms elapsed=172.73s tid=0x00000001563cac00 nid=171787 waiting on condition  [0x000000033974d000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
        - parking to wait for  <0x000000064c528b20> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:221)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:754)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:990)
        at java.util.concurrent.locks.ReentrantLock$Sync.lock([email protected]/ReentrantLock.java:153)
        at java.util.concurrent.locks.ReentrantLock.lock([email protected]/ReentrantLock.java:322)
        at sun.security.ssl.SSLSocketImpl$AppInputStream.read([email protected]/SSLSocketImpl.java:1039)
        at split.org.apache.hc.core5.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:149)
        at split.org.apache.hc.core5.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
        at split.org.apache.hc.core5.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:262)
        at split.org.apache.hc.core5.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:223)
        at split.org.apache.hc.core5.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:147)
        at split.org.apache.hc.core5.http.impl.io.ChunkedInputStream.close(ChunkedInputStream.java:315)
        at split.org.apache.hc.core5.io.Closer.close(Closer.java:48)
        at split.org.apache.hc.core5.http.impl.io.IncomingHttpEntity.close(IncomingHttpEntity.java:111)
        at split.org.apache.hc.core5.http.io.entity.HttpEntityWrapper.close(HttpEntityWrapper.java:120)
        at split.org.apache.hc.client5.http.impl.classic.ResponseEntityProxy.close(ResponseEntityProxy.java:180)
        at split.org.apache.hc.core5.io.Closer.close(Closer.java:48)
        at split.org.apache.hc.core5.http.message.BasicClassicHttpResponse.close(BasicClassicHttpResponse.java:93)
        at split.org.apache.hc.client5.http.impl.classic.CloseableHttpResponse.doClose(CloseableHttpResponse.java:207)
        at split.org.apache.hc.client5.http.impl.classic.CloseableHttpResponse.close(CloseableHttpResponse.java:228)
        at io.split.engine.sse.client.SSEClient.close(SSEClient.java:125)
        at io.split.engine.sse.EventSourceClientImp.stop(EventSourceClientImp.java:101)
        at io.split.engine.common.PushManagerImp.cleanUpResources(PushManagerImp.java:170)
        at io.split.engine.common.PushManagerImp.stop(PushManagerImp.java:131)
        at io.split.engine.common.SyncManagerImp.shutdown(SyncManagerImp.java:174)
        at io.split.client.SplitFactoryImpl.destroy(SplitFactoryImpl.java:499)
        - locked <0x000000064562f7a0> (a io.split.client.SplitFactoryImpl)
        at io.split.client.SplitFactoryImpl.lambda$new$0(SplitFactoryImpl.java:288)
        at io.split.client.SplitFactoryImpl$$Lambda$5832/0x0000000802796f70.run(Unknown Source)
        at java.lang.Thread.runWith([email protected]/Thread.java:1636)
        at java.lang.Thread.run([email protected]/Thread.java:1623)

"SPLIT-SSEConnection-0" #136 [70147] daemon prio=5 os_prio=31 cpu=15.70ms elapsed=306.28s tid=0x000000015001ca00 nid=70147 waiting on condition  [0x000000032479e000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
        - parking to wait for  <0x0000000643602130> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:371)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:519)
        at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3780)
        at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3725)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1707)
        at java.util.concurrent.LinkedBlockingQueue.take([email protected]/LinkedBlockingQueue.java:435)
        at java.util.concurrent.ThreadPoolExecutor.getTask([email protected]/ThreadPoolExecutor.java:1070)
        at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:642)
        at java.lang.Thread.runWith([email protected]/Thread.java:1636)
        at java.lang.Thread.run([email protected]/Thread.java:1623)

   Locked ownable synchronizers:
        - <0x000000064c528b20> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

adrian-skybaker avatar Jul 11 '25 06:07 adrian-skybaker

Hi @adrian-skybaker and @checketts, We released Java SDK 4.18.1 which fixes a hanging thread when shutting down the Streaming client, this potentially can cause this issue as well, I can see it in the thread dump.

Could you please test this version and let us know if it fix the issue?

Thanks Bilal

chillaq avatar Sep 30 '25 21:09 chillaq

Hi @chris407etr and @Dustinhoefer , please see previous message, would love to get your feedback if newest SDK release 4.18.1 fixes this issue.

Thanks Bilal

chillaq avatar Oct 02 '25 16:10 chillaq