JAMES-3929 Using http5 client instead of opensearch rest client

Open Arsnael opened this issue 2 years ago • 13 comments

Arsnael avatar Jul 18 '23 10:07 Arsnael

https://ci-builds.apache.org/job/james/job/ApacheJames/job/opensearch-upgrade/1/testReport/junit/org.apache.james.backends.opensearch/OpenSearchIndexerTest/updateMessages/

Seems to be the same issue that @vttranlina encountered... Weird, because it seemed OK when I quickly tested locally yesterday. I will dig into it more today.

Arsnael avatar Jul 19 '23 02:07 Arsnael

Hmmmm, it seems to hang forever, same as in some other tests in OpenSearchListeningMessageSearchIndexTest. As soon as you put OpenSearch (OS) on pause, the requests hang forever. I will check whether we need to configure a timeout or something on the client; it is likely related to the new Transport class.
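For illustration only, here is a minimal sketch of the symptom and of what a response timeout changes. It does not use the OpenSearch HTTP5 transport discussed in this PR: it swaps in the JDK's own `java.net.http.HttpClient` so that it runs without dependencies. The class name `PausedServerTimeoutDemo`, the method name, and the `/_cluster/health` path are all hypothetical. A socket that is bound but never answers stands in for a paused OpenSearch container.

```java
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

public class PausedServerTimeoutDemo {

    /**
     * Sends a GET to a local socket that never answers and reports whether
     * the request was aborted by the response timeout.
     */
    public static boolean timesOutAgainstSilentServer(Duration responseTimeout) throws Exception {
        // Bound but never accept()ed: the OS completes the TCP handshake through
        // the listen backlog, yet no byte is ever sent back (assumption: this is
        // close enough to a paused container for the purpose of the sketch).
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(1))
                .build();
            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://127.0.0.1:" + silentServer.getLocalPort() + "/_cluster/health"))
                // Without this per-request timeout, send() would block indefinitely.
                .timeout(responseTimeout)
                .GET()
                .build();
            try {
                client.send(request, HttpResponse.BodyHandlers.ofString());
                return false; // a response arrived, which the silent server never sends
            } catch (HttpTimeoutException e) {
                return true; // the request was abandoned once the timeout elapsed
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("timed out: " + timesOutAgainstSilentServer(Duration.ofMillis(500)));
    }
}
```

Whether the new Transport class exposes an equivalent setting, and what its default is, still has to be checked against the client itself.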

Arsnael avatar Jul 20 '23 02:07 Arsnael

Exception in thread "OpenSearch-driver-3" java.lang.OutOfMemoryError: Java heap space
	at org.apache.hc.core5.util.ByteArrayBuffer.<init>(ByteArrayBuffer.java:54)
	at org.opensearch.client.transport.httpclient5.internal.HeapBufferedAsyncEntityConsumer.data(HeapBufferedAsyncEntityConsumer.java:91)
	at org.apache.hc.core5.http.nio.entity.AbstractBinDataConsumer.consume(AbstractBinDataConsumer.java:75)
	at org.apache.hc.core5.http.nio.support.AbstractAsyncResponseConsumer.consume(AbstractAsyncResponseConsumer.java:141)
	at org.apache.hc.client5.http.impl.async.HttpAsyncMainClientExec$1.consume(HttpAsyncMainClientExec.java:227)
	at org.apache.hc.core5.http.impl.nio.ClientHttp1StreamHandler.consumeData(ClientHttp1StreamHandler.java:265)
	at org.apache.hc.core5.http.impl.nio.ClientHttp1StreamDuplexer.consumeData(ClientHttp1StreamDuplexer.java:354)
	at org.apache.hc.core5.http.impl.nio.AbstractHttp1StreamDuplexer.onInput(AbstractHttp1StreamDuplexer.java:325)
	at org.apache.hc.core5.http.impl.nio.AbstractHttp1IOEventHandler.inputReady(AbstractHttp1IOEventHandler.java:64)
	at org.apache.hc.core5.http.impl.nio.ClientHttp1IOEventHandler.inputReady(ClientHttp1IOEventHandler.java:41)
	at org.apache.hc.core5.reactor.InternalDataChannel.onIOEvent(InternalDataChannel.java:133)
	at org.apache.hc.core5.reactor.InternalChannel.handleIOEvent(InternalChannel.java:51)
	at org.apache.hc.core5.reactor.SingleCoreIOReactor.processEvents(SingleCoreIOReactor.java:178)
	at org.apache.hc.core5.reactor.SingleCoreIOReactor.doExecute(SingleCoreIOReactor.java:127)
	at org.apache.hc.core5.reactor.AbstractSingleCoreIOReactor.execute(AbstractSingleCoreIOReactor.java:85)
	at org.apache.hc.core5.reactor.IOReactorWorker.run(IOReactorWorker.java:44)
	at java.base/java.lang.Thread.run(Thread.java:829)

Likely related, and a nice problem to debug!

Get your flame graphs / VisualVM dumps ready!

chibenwa avatar Jul 21 '23 02:07 chibenwa

https://ci-builds.apache.org/job/james/job/ApacheJames/job/PR-1648/4/testReport/junit/org.apache.james/WithCassandraBlobStoreTest/oneHundredMailsShouldBeWellReceived_GuiceJamesServer_/

chibenwa avatar Jul 21 '23 02:07 chibenwa

> Likely related, and a nice problem to debug!
>
> Get your flame graphs / VisualVM dumps ready!

I guessed as much from what I saw in the perf tests yesterday... I had 9000+ users (out of 10k total) hanging at the end of the IMAP simulation...

I wanted to do some dumps and graphs today but well... can't access the perf test VM for now

Arsnael avatar Jul 21 '23 02:07 Arsnael

> hanging at the end of the IMAP simulation...

Not related, but don't forget the MAX_DURATION env variable.

vttranlina avatar Jul 21 '23 04:07 vttranlina

> > hanging at the end of the IMAP simulation...
>
> Not related, but don't forget the MAX_DURATION env variable.

I didn't, and the simulation finished because of it. But when you see more than 9k users still hanging just before it finishes, you know something is wrong :)

Arsnael avatar Jul 21 '23 07:07 Arsnael

What is the status of this work?

chibenwa avatar Aug 16 '23 21:08 chibenwa

> What is the status of this work?

Couldn't find a way to solve it

Arsnael avatar Aug 17 '23 03:08 Arsnael

Hoping that the lib update fixes the previously encountered issues.

I no longer seem to encounter the issue we had before on OpenSearchListeningMessageSearchIndexTest when putting OS on pause.

Of course, it still needs to be perf tested before potentially being approved/merged.

Arsnael avatar Feb 05 '24 09:02 Arsnael

Green... I guess it's time to see the perf side of things :)

Arsnael avatar Feb 07 '24 02:02 Arsnael

Don't forget to pick a scenario where OpenSearch is actually called...

chibenwa avatar Feb 07 '24 05:02 chibenwa

> Don't forget to pick a scenario where OpenSearch is actually called...

The full platform scenarios actually do call it... among other things. I think that's good enough for a first run.

Do you have something more specific in mind, maybe to push it further on a 2nd test run?

Arsnael avatar Feb 07 '24 06:02 Arsnael