Flaky ReactivePulsarListenerTombstoneTests$SingleComplexPayload causing random CI PR failures

Open onobc opened this issue 2 years ago • 10 comments

The ReactivePulsarListenerTombstoneTests$SingleComplexPayload test fails randomly from time to time.

Example: https://github.com/spring-projects/spring-pulsar/actions/runs/7735972858/job/21092530057#step:4:7079

This needs to be fixed.

onobc avatar Feb 01 '24 04:02 onobc

Hi @onobc, can I work on this?

ark-tik avatar Feb 18 '24 07:02 ark-tik

Hi @KartikShrivastava,

That would be great. The trick will be to reproduce it. I have not had luck doing so locally. Let me know if you have any questions.

onobc avatar Feb 18 '24 15:02 onobc

Hi @onobc, I couldn't reproduce it locally either. I have two hypotheses, though:

  • Messages are received in a different order than the test expects
  • assertMessagesReceivedWithHeaders is called before latchWithHeaders has finished counting down?

wdyt?

P.S.: Linking the CI PR failures I found associated with this flakiness: [1] [2]

ark-tik avatar Feb 25 '24 14:02 ark-tik
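
The two hypotheses above can be illustrated with a small, self-contained sketch. It is not the code from ReactivePulsarListenerTombstoneTests: the class and method names (`LatchThenAssertSketch`, `receivedAll`) are hypothetical, the listener is simulated by a plain thread, and the messages are assumed to be distinct strings. It only shows the general pattern of waiting for the latch before asserting and of comparing results without relying on arrival order.

```java
import java.util.HashSet;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchThenAssertSketch {

    // Returns true when every expected message arrived, regardless of arrival order.
    // Assumes the expected messages are distinct.
    static boolean receivedAll(List<String> expected) {
        List<String> received = new CopyOnWriteArrayList<>();
        CountDownLatch latch = new CountDownLatch(expected.size());

        // Stand-in for the listener: delivers the messages in reverse order on another thread.
        Thread listener = new Thread(() -> {
            for (int i = expected.size() - 1; i >= 0; i--) {
                received.add(expected.get(i)); // record first ...
                latch.countDown();             // ... then signal
            }
        });
        listener.start();

        try {
            // Hypothesis 2: only check the results after the latch has fully counted down.
            if (!latch.await(5, TimeUnit.SECONDS)) {
                return false;
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }

        // Hypothesis 1: compare without relying on arrival order.
        return received.size() == expected.size()
                && new HashSet<>(received).equals(new HashSet<>(expected));
    }
}
```

Calling `LatchThenAssertSketch.receivedAll(List.of("a", "b", "c"))` returns true even though the simulated listener delivers the messages in reverse order. Whether either hypothesis applies to the real test still has to be confirmed from a CI run.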

Hi @KartikShrivastava, thanks for digging into this!

Those are very plausible theories. I am not opposed to adding temporary code in tests or main (logging, etc.) that may help prove the points. Do you see anything (and anywhere) we can instrument to get some more info during the CI run?

onobc avatar Feb 25 '24 22:02 onobc

I will need to spend more time with this repo before I can comment on where to instrument for more helpful info. But since, as you mentioned, adding temporary code is not a problem, I can create a PR with some extra logging statements that we can use to confirm the value of the receivedMessagesWithHeaders variable right before we assert on it.

ark-tik avatar Feb 26 '24 16:02 ark-tik

Hi @KartikShrivastava, if you do not want to continue looking into this issue, I totally understand. The issues currently in the backlog are rather coarse-grained, and if you are looking for something small to work on there is not a lot to choose from. We have been meaning to put some smaller issues in and mark them accordingly. I will try to get to that this week, but in the meantime, if you want me to find you something else, just let me know. And there is no obligation to do anything - I just wanted to offer this in case.

Thanks

onobc avatar Feb 28 '24 20:02 onobc

Hi @onobc, apologies for not raising that small PR yet. I was planning to do it this coming weekend; my weekday availability is usually very limited.

P.S.: Please feel free to assign me any other small, low-priority task you have in mind. I find this project quite interesting to explore and learn from.

ark-tik avatar Feb 28 '24 22:02 ark-tik

Absolutely no worries, @KartikShrivastava. No need to apologize. I just thought about the issue, realized it is not the most fun thing to work on, and thought you might enjoy something else. I will ping you when we have added some low-priority issues.

Thanks 👍🏻

onobc avatar Feb 28 '24 23:02 onobc

Hi @onobc, I've created a PR adding one info log. Please review it when you have time. Thanks!

ark-tik avatar Mar 03 '24 17:03 ark-tik

Looks like this flaky test failed in a recent build. Here's the stack trace (it failed after executing the logger.info(message.toString()); statement once):

ReactivePulsarListenerTombstoneTests > SingleComplexPayload > shouldReceiveMessagesWithTombstone() STANDARD_OUT
    17:24:20.410 [Test worker] INFO  org.springframework.pulsar.reactive.listener.ReactivePulsarListenerTombstoneTests$SingleComplexPayload - ReceivedMessage[payload=Foo[value=foo], keyHeader=key:foo]

ReactivePulsarListenerTombstoneTests > SingleComplexPayload > shouldReceiveMessagesWithTombstone() FAILED
    java.util.ConcurrentModificationException
        at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1013)
        at java.base/java.util.ArrayList$Itr.next(ArrayList.java:967)
        at org.springframework.pulsar.reactive.listener.ReactivePulsarListenerTombstoneTests$SingleComplexPayload.shouldReceiveMessagesWithTombstone(ReactivePulsarListenerTombstoneTests.java:249)

What could be the reason for the ConcurrentModificationException?

ark-tik avatar Mar 17 '24 17:03 ark-tik
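
One possible explanation for the question above, consistent with the trace failing in ArrayList$Itr.next after a single log line: the test thread is iterating over an ArrayList while another thread (for example the listener delivering a late message) appends to it. The sketch below reproduces that pattern deterministically with plain JDK classes. It is an illustration under that assumption, not the actual test code, and the names (`ComodificationSketch`, `iterationFailsOn`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ComodificationSketch {

    // Starts iterating, lets a second thread append one element, then continues iterating.
    // Returns true if the iterator threw ConcurrentModificationException.
    static boolean iterationFailsOn(List<String> received) {
        received.add("first");
        received.add("second");
        Iterator<String> it = received.iterator();
        it.next(); // the test thread has processed (logged) one message

        // Stand-in for the listener thread delivering another message mid-iteration.
        Thread listener = new Thread(() -> received.add("late"));
        listener.start();
        try {
            listener.join(); // join() makes the write visible to this thread
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }

        try {
            it.next(); // a fail-fast iterator detects the structural modification here
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ArrayList fails: " + iterationFailsOn(new ArrayList<>()));
        System.out.println("CopyOnWriteArrayList fails: " + iterationFailsOn(new CopyOnWriteArrayList<>()));
    }
}
```

With an ArrayList the second `next()` call throws, while a CopyOnWriteArrayList iterates over a snapshot and does not. If the list in the test is indeed filled from the listener thread, switching to a concurrent collection, or iterating over a copy taken only after the latch has completed, would be two ways to avoid the exception.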