kafka icon indicating copy to clipboard operation
kafka copied to clipboard

MINOR: Adds KRaft versions of most streams system tests

Open AlanConfluent opened this issue 3 years ago • 4 comments

Adds the annotation @matrix(metadata_quorum=quorum.all_non_upgrade) to many existing tests, which runs them with each of zookeeper and remote_kraft nodes.

This skips tests which use various forms of Kafka versioning since those seem to have issues with KRaft at the moment. Running these tests with KRaft will require a followup PR.

Committer Checklist (excluded from commit message)

  • [ ] Verify design and implementation
  • [ ] Verify test coverage and CI build status
  • [ ] Verify documentation (including upgrade notes)

AlanConfluent avatar Jul 29 '22 00:07 AlanConfluent

In addition to addressing the review comments, can you post a link to the system test results with this change? It would be good to verify the impact before merging.

+1. I'd also love to learn how much system test time increase this one would incur.

guozhangwang avatar Aug 04 '22 00:08 guozhangwang

In addition to addressing the review comments, can you post a link to the system test results with this change? It would be good to verify the impact before merging.

+1. I'd also love to learn how much system test time increase this one would incur.

My run with my changes is here: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/5067/

I'm trying to run the same tests on trunk and see, but there are unrelated failures. Will update when I get a run.

AlanConfluent avatar Aug 04 '22 23:08 AlanConfluent

Thanks @AlanConfluent ! A few meta questions:

  1. Seems the streams_cooperative_rebalance_upgrade_test.py is not included in this PR?

I didn't update any of the upgrade tests just yet. I was having a hard time running against different versions of Kafka and using KRaft -- this seems like a reasonable followup, so I didn't do it yet.

  1. I think for streams_application_upgrade_test.py we should also consider enabling kraft on the servers, to make sure that kraft works when streams itself upgrade.

I agree. This will be a followup.

  1. In streams_broker_compatibility_test.py when we test for broker versions > 3.1 we should also allow it to be kraft.

Maybe this was the main issue I was running into. I thought KRaft was available in earlier versions, but saw odd failures. I'll talk to you about what it would take to get this working.

  1. This is not related to this PR, but it seems the test coverage for streams_application_ugprade_test and streams_upgrade_test have much overlaps. @vvcephei could you chime in here since you have much experience with the former class file. Could we dedup their coverage hence reduce our test time?
  1. If the current e2e time is too high, I feel maybe we can skip adding the kraft model for the following suite: a. shutdown_deadlock b. relational_smoke, since its dependency on kraft is exactly covered by smoke -- i.e. if there's an issue with relational_smoke due to kraft, then smoke itself should fail as well. c. named_repartition Good to consider. Let's evaluate after I get a successful run of trunk.

AlanConfluent avatar Aug 04 '22 23:08 AlanConfluent

Re-triggered the jenkins build.

guozhangwang avatar Aug 09 '22 23:08 guozhangwang

I made the change to switch some of these tests to just run with remote_kraft to minimize test run time.

AlanConfluent avatar Aug 25 '22 20:08 AlanConfluent

It looks like you have a green run: http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1661474350--AlanConfluent--updates_tests_kraft--9be74f3d1/2022-08-25--001./2022-08-25--001./report.html

vvcephei avatar Aug 26 '22 21:08 vvcephei

I think this was the ultimate run which covered everything under streams: http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1661556003--AlanConfluent--updates_tests_kraft--9be74f3d1/2022-08-26--001./2022-08-26--001./report.html

AlanConfluent avatar Aug 26 '22 23:08 AlanConfluent