tests/integration: set `skip_wait_for_gossip_to_settle=0`
To speed up the boot sequence of Scylla nodes, we now use `skip_wait_for_gossip_to_settle=0`,
the same setting dtest has used on almost all tests for quite a while.
Also introduced `wait_other_notice=True` in the places where the cluster is started,
because without it a test can begin before the cluster is fully up and ready.
This change shaves 1h off the integration test run, which now finishes in 28min.
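For readers unfamiliar with the two knobs, here is a minimal sketch of how they might be applied to a ccm-style cluster object. The helper name `start_cluster_fast` and the constant `FAST_BOOT_OPTIONS` are hypothetical, and `wait_for_binary_proto=True` is an assumption that is not part of the description above; the sketch only assumes the cluster exposes ccmlib-like `set_configuration_options` and `start` methods.

```python
# Hypothetical names for illustration; only the two option names
# (skip_wait_for_gossip_to_settle, wait_other_notice) come from this PR.
FAST_BOOT_OPTIONS = {"skip_wait_for_gossip_to_settle": 0}


def start_cluster_fast(cluster):
    """Configure and start a ccm-style cluster object.

    `cluster` is assumed to expose ccmlib-like `set_configuration_options`
    and `start` methods.
    """
    # Skip the gossip settle wait so each node boots faster.
    cluster.set_configuration_options(values=FAST_BOOT_OPTIONS)
    # Block until the nodes have noticed each other, so a test does not
    # begin against a half-started cluster.
    # wait_for_binary_proto=True is an assumption, not stated in the PR.
    cluster.start(wait_for_binary_proto=True, wait_other_notice=True)
    return cluster
```

The design point is that the two settings go together: skipping the settle wait removes a safety margin, and `wait_other_notice=True` puts an explicit readiness check back.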
Interesting, I remember that I did try to do this at one point, but got a lot of failures. Maybe I just made some mistake when running the tests.
It depends on when you tried it; we (mostly @nyh) did a lot of fine-tuning in ccm to support this case correctly. While trying to figure out why that UDT test is failing, it was annoying to wait that long for cluster creation.
I think we can merge it after CI passes
One of the integration suites was stuck for 5h; I'm running it all again:
tests/integration/standard/test_metadata.py ss...s.............x...s.s.. [ 15%]
s...s.ss.s...x.s.x.....sssssssssss...ss.s....s.s...ss [ 20%]
Error: The operation was canceled.
I'm not sure whether it's connected to this change or not. We'll need more reruns, and maybe more debug output enabled in CI, to figure this one out.
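One possible way to get more debug output from a hung run is the standard-library `faulthandler` module, which can dump every thread's traceback once a timeout expires. This is only a sketch of the idea, not something this PR does; the helper names `arm_hang_dump` and `disarm_hang_dump` and the 600-second default are hypothetical.

```python
import faulthandler
import sys


def arm_hang_dump(timeout_s=600, stream=sys.stderr):
    """Dump all thread tracebacks to `stream` if still armed after timeout_s.

    `stream` must be backed by a real file descriptor, which may not hold
    for a captured stderr under a test runner.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=stream)


def disarm_hang_dump():
    """Cancel a pending dump, e.g. once the test has finished normally."""
    faulthandler.cancel_dump_traceback_later()
```

Arming this at the start of a test and disarming it at the end would show where a stuck test is blocked, instead of only a canceled job after hours.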
It's getting stuck in other places too, in runs that are not from this PR: https://github.com/scylladb/python-driver/actions/runs/8076169015/job/22064206623
tests/integration/standard/test_metadata.py ss...s.............x...s.s.. [ 15%]
s...s.ss.s...x.s.x.....sssssssssss...ss.s....s.s...ss [ 20%]
Error: The operation was canceled.
Clearly from the logs, `test_connection_error` is the one getting stuck; it's still not clear why.
I've also seen that `test_connection_honor_cluster_port` leaves a trail of sessions behind, which keep trying to reconnect to a cluster that isn't there anymore.
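As an illustration of how leftover sessions could be avoided, a test can guarantee that the driver's cluster object is shut down even when an assertion fails. This is a generic sketch, not the fix applied to that test; the helper name `shutting_down` is hypothetical, and it only assumes the object has a `shutdown()` method, as the driver's `Cluster` does.

```python
from contextlib import contextmanager


@contextmanager
def shutting_down(cluster):
    """Yield `cluster` and always call its shutdown() afterwards."""
    try:
        yield cluster
    finally:
        # Runs on success and on failure, so no session is left
        # behind trying to reconnect to a cluster that is gone.
        cluster.shutdown()


# Assumed usage inside a test (contact point and port are placeholders):
#
#     with shutting_down(Cluster(contact_points=["127.0.0.1"], port=9042)) as c:
#         session = c.connect()
#         ...
```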
Are the problems in those tests caused by this PR? If not then I think we can merge this
I didn't find any connection to this change
Looks like all tests are passing now, aren't they?