"AWS (Real)" is flaky
Buildkite link
https://buildkite.com/materialize/nightlies/builds/6448#018daa2c-f08f-405c-95d0-1fc0a0f6d7bc
Relevant log output
: {'S': 'ERROR', 'C': '58000', 'M': 'role trust policy does not require an external ID', 'D': "The trust policy for the connection's role (arn:aws:iam::400121260767:role/testdrive-3254404661-Customer) is insecure and allows any Materialize customer to assume the role.", 'H': 'See: https://materialize.com/s/aws-connection-role-trust-policy'}
Additional thoughts
Filing this to track any further occurrences.
Thanks! I'll keep this in my background queue. Please ping if you see this happen again. One easy fix here is to bump the IAM sleep from 10s to 30s. That will slow the test down quite a bit though. The slightly less easy fix is to add retry loops, so that we wait as long as necessary to see what we expect, rather than blanket waiting 10 or 30s.
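A minimal sketch of the retry-loop idea, assuming a polling helper is acceptable in the test harness. The names `wait_until` and `check`, and the default timeout and interval, are hypothetical and not taken from the existing test code. The helper polls a condition until it holds or a deadline passes, so the test waits only as long as IAM actually takes to propagate.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    check: Callable[[], Optional[T]],
    timeout: float = 60.0,
    interval: float = 1.0,
) -> T:
    """Poll check() until it returns a non-None value or the timeout elapses.

    Hypothetical helper: `check` should return None while the expected
    state is not yet visible, and any other value once it is.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        # Sleep briefly between attempts instead of one blanket wait.
        time.sleep(interval)
```

With something like this, the fixed 10s sleep could be replaced by a call that passes in whatever check the test already performs after the sleep. The timeout then only bounds the worst case and does not slow down the common case.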
Another simple option: maybe just tell Buildkite to retry this one up to three times when it fails?
Seen again in https://buildkite.com/materialize/nightlies/builds/6603#018de042-0791-4d1e-8b29-e7a3a942d71d.
Have we had trouble with this since #25553 landed?
That change landed on Feb 26. Since then, it has failed three times: twice due to this error (builds/6761 and builds/7159) and once in builds/6755, in which most jobs failed. It recovered each time.
We can consider this resolved.
Great!