Cook
test_balanced_host_constraint_cannot_place is flaky
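We've been seeing repeated failures of this test. I think Paul has already determined that this is due to broken logic in the constraint-handling code. In the run below, the job the test expects to be unplaceable ends up running, so the wait for a placement-failure reason times out.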
======================================================================
ERROR: test_balanced_host_constraint_cannot_place (tests.cook.test_basic.CookTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/twosigma/Cook/integration/tests/cook/test_basic.py", line 1338, in test_balanced_host_constraint_cannot_place
    placement_reasons = util.wait_until(query_unscheduled, lambda r: len(r) > 0)
  File "/home/travis/build/twosigma/Cook/integration/tests/cook/util.py", line 242, in wait_until
    return wait_until_inner()
  File "/home/travis/build/twosigma/Cook/integration/.eggs/retrying-1.3.3-py3.6.egg/retrying.py", line 49, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/home/travis/build/twosigma/Cook/integration/.eggs/retrying-1.3.3-py3.6.egg/retrying.py", line 212, in call
    raise attempt.get()
  File "/home/travis/build/twosigma/Cook/integration/.eggs/retrying-1.3.3-py3.6.egg/retrying.py", line 247, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/home/travis/.pyenv/versions/3.6.2/lib/python3.6/site-packages/six-1.11.0-py3.6.egg/six.py", line 693, in reraise
    raise value
  File "/home/travis/build/twosigma/Cook/integration/.eggs/retrying-1.3.3-py3.6.egg/retrying.py", line 200, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/home/travis/build/twosigma/Cook/integration/tests/cook/util.py", line 236, in wait_until_inner
    raise RuntimeError(error_msg)
RuntimeError: wait_until condition not yet met, retrying...
-------------------- >> begin captured logging << --------------------
tests.cook.util: INFO: Using cook url http://localhost:12321
tests.cook.util: INFO: Using cook url http://localhost:12321
tests.cook.util: DEBUG: Waiting for connection to cook...
urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): localhost
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET / HTTP/1.1" 404 None
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /settings HTTP/1.1" 200 None
urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 172.17.0.4
urllib3.connectionpool: DEBUG: http://172.17.0.4:5050 "GET /redirect HTTP/1.1" 307 0
tests.cook.util: INFO: Using mesos url http://172.17.0.4:5050
tests.cook.util: DEBUG: Waiting for connection to cook...
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET / HTTP/1.1" 404 None
urllib3.connectionpool: DEBUG: http://172.17.0.4:5050 "GET /state.json HTTP/1.1" 200 1476
tests.cook.util: INFO: {'jobs': [{'command': 'sleep 600', 'cpus': 0.1, 'max_retries': 1, 'mem': 100, 'name': 'echo', 'priority': 1, 'uuid': '50a71b99-3129-4c9c-8dd6-21a2656eef30', 'group': '5cf1a659-bf2e-42e9-9e98-1694426cad53'}, {'command': 'sleep 600', 'cpus': 0.1, 'max_retries': 1, 'mem': 100, 'name': 'echo', 'priority': 1, 'uuid': '721c1e50-8441-4722-9fac-bcd0831e28f4', 'group': '5cf1a659-bf2e-42e9-9e98-1694426cad53'}, {'command': 'sleep 600', 'cpus': 0.1, 'max_retries': 1, 'mem': 100, 'name': 'echo', 'priority': 1, 'uuid': '586d4943-535a-4e5a-a50a-4bbb6336c2bd', 'group': '5cf1a659-bf2e-42e9-9e98-1694426cad53'}, {'command': 'sleep 600', 'cpus': 0.1, 'max_retries': 1, 'mem': 100, 'name': 'echo', 'priority': 1, 'uuid': 'd30315cf-ff65-4bcf-8055-46039349020f', 'group': '5cf1a659-bf2e-42e9-9e98-1694426cad53'}, {'command': 'sleep 600', 'cpus': 0.1, 'max_retries': 1, 'mem': 100, 'name': 'echo', 'priority': 1, 'uuid': '80bc6a4b-5966-43d0-8851-af74b9039d32', 'group': '5cf1a659-bf2e-42e9-9e98-1694426cad53'}, {'command': 'sleep 600', 'cpus': 0.1, 'max_retries': 1, 'mem': 100, 'name': 'echo', 'priority': 1, 'uuid': '254e3782-7ba0-416d-b647-8872316f56d5', 'group': '5cf1a659-bf2e-42e9-9e98-1694426cad53'}, {'command': 'sleep 600', 'cpus': 0.1, 'max_retries': 1, 'mem': 100, 'name': 'echo', 'priority': 1, 'uuid': 'c4372a73-ac87-4c82-a309-28658e10dc9b', 'group': '5cf1a659-bf2e-42e9-9e98-1694426cad53'}], 'groups': [{'uuid': '5cf1a659-bf2e-42e9-9e98-1694426cad53', 'host-placement': {'type': 'balanced', 'parameters': {'attribute': 'HOSTNAME', 'minimum': 7}}}]}
urllib3.connectionpool: DEBUG: http://localhost:12321 "POST /rawscheduler HTTP/1.1" 201 None
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /rawscheduler?job=50a71b99-3129-4c9c-8dd6-21a2656eef30&job=721c1e50-8441-4722-9fac-bcd0831e28f4&job=586d4943-535a-4e5a-a50a-4bbb6336c2bd&job=d30315cf-ff65-4bcf-8055-46039349020f&job=80bc6a4b-5966-43d0-8851-af74b9039d32&job=254e3782-7ba0-416d-b647-8872316f56d5&job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 200 None
tests.cook.util: DEBUG: wait_until condition not yet met, retrying...
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /rawscheduler?job=50a71b99-3129-4c9c-8dd6-21a2656eef30&job=721c1e50-8441-4722-9fac-bcd0831e28f4&job=586d4943-535a-4e5a-a50a-4bbb6336c2bd&job=d30315cf-ff65-4bcf-8055-46039349020f&job=80bc6a4b-5966-43d0-8851-af74b9039d32&job=254e3782-7ba0-416d-b647-8872316f56d5&job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 200 None
tests.cook.util: DEBUG: wait_until condition not yet met, retrying...
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /rawscheduler?job=50a71b99-3129-4c9c-8dd6-21a2656eef30&job=721c1e50-8441-4722-9fac-bcd0831e28f4&job=586d4943-535a-4e5a-a50a-4bbb6336c2bd&job=d30315cf-ff65-4bcf-8055-46039349020f&job=80bc6a4b-5966-43d0-8851-af74b9039d32&job=254e3782-7ba0-416d-b647-8872316f56d5&job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 200 None
tests.cook.util: DEBUG: wait_until condition not yet met, retrying...
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /rawscheduler?job=50a71b99-3129-4c9c-8dd6-21a2656eef30&job=721c1e50-8441-4722-9fac-bcd0831e28f4&job=586d4943-535a-4e5a-a50a-4bbb6336c2bd&job=d30315cf-ff65-4bcf-8055-46039349020f&job=80bc6a4b-5966-43d0-8851-af74b9039d32&job=254e3782-7ba0-416d-b647-8872316f56d5&job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 200 None
tests.cook.util: DEBUG: wait_until condition not yet met, retrying...
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /rawscheduler?job=50a71b99-3129-4c9c-8dd6-21a2656eef30&job=721c1e50-8441-4722-9fac-bcd0831e28f4&job=586d4943-535a-4e5a-a50a-4bbb6336c2bd&job=d30315cf-ff65-4bcf-8055-46039349020f&job=80bc6a4b-5966-43d0-8851-af74b9039d32&job=254e3782-7ba0-416d-b647-8872316f56d5&job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 200 None
tests.cook.util: INFO: wait_until condition satisfied
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /unscheduled_jobs?job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 200 None
tests.cook.test_basic: INFO: unscheduled_jobs response: {'uuid': 'c4372a73-ac87-4c82-a309-28658e10dc9b', 'reasons': [{'reason': 'You have 35 other jobs ahead in the queue.', 'data': {'jobs': ['0f2d9ccc-37ce-4710-aa09-aa474fb34b33', '85164d11-ee3e-4929-90b6-616f0681f7f2', '93307ed1-951c-4033-961b-cabefeba63e1', 'f5df7fee-8cfc-4365-8c26-11ed022a3036', 'f6c018e4-9274-4743-b8b7-dbdd66ee2e8d', '058e65bc-571e-4f20-afde-4346f4d979b6', '3b040390-5b77-4762-b193-517db46baa4c', '71a1b772-2c3e-4457-ad7a-bf4e19bb245c', '721c1e50-8441-4722-9fac-bcd0831e28f4', 'd6f07f28-80fd-4aa0-a4a8-ee5e11c23e1e']}}, {'reason': 'The job is now under investigation. Check back in a minute for more details!', 'data': {}}]}
tests.cook.util: DEBUG: wait_until condition not yet met, retrying...
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /unscheduled_jobs?job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 200 None
tests.cook.test_basic: INFO: unscheduled_jobs response: {'uuid': 'c4372a73-ac87-4c82-a309-28658e10dc9b', 'reasons': [{'reason': 'The job is running now.', 'data': {}}]}
tests.cook.util: DEBUG: wait_until condition not yet met, retrying...
...
tests.cook.util: DEBUG: wait_until condition not yet met, retrying...
urllib3.connectionpool: DEBUG: http://localhost:12321 "GET /unscheduled_jobs?job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 200 None
tests.cook.test_basic: INFO: unscheduled_jobs response: {'uuid': 'c4372a73-ac87-4c82-a309-28658e10dc9b', 'reasons': [{'reason': 'The job is running now.', 'data': {}}]}
tests.cook.util: INFO: Timeout exceeded waiting for condition. Details: []
urllib3.connectionpool: DEBUG: http://localhost:12321 "DELETE /rawscheduler?job=50a71b99-3129-4c9c-8dd6-21a2656eef30&job=721c1e50-8441-4722-9fac-bcd0831e28f4&job=586d4943-535a-4e5a-a50a-4bbb6336c2bd&job=d30315cf-ff65-4bcf-8055-46039349020f&job=80bc6a4b-5966-43d0-8851-af74b9039d32&job=254e3782-7ba0-416d-b647-8872316f56d5&job=c4372a73-ac87-4c82-a309-28658e10dc9b HTTP/1.1" 204 0
--------------------- >> end captured logging << ---------------------
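To make the timeout easier to read, here is a rough sketch of why the predicate `lambda r: len(r) > 0` never holds for the responses in the log. It assumes `query_unscheduled` filters the `reasons` list for a placement-failure message. The helper name `placement_reasons` and the `PLACEMENT_FAILURE` text are hypothetical stand-ins, not copied from `test_basic.py`. The `data` fields are trimmed from the logged responses.

```python
# Hypothetical marker text; the string the real test matches on may differ.
PLACEMENT_FAILURE = "couldn't be placed"


def placement_reasons(response, marker=PLACEMENT_FAILURE):
    """Return the reasons in an unscheduled_jobs response that mention the marker."""
    return [r for r in response.get('reasons', []) if marker in r['reason']]


# Responses abbreviated from the captured logging ('data' contents trimmed).
queued = {
    'uuid': 'c4372a73-ac87-4c82-a309-28658e10dc9b',
    'reasons': [
        {'reason': 'You have 35 other jobs ahead in the queue.', 'data': {}},
        {'reason': 'The job is now under investigation. '
                   'Check back in a minute for more details!', 'data': {}},
    ],
}
running = {
    'uuid': 'c4372a73-ac87-4c82-a309-28658e10dc9b',
    'reasons': [{'reason': 'The job is running now.', 'data': {}}],
}

# Same shape as the predicate passed to util.wait_until in the traceback.
predicate = lambda r: len(r) > 0

for response in (queued, running):
    matches = placement_reasons(response)
    print(matches, predicate(matches))
```

Under that assumption, both logged responses produce an empty list, which would match the `Details: []` in the timeout message. Once the job reports "The job is running now.", no amount of retrying can satisfy the condition.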