DAOS-4891 test: Enable multiple management service replicas (#8217)

Open phender opened this issue 3 years ago • 5 comments

Update the test harness to use 3 management service replicas when 3 or more servers are configured to run. Tests with 1-2 servers will still use a single management service replica.
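The selection rule described above can be sketched as follows. This is a minimal illustration, not the harness code from this PR: the function names `select_ms_replica_count` and `select_ms_replica_hosts` are hypothetical, and taking the first hosts in the list as replicas is an assumption made only for the example.

```python
from typing import List


def select_ms_replica_count(server_count: int) -> int:
    """Return the number of management service replicas for a test.

    Hypothetical helper: 3 replicas when 3 or more servers are
    configured, otherwise a single replica (1-2 servers).
    """
    if server_count < 1:
        raise ValueError("at least one server is required")
    return 3 if server_count >= 3 else 1


def select_ms_replica_hosts(server_hosts: List[str]) -> List[str]:
    """Pick the hosts that will run a management service replica.

    Assumption for illustration only: the first hosts in the list
    are used; the real harness may choose them differently.
    """
    count = select_ms_replica_count(len(server_hosts))
    return server_hosts[:count]
```

For example, a test configured with five servers would get three replica hosts, while a two-server test would get one.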

Skip-unit-tests: true
Test-tag: pr daily_regression rebuild
Allow-unstable-test: true

Signed-off-by: Phillip Henderson [email protected]

phender avatar Aug 03 '22 21:08 phender

Bug-tracker data:
Ticket title is 'Update test harness to enable multiple management service replicas'
Status is 'In Progress'
Labels: 'SRS-12-0056,triaged'
Job should run at elevated priority (1)
https://daosio.atlassian.net/browse/DAOS-4891

github-actions[bot] avatar Aug 03 '22 21:08 github-actions[bot]

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9892/1/execution/node/1497/log

daosbuild1 avatar Aug 04 '22 20:08 daosbuild1

Test failures in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-9892/1/:

  • Functional on EL 8 / FTEST_rebuild.RbldContRfTest.6-./rebuild/container_rf.py:RbldContRfTest.test_rebuild_with_container_rf
    • https://daosio.atlassian.net/browse/DAOS-11291
  • Functional Hardware Medium / FTEST_daos_test.DAOS_Pool.POOL14: pool connect access based on ACL
    • https://daosio.atlassian.net/browse/DAOS-10301
  • Functional Hardware Medium / FTEST_osa.OSAOfflineDrain.3-./osa/offline_drain.py:OSAOfflineDrain.test_osa_offline_drain_after_snapsot
    • https://daosio.atlassian.net/browse/DAOS-10607
  • Functional Hardware Large / FTEST_erasurecode.EcodDisabledRebuildSingle.1-./erasurecode/rebuild_disabled_single.py:EcodDisabledRebuildSingle.test_ec_degrade_single_value
    • https://daosio.atlassian.net/browse/DAOS-9736
  • Functional Hardware Large / FTEST_erasurecode.EcodDisabledRebuildSingle.2-./erasurecode/rebuild_disabled_single.py:EcodDisabledRebuildSingle.test_ec_degrade_single_value
    • https://daosio.atlassian.net/browse/DAOS-9736
  • Functional Hardware Large / FTEST_ior.EcodIorHardRebuild.1-./ior/hard_rebuild.py:EcodIorHardRebuild.test_ec_ior_hard_online_rebuild
    • https://daosio.atlassian.net/browse/DAOS-10939
  • Functional Hardware Large / FTEST_ior.EcodIorHardRebuild.3-./ior/hard_rebuild.py:EcodIorHardRebuild.test_ec_ior_hard_online_rebuild
    • https://daosio.atlassian.net/browse/DAOS-10939
  • Functional Hardware Large / erasurecode-rebuild_disabled.framework_results
    • test was skipped due to host communication error attempting to clean out leftover logs from a previous test run prior to running this test
  • Functional Hardware Large / mdtest-small.framework_results
    • test was skipped due to host communication error attempting to clean out leftover logs from a previous test run prior to running this test
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.02-./pool/create_all_hw.py:PoolCreateAllHwTests.test_one_pool
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.03-./pool/create_all_hw.py:PoolCreateAllHwTests.test_one_pool
  • Functional Hardware Large / FTEST_rebuild.RbldWidelyStriped.2-./rebuild/widely_striped.py:RbldWidelyStriped.test_rebuild_widely_striped
  • Functional Hardware Large / FTEST_rebuild.RbldWithIOR.1-./rebuild/with_ior.py:RbldWithIOR.test_rebuild_with_ior

phender avatar Aug 10 '22 16:08 phender

Test stage Functional on EL 8 completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-9892/2/display/redirect

daosbuild1 avatar Aug 11 '22 06:08 daosbuild1

Failures in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-9892/3/ (so far):

phender avatar Aug 12 '22 12:08 phender

Test failures in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-9892/3:

  • Functional on EL 8 / FTEST_rebuild.RbldCascadingFailures.2-./rebuild/cascading_failures.py
    • VM slowness caused this odd 'failure'. The test actually passed and completed tearDown, but avocado reported "Test reported status but did not finish" because only 5.11 seconds were left to complete tearDown.
    • Does not appear to be related to these changes
  • Functional on EL 8 / FTEST_rebuild.RbldContRfTest.5-./rebuild/container_rf.py
    • https://daosio.atlassian.net/browse/DAOS-11291
  • Functional Hardware Medium / FTEST_daos_test.DaosCoreTestRebuild.04-./daos_test/rebuild.py
    • https://daosio.atlassian.net/browse/DAOS-11180 (fix landed after this run)
  • Functional Hardware Medium / FTEST_osa.OSAOfflineDrain.3-./osa/offline_drain.py
    • https://daosio.atlassian.net/browse/DAOS-10607
  • Functional Hardware Large / FTEST_erasurecode.EcodDisabledRebuild.3-./erasurecode/rebuild_disabled.py
    • https://daosio.atlassian.net/browse/DAOS-10916 (fix landed after this run)
  • Functional Hardware Large / FTEST_erasurecode.EcodDisabledRebuildSingle.1-./erasurecode/rebuild_disabled_single.py
    • https://daosio.atlassian.net/browse/DAOS-11234
  • Functional Hardware Large / FTEST_erasurecode.EcodDisabledRebuildSingle.2-./erasurecode/rebuild_disabled_single.py
    • https://daosio.atlassian.net/browse/DAOS-11234
  • Functional Hardware Large / FTEST_ior.EcodIorHardRebuild.1-./ior/hard_rebuild.py
    • https://daosio.atlassian.net/browse/DAOS-10658
  • Functional Hardware Large / FTEST_ior.EcodIorHardRebuild.2-./ior/hard_rebuild.py
    • https://daosio.atlassian.net/browse/DAOS-10658
  • Functional Hardware Large / FTEST_ior.EcodIorHardRebuild.3-./ior/hard_rebuild.py
    • https://daosio.atlassian.net/browse/DAOS-10658

phender avatar Aug 17 '22 18:08 phender

Only 1 failure in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-9892/4/: https://daosio.atlassian.net/browse/DAOS-11380

phender avatar Aug 19 '22 14:08 phender