Results 27 issues of Kai Fricke

**Is your feature request related to a problem? Please describe.** Command hook exit codes are not propagated to the buildkite agent: ``` steps: - label: ":pipeline:" agents: queue: runner_queue_branch plugins:...

And the training workers should specifically be included in the child tasks. This would avoid problems with dataset workers (e.g. modin, datasets). CC @matthewdeng @amogkam @Yard1

E.g. minimum time passed, minimum number of boosting rounds (absolute/relative) cc @mmui

enhancement

To keep development up to speed, we should consider speeding up the CI tests or moving them to buildkite.

When trying to start a run with e.g. 900 actors when only 800 CPUs are available, the run silently hangs forever. In this case we would actually want to show...
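One way to avoid the silent hang described above is to compare the requested resources against what is available before starting and fail fast with a clear message. The sketch below is only an illustration: `check_actor_capacity` and its parameters are hypothetical names, and how the available CPU count is obtained is left to the caller.

```python
def check_actor_capacity(
    num_actors: int, cpus_per_actor: float, available_cpus: float
) -> None:
    """Raise an error if the actors can never all be scheduled.

    Hypothetical helper: the name and signature are assumptions for
    illustration, not part of any existing API.
    """
    required = num_actors * cpus_per_actor
    if required > available_cpus:
        raise RuntimeError(
            f"Requested {num_actors} actors needing {required:g} CPUs, "
            f"but only {available_cpus:g} CPUs are available."
        )
```

Calling `check_actor_capacity(900, 1, 800)` raises a `RuntimeError` instead of waiting indefinitely, while `check_actor_capacity(800, 1, 800)` returns normally.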

In the latest release (1.1.0) some tests combining Ray Tune with FT were flaky. We should re-add them after the next release (1.2.0) to see if the issue has been...

There's a bunch of configs and some scripts for release testing, but it is currently unclear which clusters to start, which commands exactly to run, and what the output should...

Exit codes > 1 are not propagated to buildkite, thus e.g. automatic retry triggers that depend on exit code are failing. Pipeline: ``` steps: - label: ":pipeline:" agents: queue: runner_queue_branch...
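For context, this is what forwarding a child's exit code unchanged looks like in a generic wrapper, so that a caller which retries on specific codes sees the real value rather than a flattened 0 or 1. It is a minimal sketch, not the buildkite agent's behavior or a fix for it; `run_and_propagate` is a hypothetical name.

```python
import subprocess
from typing import Sequence


def run_and_propagate(cmd: Sequence[str]) -> int:
    """Run `cmd` and return its exit code unchanged.

    Hypothetical helper for illustration only.
    """
    completed = subprocess.run(list(cmd))
    code = completed.returncode
    if code < 0:
        # On POSIX, a negative return code means the child was killed
        # by a signal; map it to the common shell convention 128 + N.
        return 128 + (-code)
    return code
```

A wrapper script would end with `sys.exit(run_and_propagate(sys.argv[1:]))`, so a child that exits with code 3 makes the wrapper exit with 3 as well.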

bug
investigate

## Why are these changes needed? This PR moves our buildkite pipeline to a new hierarchical structure and will be used with the new buildkite pipeline. When merging this PR,...

The implementation of the progress reporter, particularly the table generation, suffers from cluttered legacy code. The functions are long, messy, and pass around long argument lists. It is hard to...

good first issue
tune
P2