Adds logging when jobs fail with number of attempts and max attempts
Description
Adds logging when jobs fail with number of attempts and max attempts.
attempt ($attempt of $max_attempts)
Performance impact
Unknown
Security impact
Unknown
Checklist
- [X] My code matches the project's code style and
yarn lint:fixpasses. - [ ] I've added tests for the new feature, and
yarn testpasses. - [ ] I have detailed the new feature in the relevant documentation.
- [ ] I have added this feature to 'Pending' in the
RELEASE_NOTES.mdfile (if one exists). - [ ] If this is a breaking change I've explained why.
I could use some help calculating the next fuzz delay though.
From the README, the attempts do backoff along these lines:
select attempt, exp(least(10, attempt)) * interval '1 second' as delay, sum(exp(least(10, attempt)) * interval '1 second') over (order by attempt asc) total_delay from generate_series(1, 24) as attempt;
So we can work out the delay until the next job as:
function delayInMs(attempt: number): number {
return Math.exp(Math.min(10, attempt)) * 1000;
}
To make this fuzzy we just want the nearest human-readable size. I'd do something like:
const MILLISECOND = 1;
const SECOND = 1000 * MILLISECOND;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;
// We know the maximum attempt delay is about 6 hours, so no need to go any further.
function fuzzyDelay(attempt: number): string {
const ms = delayInMs(attempt);
if (ms > 1.5 * HOUR) {
return `~${(ms/HOUR).toFixed(1)} hours`;
} else if (ms >= 1.5 * MINUTE) {
return `~${(ms/MINUTE).toFixed(1)} minutes`;
} else {
return `~${Math.round(ms/SECOND)} seconds`;
}
}
This was all written direct into GitHub so you'll need to check it for errors!
I'm a bit confused by my quote message above because it doesn't seem to be something you wrote here that I was replying to. Perhaps it was on chat or somewhere, but I failed to track it down. Either way, it looks like we won't be addressing that, so I've moved the error message about a bit and merged in the latest. This is good to go now; logging looks like:
[worker(worker-c3465d98e066eea4d0)] DEBUG: Found task 15 (charm)
[worker(worker-c3465d98e066eea4d0)] ERROR: Failed task 15 (charm, 0.32ms, attempt 1 of 25) with error 'Throwing error because attempt 1 < 3':
Error: Throwing error because attempt 1 < 3
at module.exports (.../perfTest/tasks/charm.js:3:11)
at doNext (.../src/worker.ts:206:24)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
[worker(worker-6f2d6ccd8f1b45bd1c)] DEBUG: Found task 15 (charm)
[worker(worker-6f2d6ccd8f1b45bd1c)] ERROR: Failed task 15 (charm, 0.15ms, attempt 2 of 25) with error 'Throwing error because attempt 2 < 3':
Error: Throwing error because attempt 2 < 3
at module.exports (.../perfTest/tasks/charm.js:3:11)
at doNext (.../src/worker.ts:206:24)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
[worker(worker-b04aa658a0c2821f4c)] DEBUG: Found task 15 (charm)
Third time's the charm!
[worker(worker-b04aa658a0c2821f4c)] INFO: Completed task 15 (charm, 0.17ms, attempt 3 of 25) with success
[worker(worker-c3465d98e066eea4d0)] DEBUG: Found task 17 (wait)
[worker(worker-c3465d98e066eea4d0)] INFO: Completed task 17 (wait, 30002.08ms) with success
Ah; the quoted text came from https://github.com/graphile/worker/issues/235#issuecomment-969388627