Chao Li
One of our clusters hit the same issue today. After a few hours of investigation, I believe this is a bug introduced by #8100 and present since 7.8.0. This bug is triggered by...
@dhantha The root cause is that invalid entries in `worker_resource_caches` are not properly GC-ed. I'm working on a fix. Until the fix is released, a workaround is to manually clean up...
@xtremerui Does this problem affect all privileged steps? We are planning to upgrade to 7.6.0. If this problem is also in 7.6.0, then we have to postpone until the problem...
@vito I would be "aggressively" pushing on this issue. I have noticed that many of the pipelines on our clusters are generating huge build logs. As @SimonXming mentioned, some huge build...
@xtremerui I don't think the two metrics were introduced only in 7.8. We found that in 7.8 because we upgraded directly from 6.7.8 to 7.8.
I think this PR should only be merged once #8070 is fixed. Otherwise, all containers may end up going to the same worker and the image gets no chance...
@andy-paine Could you please rebase this PR? It doesn't contain the changes from #8061, but GitHub doesn't seem to detect that.
@ArunSenthoorPandian As far as I know, it is still unavailable as of the latest version (7.3.2).
> I'd like to see the checks related metrics to verify the change not dropping check rates at least e.g. checks_started before/after the PR. Check metrics don't show much difference...
@xtremerui I think I stated in https://github.com/concourse/concourse/pull/8056#issuecomment-1041011754 that performance improvement is not the main point of this PR. Instead, I want to have an option to restrict the maximum number of resources...