Query about resource limits and requests

Open mandarbhosle opened this issue 1 year ago • 3 comments

In namespaces that have resource quotas, the netcheck assertion pod doesn't come up, because it has no resource limits or requests set.

I tried adding them to the values.yaml file under probeConfig and reinstalled, but they still aren't picked up.

Is there something I am missing? I also couldn't find any code in the configmap.yaml template that would pick them up.

Also, is there any parameter to reduce the wait time for failing check rules?

mandarbhosle, May 13 '24 08:05

That's certainly a valid feature request. The way to support it would be to modify configmap.yaml in the Helm template so that all of probeConfig is persisted to the ConfigMap.

Then adjust create_job_spec in operator/netchecks_operator/main.py to use the given resources, similar to how the image and pull policy are loaded from the config:

container = client.V1Container(
    name="netcheck",
    # e.g. "ghcr.io/hardbyte/netchecks:main"
    image=f"{settings.probe.image.repository}:{settings.probe.image.tag}",
    image_pull_policy=settings.probe.image.pullPolicy,
    # ... remaining arguments omitted from this excerpt
)

settings.probe is defined as a Pydantic BaseModel in netchecks_operator/config.py, and that model already has a resources field.
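As a rough illustration of the idea, here is a minimal, standard-library-only sketch of carrying configured resources through to a container spec. It is not the operator's actual code: ProbeResources and build_container_spec are hypothetical names, the dataclass only stands in for the Pydantic model, and the CPU and memory values are made up. In the operator itself the equivalent step would presumably be passing a client.V1ResourceRequirements(limits=..., requests=...) object as the resources argument of client.V1Container.

```python
from dataclasses import dataclass, field


@dataclass
class ProbeResources:
    """Hypothetical stand-in for the resources field on the probe settings."""
    limits: dict[str, str] = field(default_factory=dict)
    requests: dict[str, str] = field(default_factory=dict)


def build_container_spec(image: str, pull_policy: str, resources: ProbeResources) -> dict:
    """Build a container spec as a plain dict in the Kubernetes manifest shape."""
    spec = {
        "name": "netcheck",
        "image": image,
        "imagePullPolicy": pull_policy,
    }
    # Only emit the keys that were actually configured, so an empty
    # config produces the same spec as before the change.
    configured = {}
    if resources.limits:
        configured["limits"] = dict(resources.limits)
    if resources.requests:
        configured["requests"] = dict(resources.requests)
    if configured:
        spec["resources"] = configured
    return spec


# Illustrative values only; real ones would come from probeConfig in values.yaml.
example = build_container_spec(
    image="ghcr.io/hardbyte/netchecks:main",
    pull_policy="IfNotPresent",
    resources=ProbeResources(
        limits={"cpu": "100m", "memory": "128Mi"},
        requests={"cpu": "50m", "memory": "64Mi"},
    ),
)
print(example["resources"])
```

Leaving the resources key out when nothing is configured keeps the default behaviour unchanged for namespaces without quotas.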

Happy to review a PR if you'd like to implement it.

hardbyte, May 13 '24 11:05

Hello,

I have opened pull request https://github.com/hardbyte/netchecks/pull/155 to include the resource requests and limits. Please review it.

Also, is there any parameter to reduce the wait time for failing check rules? Currently it takes 2 minutes to report a failing rule, so running an entire policy takes a long time when it contains multiple rules.

mandarbhosle, May 23 '24 07:05

Hello,

I have updated the code further to include resource quotas and node affinity.

Regards, Mandar Bhosle

mandarbhosle, Jun 23 '24 23:06