prometheus icon indicating copy to clipboard operation
prometheus copied to clipboard

change permissions

Open marvinnitz18 opened this issue 2 years ago • 13 comments

This Pull request tries to fix the container starting and landing in a reboot loop because of permission issues, as discussed in: https://github.com/prometheus/prometheus/issues/3441#issuecomment-1616114258

a user tried a custom build image where he modified the dockerfile in the way suggested in this pull request. The reasons to change it are outlined here: https://github.com/prometheus/prometheus/issues/3441#issuecomment-1616114258

@SuperQ or @roidelapluie do you need any further context to review this ?

marvinnitz18 avatar Aug 21 '23 10:08 marvinnitz18

Please supply a description, saying why you think we should do this.

Also please follow this instruction from the PR template:

- Please sign CNCF's Developer Certificate of Origin and sign-off your commits by adding the -s / --signoff flag to `git commit`. See https://github.com/apps/dco for more information.

bboreham avatar Aug 21 '23 11:08 bboreham

@bboreham thanks for the advice i changed Signed-off-by to my email address now.

marvinnitz18 avatar Aug 21 '23 11:08 marvinnitz18

@bboreham can we continue the review now ?

marvinnitz18 avatar Oct 16 '23 13:10 marvinnitz18

OK, I see you're trying to fix an issue from 2017, with comments up to 2020.

Please describe why this solution is correct. Assume I don't know what all the flags to "chgrp" do.

bboreham avatar Oct 16 '23 14:10 bboreham

Hi @SuperQ / @roidelapluie, do you have enough context to review this?

We looked at this in our bug scrub but unfortunately we didn't have enough context to review this.

@marvinnitz18 Could you update the PR description to clearly outline the problem and why this is a good solution?

gouthamve avatar Nov 07 '23 11:11 gouthamve

sure @gouthamve sorry that i took so long to answer, i will update it just now

marvinnitz18 avatar Jan 08 '24 14:01 marvinnitz18

@gouthamve can we continue the review ? if i am missing anything please let me know i want to get this moving

marvinnitz18 avatar Feb 09 '24 13:02 marvinnitz18

Please describe why this solution is correct. Assume I don't know what all the flags to "chgrp" do.

Please can you do this, in this PR. Don't ask us to go clicking on other links with pages of text to read. Also take a look at the CI failure(s).

bboreham avatar Feb 12 '24 10:02 bboreham

@bboreham i will :+1: the CI failure was fixed already but i don't know how to rerun the pipeline, could you do that for me ? do i need to reopen this merge request ?

marvinnitz18 avatar Feb 12 '24 12:02 marvinnitz18

I don't think we should actually be doing any permissions / ownership operations within the container. This should be managed outside by the volume mount management.

the problem is that the container atm now is doing permission changes, when started it changes the permissions of the mounted volume or in my case local folder to nobody:nogroup, this MR should just prevent it from doing that

marvinnitz18 avatar Feb 12 '24 12:02 marvinnitz18

OK, I see you're trying to fix an issue from 2017, with comments up to 2020.

Please describe why this solution is correct. Assume I don't know what all the flags to "chgrp" do.

Sure, it changes the group ownership to 0 which on Unix would be the group ID 0, which i think is better then nobody:nogroup Chmod tries to set the permissions correctly aftwards, i don't know if they are correct, you must tell me, but the chrp would solve me immediate problem described in this thread

marvinnitz18 avatar Feb 12 '24 12:02 marvinnitz18

Please describe why this solution is correct. Assume I don't know what all the flags to "chgrp" do.

Sure, it changes the group ownership to 0 which on Unix would be the group ID 0, which i think is better then nobody:nogroup

Sorry I am not getting my point across.

When I ask you to explain flags, I expect you to explain the -R flag you used.

When I ask you to describe why this solution is correct, I expect more than "I think it's better". I am looking for some justification.

Regarding the CI error, please take a look at the bottom of this page, where it describes a mismatch in email address: https://github.com/prometheus/prometheus/pull/12728/checks?check_run_id=16065898938

bboreham avatar Feb 12 '24 14:02 bboreham

the problem is that the container atm now is doing permission changes, when started it changes the permissions of the mounted volume or in my case local folder to nobody:nogroup, this MR should just prevent it from doing that

The RUN happens at build time, not when the container is started, the part you're proposing to change is needed to ensure /prometheus is writable by default.

With this change simply doing a docker run without a volume doesn't work:

caller=query_logger.go:93 level=error component=activeQueryTracker msg="Error opening query log file" file=/prometheus/queries.active err="open /prometheus/queries.active: permission denied"
panic: Unable to create mmap-ed active query log

While that's not a common use case we need to support that for testing at least.

dgl avatar Feb 12 '24 20:02 dgl

Please describe why this solution is correct. Assume I don't know what all the flags to "chgrp" do.

Sure, it changes the group ownership to 0 which on Unix would be the group ID 0, which i think is better then nobody:nogroup

Sorry I am not getting my point across.

When I ask you to explain flags, I expect you to explain the -R flag you used.

When I ask you to describe why this solution is correct, I expect more than "I think it's better". I am looking for some justification.

Regarding the CI error, please take a look at the bottom of this page, where it describes a mismatch in email address: https://github.com/prometheus/prometheus/pull/12728/checks?check_run_id=16065898938

@bboreham no worries you got it across well i just did not provide the right answers :laughing:

I used -R because it needs to change the rights recursively to allow rootless execution of the container, to be honest my main justification is that redhat did it like that as mentioned in: https://github.com/prometheus/prometheus/issues/3441#issuecomment-1956537913

Thanks for pointing me in the right direction to fix the CI error, i will look into it

marvinnitz18 avatar Feb 22 '24 13:02 marvinnitz18