sops icon indicating copy to clipboard operation
sops copied to clipboard

SOPS displacing comments and causing YAML-breaking invalid syntax error and data loss

Open lucabello opened this issue 3 years ago • 9 comments

Hello everyone; I found a bug in the SOPS encryption, which you can replicate yourself.

Unencrypted file test.yaml:

this:
  is:
    an: example
  # comment

I use age encryption, so I set up the SOPS_AGE_RECIPIENTS and SOPS_AGE_KEY_FILE variables, and then run sops --encrypt --encrypted-regex an test.yaml, obtaining this:

this:
    is:
        an: ENC[AES256_GCM,data:7ew99FdoYQ==,iv:62k8jSLjg2l7YyorqOf9bobpgT7m+1qeEK66JpyObTE=,tag:caRiamKAO9TqZHj2qfKAgA==,type:str]
sops:
# comment
    kms: []
    gcp_kms: []
    azure_kv: []
    hc_vault: []
    age:
        - recipient: age107p580kfl3mlr2x7jqfhattzv0a3kd88ugjq2p5tsjz59jzdmdfsj2qwm2
          enc: |
            -----BEGIN AGE ENCRYPTED FILE-----
            YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSBZa3VuTFBucnV4QldrZW0z
            am9jTkw1ejZIUjBUVHpRTjRONUNTYnRXWFVRCncwdk5adnpDZktnK3JwVWU0R1o4
            UGlIOHpSdUpTQjBhWnMwcXAyZlhmTFEKLS0tIFk2bitXOHN2T2VIbHZzVnVlTVpn
            VGNUSkdFWENLSEFQdGh4OW1TMk1HZHMKvMCQo5SXkMk9HQ4Bv7n1UFj+IZDi0z8K
            1LJKCovXp7d+zceCwbVM9zfbleF7AvFuyWSbw7o11WpE4vtaD8fDWg==
            -----END AGE ENCRYPTED FILE-----
    lastmodified: "2022-05-31T14:38:08Z"
    mac: ENC[AES256_GCM,data:BNPYx43KNPdggGC/+qgHQfoMj+ctnrwjOpt+pOzoWcQCcS3IP1h0FtD/UcSc8mLTxS2+6gBBiFU8JIdnr8XwueIeGbcx0zEk8LZdspv0DuVQoC5AbiU4qnvCFRY7MIdmSKacG7zXVN4nQ2kW/Ip7jqhBAqZJl5cbUBcaQOgfGCo=,iv:aLQ+lhmgxzyrQphjknd1O1KHC6BVG1E0hM9tCfFs+J4=,tag:sHhaEqLU+/VRDRjTajAnwQ==,type:str]
    pgp: []
    encrypted_regex: an
    version: 3.7.3

As you can see, the # comment line is moved inside the sops metadata; whenever that file is decrypted, the comment will be gone (hence the data loss I mention in the title).

The reason why this is important is because my team is using some comment "tags" to mark code blocks, and the last tag disappears after encrypting-and-decrypting a file.

I did some extra testing to try to understand when this was happening, and I gathered some conclusions: it only happens if the indentation is at least the one in my example file (so at least 3) and the if the comment is "less indented" than the last line.

Sorry if this is a known issue and I missed it! :)

lucabello avatar May 31 '22 14:05 lucabello

Adding to this issue, it can break a YAML file, by producing an invalid syntax.

To reproduce, use this starting file test.yaml:

this:
  is:
    an: example
  # comment
 
someArray: []

Again, run sops --encrypt --encrypted-regex an test.yaml. This will produce:

this:
    is:
        an: ENC[AES256_GCM,data:BVubel9Etw==,iv:eSU4vD0Gl+ArqJYXz601lTYCcpuXGfdMj5ViU2jGOaE=,tag:Du0JCAnjc+pGYBRBqpIg+w==,type:str]
someArray:
# comment
[]

sops:
    kms: []
    gcp_kms: []
    azure_kv: []
    hc_vault: []
    age:
        - recipient: age1atpv5v2yrfuxxpe2gpddfq2n5g7vck89eg8fjqytrjrxy7plqfcs7l7788
          enc: |
            -----BEGIN AGE ENCRYPTED FILE-----
            YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSBrUDRtTjRCaktvbnhFQVpq
            ZGRyVm9rL2YrOUNYdnExOXVlQWhiczhZOVNvClFBVTZyWjh4ck53aE55cC9pZThv
            NExTS3BpZFdpWG9tMFZOeUZ3dFNtYncKLS0tIFRBTmYwNzd3dXZGN3pOYlBraS9y
            aklTbVpZNm54MjNVaXQ5bTRCQURkWU0KkoNrtSRxjGBjCGjAR8HRZqRH8tAZ432B
            HP7spoDOqtsRUuZPhBqQpvV+mP50z9ztNqrw5f8MDIirXsiw1qvYyg==
            -----END AGE ENCRYPTED FILE-----
    lastmodified: "2022-05-31T16:21:18Z"
    mac: ENC[AES256_GCM,data:BBYePXfdlh6EwLdvft7OncjUWpE2WrLdhtaKNONPZymdv7q23QEGJJxG3ktE02PcdS2aDsflwM3wSkPq5PPQSQXVZbtIMutKCdrKZGerEPUSF6KXy00CqiLn12kbO31JIisw5K+JvmhQ0vxopPAn0o4w0hABvy9Pt/ZKFt8pjkU=,iv:n+hzjCnrxQL986BmrTbToS3dWiWKPK8uRxdN+Dqc8xA=,tag:CGKCtco5LiZyhAbInkBaAQ==,type:str]
    pgp: []
    encrypted_regex: an
    version: 3.7.3

As you can see, # comment got in the middle of the empty array definition. This breaks the YAML syntax; the file can't even be decrypted anymore!

lucabello avatar May 31 '22 16:05 lucabello

Thanks for this report! I can confirm that this happens with the latest version.

felixfontein avatar May 31 '22 17:05 felixfontein

Interestingly I cannot reproduce this when removing the empty line after the comment... Then in both cases it works fine. But if there is an empty line after the comment, the error happens.

felixfontein avatar May 31 '22 17:05 felixfontein

#1069 should fix this issue. There is the other problem that if there is no newline, that the comment will be parsed as a foot comment for the innermost map entry (an: example) and thus it will be indented on another level after re-serialization, but that seems to be something caused by yaml.v3.

felixfontein avatar May 31 '22 19:05 felixfontein

I also noticed it can displace multiple comments into the sops metadata, as long as they are in consecutive lines after the # comment in my example and they are "indented enough"

But anyway, thanks for addressing it quickly! Hope it gets reviewed and merged soon :)

lucabello avatar Jun 01 '22 07:06 lucabello

Just wanted to add to this issue, as i think it might be the same problem. There is a helm chart values.yaml file for the kube-prometheus-stack located at https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml that also causes invalid syntax when you try to encrypt and decrypt the file.

# sops --version
sops 3.7.3 (latest)

It also seems like the output of the encryption removes all blank lines that might have been present in the original file.

In addition, some of the indention is also getting changed. For example the first entry, "defaultRules:"

BlackBsd avatar Sep 14 '22 20:09 BlackBsd

@BlackBsd indeed, there are more such bugs. A minimal reproducer for the issue you reported is

a:
  b:
    c:
      d: ""
    # e

# f
g: {}

It also seems like the output of the encryption removes all blank lines that might have been present in the original file.

In addition, some of the indention is also getting changed. For example the first entry, "defaultRules:"

These are both normal and to be expected. Sops should produce semantically equivalent YAML files (including comments), not the original YAML file back.

felixfontein avatar Sep 15 '22 19:09 felixfontein

Meh. Now I tried to spend quite some time to figure out why this is still not fixed, only to realize that the fix merged (#1069) works fine, but hasn't been released yet.

So well, this is already fixed on the develop branch since beginning of June...

felixfontein avatar Sep 18 '22 13:09 felixfontein

That is awesome, thank you. When do you foresee this making it into a new release?

BlackBsd avatar Sep 19 '22 14:09 BlackBsd

I wish I would know... @ajvb do you have any indication when a new (bugfix) release will happen?

felixfontein avatar Oct 06 '22 11:10 felixfontein

any update on this?

strowi avatar Nov 04 '22 16:11 strowi

There has not been a new release, so: no. Unfortunately not. :(

felixfontein avatar Nov 27 '22 20:11 felixfontein

This will be included in the upcoming release.

hiddeco avatar Aug 24 '23 16:08 hiddeco

In version 3.8.1 It is still displacing comments in YAML

digitalstudium avatar Feb 29 '24 14:02 digitalstudium