
gpg fails to find key due to overridden locale

Open jgalar opened this issue 4 years ago • 3 comments

Description

I am using GitPython to sign tags as such:

repo.git.tag(
    "-s",
    "v{}".format(str(new_version)),
    "-m Version {}".format(str(new_version)),
)

This fails with the following error/exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/jgalar/.cache/pypoetry/virtualenvs/reml-06EzQDmo-py3.9/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/jgalar/.cache/pypoetry/virtualenvs/reml-06EzQDmo-py3.9/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/jgalar/.cache/pypoetry/virtualenvs/reml-06EzQDmo-py3.9/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jgalar/.cache/pypoetry/virtualenvs/reml-06EzQDmo-py3.9/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/jgalar/EfficiOS/src/reml/reml/cli.py", line 93, in main
    release = project.release(
  File "/home/jgalar/EfficiOS/src/reml/reml/project.py", line 443, in release
    self._commit_and_tag(new_version)
  File "/home/jgalar/EfficiOS/src/reml/reml/lttngtools.py", line 41, in _commit_and_tag
    self._repo.git.tag(
  File "/home/jgalar/.cache/pypoetry/virtualenvs/reml-06EzQDmo-py3.9/lib/python3.9/site-packages/git/cmd.py", line 545, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/home/jgalar/.cache/pypoetry/virtualenvs/reml-06EzQDmo-py3.9/lib/python3.9/site-packages/git/cmd.py", line 1011, in _call_process
    return self.execute(call, **exec_kwargs)
  File "/home/jgalar/.cache/pypoetry/virtualenvs/reml-06EzQDmo-py3.9/lib/python3.9/site-packages/git/cmd.py", line 828, in execute
    raise GitCommandError(command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git tag -s v2.12.4 -m Version 2.12.4
  stderr: 'error: gpg failed to sign the data
error: unable to sign the tag'

Debugging the problem

I was initially confused since I can sign tags correctly from the command line using the git tag -s ... command directly.

Digging a bit, I saw that git invokes gpg with the following arguments both when I tag using the git client directly and when using GitPython.

argv[0] = /usr/bin/gpg2
argv[1] = --status-fd=2
argv[2] = -bsau
argv[3] = Jérémie Galarneau <[email protected]>

This seemed to point to something more subtle, possibly related to the process's environment.

I couldn't find a way for git to provide the stderr of gpg to get a better error report. Thus, I modified GnuPG 2.2.27 and rebuilt it to output the errors to a log file. This yielded the following error:

[GNUPG:] INV_SGNR 9 JÃ©rÃ©mie Galarneau <[email protected]>
[GNUPG:] FAILURE sign 17
gpg: signing failed: No secret key

If you look at the first line, you will see that the é characters in my name were changed to Ã©. This typically happens when the UTF-8 bytes of my name are interpreted as ISO/IEC 8859-1.
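
The garbling can be reproduced in a few lines of Python. This is only an illustrative sketch (the variable names are made up here): it encodes the name as UTF-8 and then decodes the same bytes as ISO/IEC 8859-1, which appears to be what effectively happens to the name when gpg runs under a single-byte locale.

```python
# "Jérémie", written with escapes so the example does not depend on file encoding.
name = "J\u00e9r\u00e9mie"

# Each "é" becomes the two bytes 0xC3 0xA9 in UTF-8.
utf8_bytes = name.encode("utf-8")

# Reading those bytes as ISO/IEC 8859-1 maps every byte to one character,
# so each "é" turns into the two characters "Ã" and "©".
garbled = utf8_bytes.decode("iso-8859-1")

print(garbled)
```

The garbled string no longer matches any user ID in the keyring, which would be consistent with the "No secret key" failure.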

This clued me in that something funny related to locales was happening.

I dumped and compared the environment of the gpg process in the two scenarios (CLI use and GitPython) and saw that the only meaningful difference was that the LANGUAGE and LC_ALL environment variables were set to C when GitPython was involved.

Indeed, invoking git with LC_ALL="C" LANGUAGE="C" git tag -s [...] reproduced the problem.
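
For anyone wanting to repeat the comparison, here is a minimal sketch of the environment diff in Python. The helper name diff_env is hypothetical, and the second environment is simulated by applying the two overrides described above rather than captured from a real gpg process.

```python
import os


def diff_env(reference, other):
    """Return {name: (reference_value, other_value)} for variables that differ."""
    names = set(reference) | set(other)
    return {
        name: (reference.get(name), other.get(name))
        for name in sorted(names)
        if reference.get(name) != other.get(name)
    }


# Environment of a normal shell session (here: whatever this process inherited).
cli_env = dict(os.environ)

# The same environment with LANGUAGE and LC_ALL forced to C.
forced_env = dict(cli_env, LANGUAGE="C", LC_ALL="C")

for name, (before, after) in diff_env(cli_env, forced_env).items():
    print(f"{name}: {before!r} -> {after!r}")
```

Passing forced_env as the env argument of subprocess.run should behave like the LC_ALL="C" LANGUAGE="C" prefix on the command line.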

Cause

Looking at the GitPython code, I found that git is invoked with those environment variables set: https://github.com/gitpython-developers/GitPython/blame/b3778ec/git/cmd.py#L694

I am unsure what "parsing code" the comments are referring to so I can't comment on the reasons why this is done. However, forcing a C locale will cause these kinds of erroneous encoding conversions for people who, like me, have non-ASCII names.

For what it's worth, my locale is LANG=en_CA.utf8.

I would guess that forcing the locale to any UTF-8 English locale would work around most issues and still provide GitPython with an English output.

My workaround

I found out that setting the signingKey property to my KEYID in my .gitconfig causes git to invoke gpg with the KEYID instead of the name property.

[user]
	name = Jérémie Galarneau
	email = [email protected]
	signingKey = MY_KEY_ID_HERE

This works around the problem since no accented letters are used in the gpg invocation.

jgalar, Feb 23 '21 04:02

I am absolutely amazed by this issue and write-up, which is nothing short of an exciting detective story - thanks for that!

GitPython should definitely not enforce ASCII anywhere even though it requires an English locale for parsing its output, and I would hope you will find the time to submit a PR implementing the suggestion provided here:

"I would guess that forcing the locale to any UTF-8 English locale would work around most issues and still provide GitPython with an English output."

It would be a good opportunity to embed Canada literally into GitPython's codebase :).

Byron, Feb 23 '21 15:02

"I would hope you will find the time to submit a PR implementing the suggestion provided here"

Sure, I'm glad to see that this isn't a non-starter :) I'm just not sure what fix you would find suitable/clean enough.

"It would be a good opportunity to embed Canada literally into GitPython's codebase :)."

I'm afraid few users will have en_CA.utf8 available. The good thing is that if a locale is not available, libc falls back to the default C, which at least doesn't break more things.

I would propose we choose a locale that we think will be available to most users and rely on falling back to C in the rare cases where it isn't. In theory C.UTF-8 would be a good fit, but it's not available everywhere (Manjaro and Arch Linux come to mind).

In my experience, en_US.utf8 is pretty widely available, but I'm not sure what is typically available on Windows, macOS, and the various BSDs. I can look into it a bit more.

Otherwise, we can also go all-out and list the locales on the system and look for one that matches en_..\.utf8, but I'm not sure it's worth it.
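
A rough sketch of what that lookup could look like, assuming the list of names comes from something like the output of locale -a. The function name, the sample list, and the exact pattern (which also accepts the UTF-8 spelling) are hypothetical and not part of GitPython.

```python
import re

# Matches names such as en_US.utf8, en_CA.utf8 or en_GB.UTF-8.
ENGLISH_UTF8 = re.compile(r"en_[A-Z]{2}\.utf-?8", re.IGNORECASE)


def pick_english_utf8_locale(available, fallback="C"):
    """Return the first English UTF-8 locale in `available`, else `fallback`."""
    for name in available:
        if ENGLISH_UTF8.fullmatch(name):
            return name
    return fallback


# Made-up sample of what a system might report.
sample = ["C", "POSIX", "fr_CA.utf8", "en_CA.utf8", "en_US.utf8"]
print(pick_english_utf8_locale(sample))
```
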

What do you think?

jgalar, Feb 23 '21 21:02

Thanks for sharing your insight - if I remember correctly what happened…erm…a decade ago, I was just trying out locales to find one that works everywhere, with UTF-8 not being anything I would know or be concerned about 😅.

It sounds like C.UTF-8 would be preferable, but a quick check revealed that at least on macOS it's not available. en_US.UTF-8 is though, along with locales for many other English-speaking countries.

If libc indeed falls back to C, it should be safe to go with en_US.UTF-8; otherwise it might be worth checking the available locales using the seemingly available locale Python module.
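
One possible way to do that check with the standard locale module is sketched below. The helper name and the candidate list are assumptions for illustration. It probes each name with locale.setlocale and restores the previous setting afterwards. Note that setlocale changes process-wide state, so this is best done once at startup rather than from multiple threads.

```python
import locale


def first_available_locale(candidates, fallback="C"):
    """Return the first name in `candidates` that setlocale accepts."""
    # Calling setlocale without a second argument only queries the current setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
            except locale.Error:
                continue  # not installed on this system, try the next one
            return name
        return fallback
    finally:
        # Leave the process locale as it was found.
        locale.setlocale(locale.LC_CTYPE, saved)


chosen = first_available_locale(["C.UTF-8", "en_US.UTF-8"])
print(chosen)
```

The result depends on which locales are installed, so the printed value will differ between systems.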

Maybe you could try FOO.UTF-8 and see if it does indeed work as expected just to be sure.

Thanks again!

Byron, Feb 24 '21 14:02