commitizen commitizen_emoji cannot write to CHANGELOG

Description

This is because commitizen_emoji uses unicode emoji characters, but commitizen does not specify the file encoding in write_changelog(), so it defaults back to cp1252:

https://github.com/commitizen-tools/commitizen/blob/79165119817cbd0aef25434a85dcbffa3a13acbb/commitizen/commands/changelog.py#L92

This could be fixed by specifying the utf-8 encoding:

with open(self.file_name, "w", encoding='utf-8') as changelog_file: 
   ...

Steps to reproduce

> cz bump --changelog
bump: version 0.4.0 → 0.5.0
tag to create: 0.5.0       
increment detected: MINOR  

Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\hendra11\Code\external\cz-emoji\.venv\Scripts\cz.exe\__main__.py", line 7, in <module>
  File "C:\Users\hendra11\Code\external\cz-emoji\.venv\lib\site-packages\commitizen\cli.py", line 382, in main
    args.func(conf, vars(args))()
  File "C:\Users\hendra11\Code\external\cz-emoji\.venv\lib\site-packages\commitizen\commands\bump.py", line 215, in __call__
    changelog_cmd()
  File "C:\Users\hendra11\Code\external\cz-emoji\.venv\lib\site-packages\commitizen\commands\changelog.py", line 172, in __call__
    self.write_changelog(changelog_out, lines, changelog_meta)
  File "C:\Users\hendra11\Code\external\cz-emoji\.venv\lib\site-packages\commitizen\commands\changelog.py", line 103, in write_changelog
    changelog_file.write(changelog_out)
  File "C:\Program Files\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f389' in position 29: character maps to <undefined>

Current behavior

See Steps to reproduce

Desired behavior

Emojis can be logged to changelog

Screenshots

No response

Environment

cz version: 2.27.0
python version: 3.8.13
platform.system(): 'Windows'

May 22 '22 02:05 adam-grant-hendry

Maybe we could make the encoding configurable?

May 22 '22 10:05 Lee-W

'utf-8' should cover all encodings, but I'm fine with that. You'd have to change BaseCommitizen to grab that config variable and everywhere you have an open used in your codebase, change it to add the encoding.

May 22 '22 17:05 adam-grant-hendry

I seen the same issue on my end, and adding the encoding='utf-8' fixed it.

Jul 28 '22 11:07 fabiopedrosa

Hi @adam-grant-hendry we've release a new version of commitizen. Could you please try whether 2.30.0 solve this issue?

Aug 14 '22 03:08 Lee-W

After I test it, the problem still exists.

Aug 14 '22 07:08 Yusin0903

@Lee-W The problem still persists. I tried on 2.35.0.

You need to specify the utf-8 encoding where open() is used. That is how this can be fixed and it was additionally confirmed by @fabiopedrosa

Oct 14 '22 04:10 adam-grant-hendry

@Lee-W I remember now why I never submitted a PR: I am on Windows and your scripts/test and scripts/format expects /bin/sh, so I can't pass the pre-commit step linter and test.

This can be solved by using Git Bash: Git for Windows ships with git bash (i.e. a sh.exe and bash.exe).

This is another issue and requires another PR to fix.

Oct 14 '22 21:10 adam-grant-hendry

This can be solved by using Git Bash: Git for Windows ships with git bash (i.e. a sh.exe and bash.exe).

Actually, even easier than that: Just switch to using a portable shebang:

#!/usr/bin/env sh

Oct 14 '22 21:10 adam-grant-hendry

Hi @adam-grant-hendry , I notice you create another issue regarding to it. If pre-commit does not work, could you please use git commit ... --no-verify and git push ... --no-verify as a temporary solution before we can sort the /bin/sh issue out?

Oct 17 '22 09:10 Lee-W

It would be even more helpful if you could document this on the doc for windows users. Thanks!

Oct 17 '22 09:10 Lee-W

@woile @Lee-W I have a working implementation with passing unit tests for Unicode characters (that is, emojis). Once PR #605 is merged, I'll push it up.

Oct 23 '22 02:10 adam-grant-hendry

It would be even more helpful if you could document this on the doc for windows users. Thanks!

Yes, great idea. I'll make another PR for docs. I can probably also resolve Issue #515 there as well. I spewed a bunch of comments there and have had some time to learn and practice, so I might clean that up a little bit and clarify things.

Oct 23 '22 16:10 adam-grant-hendry