
Emoji input causes display issues and a crash (UnicodeEncodeError) on Windows Terminal

Open kt-devoss opened this issue 1 year ago • 1 comments

Hi,

Summary

When entering instructions that include emoji characters like 🤖,

(A) the emoji is not displayed correctly in the terminal, and
(B) the application crashes during processing with:
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 9509-9510: surrogates not allowed


Auto-generated Report

Aider version: 0.82.1
Python version: 3.12.5
Platform: Windows-10-10.0.19045-SP0
Python implementation: CPython
Virtual environment: Yes
OS: Windows 10 (64bit)
Git version: git version 2.43.0.windows.1

An uncaught exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "__main__.py", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "main.py", line 1146, in main
    coder.run()
  File "base_coder.py", line 856, in run
    self.run_one(user_message, preproc)
  File "base_coder.py", line 903, in run_one
    list(self.send_message(message))
  File "base_coder.py", line 1339, in send_message
    utils.show_messages(messages, functions=self.functions)
  File "utils.py", line 139, in show_messages
    print(formatted_output)
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 9509-9510: surrogates not allowed
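
The two-character range in the error (9509-9510) is consistent with the emoji having reached Python as a UTF-16 surrogate pair instead of a single code point. That is an assumption about the cause, not something confirmed in this thread. A minimal sketch of that failure mode follows; `repair_surrogates` is a hypothetical helper for illustration, not an Aider function:

```python
# U+1F916 (robot face) written as its UTF-16 surrogate pair. That the pasted
# emoji arrived in this form is an assumption, not confirmed by the report.
broken = "Hello, Aider\ud83e\udd16"

try:
    broken.encode("utf-8")
except UnicodeEncodeError as exc:
    print(exc.reason)  # surrogates not allowed


def repair_surrogates(text: str) -> str:
    """Hypothetical helper: merge surrogate pairs, replace lone surrogates."""
    # "surrogatepass" writes the surrogate code units out unchanged as UTF-16,
    # and decoding that UTF-16 recombines valid pairs into one code point.
    raw = text.encode("utf-16-le", "surrogatepass")
    return raw.decode("utf-16-le", "replace")


fixed = repair_surrogates(broken)
print(fixed == "Hello, Aider\U0001F916")  # True
print(len(fixed.encode("utf-8")))         # encoding now succeeds
```

Where such a repair would belong (input handling, message formatting, or before the API call) is left open here.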


Steps to Reproduce

0. Environment

  • OS: Windows 10 (Japanese); the ver command reports: Microsoft Windows [Version 10.0.19045.5737]
  • Terminal: Windows Terminal
  • Shell: PowerShell 7
  • Output of chcp: 65001 (= UTF-8)

1. Launch Aider

aider -v

2. Paste the following string (copied from clipboard)

Create a Python script that prints "Hello, Aider🤖"
  • Additional Notes:
    • The emoji 🤖 is rendered as ?? in Aider's user input prompt, which is not expected behavior. ← (A)

    • However, the outer terminal (i.e. Windows Terminal) does support emoji input/output, as demonstrated below:

PS > echo 'Create a Python script that prints "Hello, Aider🤖"'
Create a Python script that prints "Hello, Aider🤖"

3. Crash occurs

An uncaught exception occurs.
Aider crashes. ← (B)

Expected Behavior

(A) Input containing special characters (such as emojis) should be displayed correctly, not as ??.

(B) Input containing special characters should not cause a crash — it should be processed appropriately or at least skipped gracefully.


P.S. Thank you for the great software!

kt-devoss, Apr 19 '25 11:04

(I realize that this follow-up might potentially warrant a separate issue, as the behavior differs from the originally reported crash.
However, since it seems to stem from the same underlying cause (emoji/surrogate handling?), I’m adding it here for now ...)

Additional Observation (Normal Mode: without -v)

A different behavior occurs when running Aider in Normal Mode (i.e., without the -v flag).

Unlike Verbose Mode (i.e., with the -v flag) described above, where the application crashes immediately on emoji input, this mode does not crash. Instead, it enters a long series of retries and becomes increasingly unresponsive.

 Create a Python script that prints "Hello, Aider🤖"
  • Aider does not crash immediately, but a series of exceptions and retries occur.
  • The emoji still appears as ?? in the user input prompt, as previously noted.
  • The following error messages are shown repeatedly:
Exception 'utf-8' codec can't encode characters in position 49-50: surrogates not allowed
Press ENTER to continue...

litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
Retrying in 0.2 seconds...
litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
Retrying in 0.5 seconds...
litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
Retrying in 1.0 seconds...
litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
Retrying in 2.0 seconds...
litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
Retrying in 4.0 seconds...
litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
Retrying in 8.0 seconds...
litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
Retrying in 16.0 seconds...
litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
Retrying in 32.0 seconds...

litellm.APIError: APIError: OpenAIException - 'utf-8' codec can't encode characters in position 8278-8279: surrogates not allowed
  • If not explicitly interrupted by the user, the retries follow an exponential backoff pattern, reaching up to a maximum of 32 seconds per attempt.
  • As a result, the total accumulated wait time can exceed a full minute, leading to severely degraded responsiveness.
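
To put a number on the accumulated wait, the delays printed in the log above can simply be summed. This is only arithmetic on the values shown in this comment; it says nothing about how Aider or litellm compute the schedule internally:

```python
# Retry delays exactly as printed in the log above, in seconds.
delays = [0.2, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]

total_wait = sum(delays)
print(f"{len(delays)} retries, {total_wait:.1f} s of waiting")
# 8 retries, 63.7 s of waiting
```

Since encoding the same string fails the same way every time, none of these retries can succeed, so the whole wait is spent on a request that never leaves the machine.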

(Not sure, ... but this might involve litellm or something lower-level too — not just Aider. The behavior seems like it could have multiple causes.)

kt-devoss, Apr 19 '25 12:04

I also hit a similar error when I asked aider to "Please change all non-ascii utf-8 characters to ascii strings (e.g., utf-8 character for phi to ascii 'phi')". I think it generated a Unicode character by itself, which caused the problem and produced this:

Exception 'utf-8' codec can't encode character '\udce2' in position 144: surrogates not allowed
Press ENTER to continue...

litellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not
allowed
The API provider's servers are down or overloaded.
Retrying in 0.2 seconds...
litellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not
allowed
The API provider's servers are down or overloaded.
Retrying in 0.5 seconds...
litellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not
allowed
The API provider's servers are down or overloaded.
Retrying in 1.0 seconds...
âlitellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not
allowed
The API provider's servers are down or overloaded.
Retrying in 2.0 seconds...
âlitellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not
allowed
The API provider's servers are down or overloaded.
Retrying in 4.0 seconds...
litellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not
allowed
The API provider's servers are down or overloaded.
Retrying in 8.0 seconds...
âlitellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not
allowed
The API provider's servers are down or overloaded.
Retrying in 16.0 seconds...
âlitellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not allowed
The API provider's servers are down or overloaded.
Retrying in 32.0 seconds...
litellm.InternalServerError: InternalServerError: OpenAIException - 'utf-8' codec can't encode character '\udce2' in position 45992: surrogates not allowed
The API provider's servers are down or overloaded.

henrypfister, Jul 07 '25 18:07

This is the most-commented report of this bug, so I will leave the ping here, but at the time of writing there are already 6 different bug reports on this exact same issue, one of which has the label https://github.com/Aider-AI/aider/labels/priority and another the label https://github.com/Aider-AI/aider/labels/bug.

For me the error is:

  File "C:\Users\marci\AppData\Roaming\uv\python\cpython-3.12.11-windows-x86_64-none\Lib\encodings\cp1250.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\u22ee' in position 0: character maps to <undefined>

U+22EE is the ⋮ (Vertical Ellipsis) character.
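
The 'charmap' failure can be reproduced in isolation, independent of Aider, because cp1250 has no mapping for U+22EE. A small sketch that uses only the standard codecs:

```python
ch = "\u22ee"  # VERTICAL ELLIPSIS, the character named in the traceback

try:
    ch.encode("cp1250")
except UnicodeEncodeError as exc:
    print(exc.reason)  # character maps to <undefined>

# With errors="replace" the encoder substitutes "?" instead of raising.
print(ch.encode("cp1250", errors="replace"))  # b'?'

# UTF-8 can represent the character directly.
print(ch.encode("utf-8"))  # b'\xe2\x8b\xae'
```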

The error persists even if I run it with --no-pretty

Steps to reproduce:

  1. Mock your repository with a file containing in the filename
  2. Run aider --no-pretty --show-repo-map > map.md
  3. ...
  4. Get the error

See also: #3844

Steps to bypass the bug:

If you use PowerShell, run:

$env:PYTHONUTF8 = "1"
$env:PYTHONIOENCODING = "utf-8"

and only then run aider --no-pretty --show-repo-map > map.md
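
For context: PYTHONUTF8=1 turns on Python's UTF-8 mode and PYTHONIOENCODING sets the encoding of the standard streams, so both keep redirected output from falling back to the locale code page (cp1250 in the traceback above). A rough in-process equivalent is sketched below. `make_utf8_safe` is a hypothetical helper, not something Aider provides, and the in-memory stream only stands in for a redirected stdout:

```python
import io
import sys


def make_utf8_safe(stream):
    """Hypothetical helper: switch a text stream to UTF-8 if it supports it."""
    # TextIOWrapper.reconfigure exists since Python 3.7; other stream types
    # (for example io.StringIO) are returned unchanged.
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is not None:
        reconfigure(encoding="utf-8", errors="replace")
    return stream


# In-memory stand-in for stdout redirected to a file under a cp1250 locale.
buffer = io.BytesIO()
redirected = io.TextIOWrapper(buffer, encoding="cp1250")
make_utf8_safe(redirected)
redirected.write("\u22ee")
redirected.flush()
print(buffer.getvalue())  # b'\xe2\x8b\xae'

# What the current interpreter is actually using:
print(bool(sys.flags.utf8_mode), getattr(sys.stdout, "encoding", None))
```

With errors="replace", any characters that still cannot be encoded (such as lone surrogates) are written as "?" instead of raising.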

Environment:

aider 0.86.1
Windows 11 24H2
In both: Windows Git Bash and Windows PowerShell

JareelSkaj, Aug 14 '25 20:08