Multibyte characters are garbled

Open kazuki43zoo opened this issue 3 years ago • 3 comments

When I compile Java code that contains multibyte characters with the javac command, compilation fails.

Java Source Code

public class Test {
  public static void main(String[] args) {
   System.out.println("↓");
  }
}

Procedures

docker run -v $(pwd):/work -it -w /work \
  ghcr.io/graalvm/graalvm-ce:22.0.0.2 bash
bash-4.4# javac Test.java
Test.java:3: error: unmappable character (0xE2) for encoding US-ASCII
   System.out.println("???");
                       ^
Test.java:3: error: unmappable character (0x86) for encoding US-ASCII
   System.out.println("???");
                        ^
Test.java:3: error: unmappable character (0x93) for encoding US-ASCII
   System.out.println("???");
                         ^
bash-4.4#
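
The three byte values in the errors (0xE2, 0x86, 0x93) are the UTF-8 encoding of the single character ↓ (U+2193), which suggests javac is reading the UTF-8 file as US-ASCII. The following is a minimal sketch to illustrate that, not part of the original report; the class name `ArrowBytes` and its helper methods are made up for this illustration. The arrow is written as a `\u2193` escape so that the example itself compiles regardless of the source encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, used only for illustration.
class ArrowBytes {
    // Returns the UTF-8 bytes of s as space-separated uppercase hex.
    public static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes the UTF-8 bytes of s as US-ASCII, the way a reader configured
    // for the wrong encoding would. Each byte above 0x7F cannot be mapped
    // and is replaced by U+FFFD.
    public static String misreadAsAscii(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String arrow = "\u2193"; // the character from Test.java
        System.out.println(utf8Hex(arrow));                 // E2 86 93
        System.out.println(misreadAsAscii(arrow).length()); // 3
    }
}
```

One character becomes three unmappable bytes, which matches the three errors and the three question marks that javac prints for line 3.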

Solution for this behavior

I fixed this by changing the value of the LANG environment variable from en_US.UTF-8 to C.utf8.

docker run -v $(pwd):/work -it -w /work \
  -e LANG=C.utf8 \
  ghcr.io/graalvm/graalvm-ce:22.0.0.2 bash

With this change, compiling and running work fine.

bash-4.4# javac Test.java
bash-4.4# java Test
↓
bash-4.4#
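
As a possible alternative to changing LANG, javac also accepts an `-encoding` option that names the source encoding explicitly. The sketch below is an assumption-laden illustration, not something tested on the image in this report: it drives the compiler through the standard `javax.tools` API, and the class name `EncodingFlagDemo` and the temporary directory prefix are made up. It writes the same Test.java as UTF-8 and compiles it with the encoding passed in.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo class, used only for illustration.
class EncodingFlagDemo {
    // Writes a UTF-8 copy of Test.java containing U+2193 into a temporary
    // directory and compiles it with the given -encoding value.
    // Returns the compiler's exit code (0 means success).
    public static int compileWith(String encoding) {
        try {
            Path dir = Files.createTempDirectory("encoding-demo");
            Path src = dir.resolve("Test.java");
            String code = "public class Test {\n"
                    + "  public static void main(String[] args) {\n"
                    + "    System.out.println(\"\u2193\");\n"
                    + "  }\n"
                    + "}\n";
            Files.write(src, code.getBytes(StandardCharsets.UTF_8));

            // Returns null when no compiler is available (for example on a JRE).
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("no system Java compiler available");
            }
            // Capture diagnostics instead of printing them to stderr.
            ByteArrayOutputStream diagnostics = new ByteArrayOutputStream();
            return javac.run(null, null, diagnostics,
                    "-encoding", encoding, "-d", dir.toString(), src.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8:    exit code " + compileWith("UTF-8"));
        System.out.println("US-ASCII: exit code " + compileWith("US-ASCII"));
    }
}
```

On the command line, the equivalent would be `javac -encoding UTF-8 Test.java`. This only affects how javac reads the source file, so the terminal may still need a UTF-8 locale for `java Test` to print the arrow correctly.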

Proposal

I propose changing the default LANG value from en_US.UTF-8 to C.utf8.

p.s:

The en_US.UTF-8 locale is not installed in the container by default.

bash-4.4# locale -a
C
C.utf8
POSIX
bash-4.4#

kazuki43zoo avatar Feb 07 '22 16:02 kazuki43zoo

Thank you for your report. We are looking into this.

mlouriz avatar Feb 09 '22 10:02 mlouriz

The same here. See also https://github.com/korandoru/hawkeye/issues/57.

tisonkun avatar Mar 21 '23 06:03 tisonkun

I applied a hacky fix in https://github.com/korandoru/hawkeye/commit/8433b641bc338078482fc0b791b27b8df3e80e88.

Altering the default LANG could be a breaking change. I would prefer that the image generate the en_US.UTF-8 locale instead, so that I can remove the ENV command and things work by default.

tisonkun avatar Mar 21 '23 06:03 tisonkun