Multibyte characters are garbled
When I compile Java source code that contains multibyte characters with the javac command, compilation fails.
Java Source Code
public class Test {
    public static void main(String[] args) {
        System.out.println("↓");
    }
}
Procedures
docker run -v $(pwd):/work -it -w /work \
ghcr.io/graalvm/graalvm-ce:22.0.0.2 bash
bash-4.4# javac Test.java
Test.java:3: error: unmappable character (0xE2) for encoding US-ASCII
System.out.println("???");
^
Test.java:3: error: unmappable character (0x86) for encoding US-ASCII
System.out.println("???");
^
Test.java:3: error: unmappable character (0x93) for encoding US-ASCII
System.out.println("???");
^
bash-4.4#
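The three errors line up with the three bytes of the UTF-8 encoding of "↓" (U+2193), none of which is valid US-ASCII. Here is a minimal sketch that shows this outside of javac; the class and method names (`EncodingDemo`, `hexBytes`, `decodeAsAscii`) are made up for illustration, and the character is written as a `\u2193` escape so the demo itself compiles under any source encoding.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Returns the UTF-8 bytes of s as space-separated hex, e.g. "0xE2 0x86 0x93".
    static String hexBytes(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes s as UTF-8, then decodes those bytes as US-ASCII.
    // Every byte >= 0x80 is replaced with U+FFFD by the String constructor.
    static String decodeAsAscii(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String arrow = "\u2193"; // the "↓" from Test.java
        System.out.println(hexBytes(arrow));
        System.out.println(decodeAsAscii(arrow).length());
    }
}
```

The hex values match the 0xE2, 0x86 and 0x93 reported by javac, and the decoded string has one replacement character per byte, which is consistent with the three placeholder characters in the error output.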
Solution for this behavior
I fixed this by changing the value of the LANG environment variable from en_US.UTF-8 to C.utf8.
docker run -v $(pwd):/work -it -w /work \
-e LANG=C.utf8 \
ghcr.io/graalvm/graalvm-ce:22.0.0.2 bash
With this change, it works fine.
bash-4.4# javac Test.java
bash-4.4# java Test
↓
bash-4.4#
Proposal
I propose changing the default LANG value from en_US.UTF-8 to C.utf8.
P.S.
The en_US.UTF-8 locale is not installed in the container by default.
bash-4.4# locale -a
C
C.utf8
POSIX
bash-4.4#
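To see which encoding the JVM actually picked up from the environment, a small diagnostic along these lines may help. This is only a sketch: the class name `LocaleCheck` and the method `describe` are hypothetical, and the exact values printed depend on the JDK version and on the locales available in the container.

```java
import java.nio.charset.Charset;

class LocaleCheck {
    // Collects the LANG variable and the JVM's view of the default encoding.
    static String describe() {
        String lang = System.getenv("LANG"); // null when LANG is unset
        return "LANG=" + lang
            + ", file.encoding=" + System.getProperty("file.encoding")
            + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with the image's default LANG and once with `-e LANG=C.utf8` should make the difference visible. In the failing case I would expect the default charset to be reported as US-ASCII, matching the javac error message, but I have not verified the exact output.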
Thank you for your report. We are looking into this.
Same here. See also https://github.com/korandoru/hawkeye/issues/57.
A hacky fix is in https://github.com/korandoru/hawkeye/commit/8433b641bc338078482fc0b791b27b8df3e80e88.
Altering the default LANG could be a breaking change. I would prefer that the image generate the en_US.UTF-8 locale instead, so that I can remove the ENV instruction and things work by default.