Help text has wrong encoding when "Beta: Use Unicode UTF-8 for worldwide language support" is enabled
Describe the bug I'm using the Chinese Simplified version of Windows 10 with "Beta: Use Unicode UTF-8 for worldwide language support" enabled. The help text printed by java is still encoded in GBK, so it is not displayed correctly. Other text printed by java.exe is also affected. I don't know whether this happens for other languages.
I also tried AdoptOpenJDK, and it has the same problem, so maybe this is an upstream bug? Where should I report it?
Thanks
Steps to reproduce the behavior:
- Windows 10, Chinese Simplified
- Enable "Beta: Use Unicode UTF-8 for worldwide language support" in control panel
- java --help
- See error
Expected behavior Help text is shown correctly.
Screenshots

output of java --help, help text encoded with GBK while the system is using UTF-8
java-help.txt
Hi @everything411 - If this is also occurring with AdoptOpenJDK then the likely issue is with OpenJDK itself (or some common configuration that you need to set). We'll see if we can reproduce and advise on next steps (or submit an upstream issue on your behalf).
@gdams could you take a look into this please and verify if this also occurs on Adoptium?
Fixed in the latest Adoptium JDK 17 beta. The encoding is still wrong for Adoptium JDK 16, JDK 11 and JDK 8.
@everything411 could you please check if this happens with the MS Build of OpenJDK binaries? In which versions does the problem appear, and in which does it not?
@brunoborges
MS Build of OpenJDK 17: the problem does not appear. MS Build of OpenJDK 11: the problem appears.
So it seems that this problem is fixed in upstream JDK 17 but not in other JDK versions?
@everything411 thanks for testing! If you don't mind one final question: is there an OpenJDK 11 build that you've seen that doesn't have this problem?
Maybe Zulu, or Oracle JDK?
Hi @everything411 @cyhhao
Could you please check if this issue is still happening with the packages published at microsoft.com/openjdk ?
@brunoborges
> chcp
Active code page: 65001
> java --version
openjdk 11.0.14.1 2022-02-08 LTS
OpenJDK Runtime Environment Microsoft-31205 (build 11.0.14.1+1-LTS)
OpenJDK 64-Bit Server VM Microsoft-31205 (build 11.0.14.1+1-LTS, mixed mode)
> java --version
openjdk 17.0.3 2022-04-19 LTS
OpenJDK Runtime Environment Microsoft-32931 (build 17.0.3+7-LTS)
OpenJDK 64-Bit Server VM Microsoft-32931 (build 17.0.3+7-LTS, mixed mode, sharing)
The same result as before: OK for JDK 17, bad encoding for JDK 11.
I also noticed that javac's help text is still broken in both JDK 11 and JDK 17. The text is GBK-encoded and then printed to the UTF-8 console, which produces these "�" characters:
> javac
�÷�: javac <options> <source files>
����, ���ܵ�ѡ�����:
@<filename> ���ļ���ȡѡ����ļ���
-Akey[=value] ���ݸ�ע�ʹ�������ѡ��
--add-modules <�>(,<�>)*
���˳�ʼģ��֮��Ҫ�����ĸ�ģ��; ��� <module>
Ϊ ALL-MODULE-PATH, ��Ϊģ��·���е�����ģ�顣
--boot-class-path <path>, -bootclasspath <path>
�����������ļ���λ��
The encoding of compile error messages is bad too: GBK-encoded text printed to the UTF-8 console.
> java .\test.java
.\test.java:7: ����: δ������쳣����FileNotFoundException; ���������в���������Ա��׳�
InputStreamReader fileReader = new InputStreamReader(new FileInputStream(new File("not exist")), StandardCharsets.UTF_8);
^
1 ������
错误: 编译失败
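The "�" characters in the output above are what a UTF-8 decoder substitutes for bytes that are not valid UTF-8. A minimal sketch of that effect, assuming the JDK's GBK charset is available; the class and method names are made up for illustration and are not part of any JDK tool:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class GbkOnUtf8Console {
    // Encode the text as GBK, then interpret those bytes as UTF-8,
    // roughly what happens when GBK output reaches a UTF-8 console.
    static String gbkBytesReadAsUtf8(String text) {
        byte[] gbk = text.getBytes(Charset.forName("GBK"));
        return new String(gbk, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u7528\u6cd5" is the first word of javac's Chinese usage line,
        // written with escapes so the source file encoding does not matter.
        String shown = gbkBytesReadAsUtf8("\u7528\u6cd5");
        for (char c : shown.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

The code points are printed instead of the characters so that the result does not depend on the console encoding. U+FFFD is the replacement character shown as "�".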
I also found that the encoding of runtime exception messages is OK for JDK 11 but bad for JDK 17.
For JDK 11: "系统找不到指定的文件" means "The system cannot find the file specified" (i.e. "No such file or directory") in English.
> java .\test.java
Exception in thread "main" java.io.FileNotFoundException: not exist (系统找不到指定的文件)
at java.base/java.io.FileInputStream.open0(Native Method)
at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
at Test.main(test.java:5)
For JDK 17, "绯荤粺鎵句笉鍒版寚瀹氱殑鏂囦欢銆�" is meaningless. It appears to be the text "系统找不到指定的文件" encoded in UTF-8 and then decoded as GBK; the GBK-decoded text is then encoded in UTF-8 and printed to the UTF-8 console.
> java .\test.java
Exception in thread "main" java.io.FileNotFoundException: not exist (绯荤粺鎵句笉鍒版寚瀹氱殑鏂囦欢銆�)
at java.base/java.io.FileInputStream.open0(Native Method)
at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
at Test.main(test.java:5)
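The double conversion suspected above (UTF-8 bytes decoded as GBK) can be reproduced in isolation. This is only a sketch of the suspected mechanism, not the JDK's actual code path; the class and method names are hypothetical, and it assumes the GBK charset is available in the running JDK:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8ReadAsGbk {
    // Encode the text as UTF-8, then (wrongly) decode those bytes as GBK.
    static String utf8BytesReadAsGbk(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("GBK"));
    }

    // Undo the mistake: re-encode as GBK and decode as UTF-8.
    // This only works when no bytes were lost in the wrong decode.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(Charset.forName("GBK"));
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u7cfb\u7edf" is the first two characters of the Windows error text.
        String original = "\u7cfb\u7edf";
        String garbled = utf8BytesReadAsGbk(original);
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(original));
    }
}
```

When the UTF-8 byte count is odd, the last byte has no partner to form a GBK pair, which is consistent with the trailing "�" in the exception message above.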
output of java.exe -XshowSettings:properties -version for jdk11
> java.exe -XshowSettings:properties -version
Property settings:
awt.toolkit = sun.awt.windows.WToolkit
file.encoding = GBK
file.separator = \
java.awt.graphicsenv = sun.awt.Win32GraphicsEnvironment
java.awt.printerjob = sun.awt.windows.WPrinterJob
java.class.path =
java.class.version = 55.0
java.home = C:\Program Files\Microsoft\jdk-11.0.14.101-hotspot
java.io.tmpdir = C:\Users\EVERYT~1\AppData\Local\Temp\
java.library.path = C:\Program Files\Microsoft\jdk-11.0.14.101-hotspot\bin
(omitted)
.
java.runtime.name = OpenJDK Runtime Environment
java.runtime.version = 11.0.14.1+1-LTS
java.specification.name = Java Platform API Specification
java.specification.vendor = Oracle Corporation
java.specification.version = 11
java.vendor = Microsoft
java.vendor.url = https://www.microsoft.com
java.vendor.url.bug = https://github.com/microsoft/openjdk/issues
java.vendor.version = Microsoft-31205
java.version = 11.0.14.1
java.version.date = 2022-02-08
java.vm.compressedOopsMode = Zero based
java.vm.info = mixed mode
java.vm.name = OpenJDK 64-Bit Server VM
java.vm.specification.name = Java Virtual Machine Specification
java.vm.specification.vendor = Oracle Corporation
java.vm.specification.version = 11
java.vm.vendor = Microsoft
java.vm.version = 11.0.14.1+1-LTS
jdk.debug = release
line.separator = \r \n
os.arch = amd64
os.name = Windows 11
os.version = 10.0
path.separator = ;
sun.arch.data.model = 64
sun.boot.library.path = C:\Program Files\Microsoft\jdk-11.0.14.101-hotspot\bin
sun.cpu.endian = little
sun.cpu.isalist = amd64
sun.desktop = windows
sun.io.unicode.encoding = UnicodeLittle
sun.java.launcher = SUN_STANDARD
sun.jnu.encoding = GBK
sun.management.compiler = HotSpot 64-Bit Tiered Compilers
sun.os.patch.level =
sun.stderr.encoding = cp65001
sun.stdout.encoding = cp65001
user.country = CN
user.dir = C:\Users\everything411
user.home = C:\Users\everything411
user.language = zh
user.name = everything411
user.script =
user.timezone =
user.variant =
openjdk version "11.0.14.1" 2022-02-08 LTS
OpenJDK Runtime Environment Microsoft-31205 (build 11.0.14.1+1-LTS)
OpenJDK 64-Bit Server VM Microsoft-31205 (build 11.0.14.1+1-LTS, mixed mode)
output of java.exe -XshowSettings:properties -version for jdk17
file.encoding = GBK
file.separator = \
java.class.path =
java.class.version = 61.0
java.home = C:\Program Files\Microsoft\jdk-17.0.3.7-hotspot
java.io.tmpdir = C:\Users\EVERYT~1\AppData\Local\Temp\
java.library.path = C:\Program Files\Microsoft\jdk-17.0.3.7-hotspot\bin
(omitted)
.
java.runtime.name = OpenJDK Runtime Environment
java.runtime.version = 17.0.3+7-LTS
java.specification.name = Java Platform API Specification
java.specification.vendor = Oracle Corporation
java.specification.version = 17
java.vendor = Microsoft
java.vendor.url = https://www.microsoft.com
java.vendor.url.bug = https://github.com/microsoft/openjdk/issues
java.vendor.version = Microsoft-32931
java.version = 17.0.3
java.version.date = 2022-04-19
java.vm.compressedOopsMode = Zero based
java.vm.info = mixed mode, sharing
java.vm.name = OpenJDK 64-Bit Server VM
java.vm.specification.name = Java Virtual Machine Specification
java.vm.specification.vendor = Oracle Corporation
java.vm.specification.version = 17
java.vm.vendor = Microsoft
java.vm.version = 17.0.3+7-LTS
jdk.debug = release
line.separator = \r \n
native.encoding = GBK
os.arch = amd64
os.name = Windows 11
os.version = 10.0
path.separator = ;
sun.arch.data.model = 64
sun.boot.library.path = C:\Program Files\Microsoft\jdk-17.0.3.7-hotspot\bin
sun.cpu.endian = little
sun.cpu.isalist = amd64
sun.io.unicode.encoding = UnicodeLittle
sun.java.launcher = SUN_STANDARD
sun.jnu.encoding = GBK
sun.management.compiler = HotSpot 64-Bit Tiered Compilers
sun.os.patch.level =
sun.stderr.encoding = UTF-8
sun.stdout.encoding = UTF-8
user.country = CN
user.dir = C:\Users\everything411
user.home = C:\Users\everything411
user.language = zh
user.name = everything411
user.script =
user.variant =
openjdk version "17.0.3" 2022-04-19 LTS
OpenJDK Runtime Environment Microsoft-32931 (build 17.0.3+7-LTS)
OpenJDK 64-Bit Server VM Microsoft-32931 (build 17.0.3+7-LTS, mixed mode, sharing)
I tried Temurin JDK 18, and java and javac are OK.
> java.exe
用法:java [options] <主类> [args...]
(执行类)
或 java [options] -jar <jar 文件> [args...]
(执行 jar 文件)
或 java [options] -m <模块>[/<主类>] [args...]
java [options] --module <模块>[/<主类>] [args...]
(执行模块中的主类)
或 java [options] <源文件> [args]
(执行单个源文件程序)
> javac.exe
用法: javac <options> <source files>
其中, 可能的选项包括:
@<filename> 从文件读取选项和文件名
-Akey[=value] 传递给注释处理程序的选项
--add-modules <模块>(,<模块>)*
除了初始模块之外要解析的根模块; 如果 <module>
为 ALL-MODULE-PATH, 则为模块路径中的所有模块。
--boot-class-path <path>, -bootclasspath <path>
覆盖引导类文件的位置
> java.exe .\test.java
.\test.java:5: 错误: 未报告的异常错误FileNotFoundException; 必须对其进行捕获或声明以便抛出
InputStreamReader fileReader = new InputStreamReader(new FileInputStream(new File("not exist")), StandardCharsets.UTF_8);
^
1 个错误
错误: 编译失败
However, runtime exception messages are still bad, the same problem as with JDK 17:
Exception in thread "main" java.io.FileNotFoundException: not exist (绯荤粺鎵句笉鍒版寚瀹氱殑鏂囦欢銆�)
at java.base/java.io.FileInputStream.open0(Native Method)
at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
at Test.main(test.java:5)
output of java.exe -XshowSettings:properties -version for jdk18
Property settings:
file.encoding = UTF-8
file.separator = \
java.class.path =
java.class.version = 62.0
java.home = C:\Program Files\Eclipse Adoptium\jdk-18.0.1.10-hotspot
java.io.tmpdir = C:\Users\EVERYT~1\AppData\Local\Temp\
java.library.path = C:\Program Files\Eclipse Adoptium\jdk-18.0.1.10-hotspot\bin
(omitted)
.
java.runtime.name = OpenJDK Runtime Environment
java.runtime.version = 18.0.1+10
java.specification.name = Java Platform API Specification
java.specification.vendor = Oracle Corporation
java.specification.version = 18
java.vendor = Eclipse Adoptium
java.vendor.url = https://adoptium.net/
java.vendor.url.bug = https://github.com/adoptium/adoptium-support/issues
java.vendor.version = Temurin-18.0.1+10
java.version = 18.0.1
java.version.date = 2022-04-19
java.vm.compressedOopsMode = Zero based
java.vm.info = mixed mode, sharing
java.vm.name = OpenJDK 64-Bit Server VM
java.vm.specification.name = Java Virtual Machine Specification
java.vm.specification.vendor = Oracle Corporation
java.vm.specification.version = 18
java.vm.vendor = Eclipse Adoptium
java.vm.version = 18.0.1+10
jdk.debug = release
line.separator = \r \n
native.encoding = GBK
os.arch = amd64
os.name = Windows 11
os.version = 10.0
path.separator = ;
sun.arch.data.model = 64
sun.boot.library.path = C:\Program Files\Eclipse Adoptium\jdk-18.0.1.10-hotspot\bin
sun.cpu.endian = little
sun.cpu.isalist = amd64
sun.io.unicode.encoding = UnicodeLittle
sun.java.launcher = SUN_STANDARD
sun.jnu.encoding = GBK
sun.management.compiler = HotSpot 64-Bit Tiered Compilers
sun.os.patch.level =
sun.stderr.encoding = UTF-8
sun.stdout.encoding = UTF-8
user.country = CN
user.dir = C:\Users\everything411
user.home = C:\Users\everything411
user.language = zh
user.name = everything411
user.script =
user.variant =
openjdk version "18.0.1" 2022-04-19
OpenJDK Runtime Environment Temurin-18.0.1+10 (build 18.0.1+10)
OpenJDK 64-Bit Server VM Temurin-18.0.1+10 (build 18.0.1+10, mixed mode, sharing)
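To compare only the encoding-related properties from the dumps above across JDK versions, a small helper like the following could be used. It is a sketch: the class name is made up, and the property keys are just the ones visible in the -XshowSettings output in this thread.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

class EncodingProps {
    // Property names taken from the -XshowSettings:properties dumps above.
    static final String[] KEYS = {
        "file.encoding",
        "native.encoding",
        "sun.jnu.encoding",
        "sun.stdout.encoding",
        "sun.stderr.encoding"
    };

    // Collect the properties in a stable order; a value is null when
    // the running JDK does not define that property.
    static Map<String, String> collect() {
        Map<String, String> result = new LinkedHashMap<>();
        for (String key : KEYS) {
            result.put(key, System.getProperty(key));
        }
        return result;
    }

    public static void main(String[] args) {
        collect().forEach((k, v) -> System.out.println(k + " = " + v));
        System.out.println("defaultCharset = " + Charset.defaultCharset());
    }
}
```

For example, native.encoding does not appear in the JDK 11 dump above, so it would print as null there.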
Digression: Why do you want to enable this beta utf-8 option?
@imba-tjd Linux and macOS both set the default encoding to UTF-8. I need to share source code containing Chinese characters between my Windows machine and my Linux machine (WSL1 and WSL2 use UTF-8, too).
Did you try to input Chinese from stdin? Try this:
System.out.println(new Scanner(System.in).nextLine());
You will find that it fails to read if you have enabled the beta UTF-8 option.
I have noticed this bug before. In fact, it affects not only Java: C scanf, C++ cin, and C# Console.ReadLine all fail to accept Chinese input when UTF-8 is enabled.
I believe this is a Windows console bug rather than a language runtime bug. See https://docs.microsoft.com/zh-cn/windows/console/classic-vs-vt and https://github.com/microsoft/terminal/issues/7777
The issue https://bugs.openjdk.org/browse/JDK-8272352 might be relevant here; it was backported to OpenJDK 11.0.17 and Java 17.0.5 quite recently.
(Just passing by... I saw this thread as I was fixing Unicode problems in the NetBeans IDE.)
@everything411 Are you able to try with our latest 17.0.5 build? As @eirikbakke mentions, the upstream issue seems to be fixed.
@karianna I can confirm that all bugs I reported here no longer exist now.