rack-plugin-toolchain

Cross-compiled plugins crash Rack on Windows platform

Open cschol opened this issue 9 months ago • 9 comments

Users report that cross-compiled plugins crash Rack on Windows with the latest toolchain updates.

Update: Confirmed on my Windows test system. Investigating.

The MinGW compiler version was updated from 13.1.0 to 14.2.0 in the latest toolchain update.

Backtrace from user test:

#0  0x00007fffe3a3730f in ntdll!RtlRegisterSecureMemoryCacheCallback () from C:\WINDOWS\SYSTEM32\ntdll.dll
#1  0x00007fffe3a02197 in ntdll!EtwLogTraceEvent () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2  0x00007fffe3a35678 in ntdll!RtlRegisterSecureMemoryCacheCallback () from C:\WINDOWS\SYSTEM32\ntdll.dll
#3  0x00007fffe39ed126 in ntdll!EtwLogTraceEvent () from C:\WINDOWS\SYSTEM32\ntdll.dll
#4  0x00007fffe396c334 in ntdll!RtlGetCurrentServiceSessionId () from C:\WINDOWS\SYSTEM32\ntdll.dll
#5  0x00007fffe396b001 in ntdll!RtlFreeHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll
#6  0x00007fffe0ef364b in ucrtbase!_free_base () from C:\WINDOWS\System32\ucrtbase.dll
#7  0x00007fff594ad44d in plugin!_ZN11ScopeWidgetC1EP5Scope () from D:\msys64\home\Bloodbat\rack\plugins\Fundamental\plugin.dll
#8  0x00007fff595234a4 in plugin!_ZZN4rack11createModelI5Scope11ScopeWidgetEEPNS_6plugin5ModelENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEN6TModel18createModuleWidgetEPNS_6engine6ModuleE () from D:\msys64\home\rack\plugins\Fundamental\plugin.dll
#9  0x00007fff4cc2018b in rack::app::RackWidget::fromJson (this=0x27b9b50, rootJ=rootJ@entry=0x2833ba0) at src/app/RackWidget.cpp:376
#10 0x00007fff4cbceac4 in rack::patch::Manager::fromJson (this=this@entry=0x27c6460, rootJ=rootJ@entry=0x2833ba0) at src/patch.cpp:554
#11 0x00007fff4cbcf308 in rack::patch::Manager::loadAutosave (this=this@entry=0x27c6460) at src/patch.cpp:381
#12 0x00007fff4cbd1ae0 in rack::patch::Manager::launch (this=0x27c6460, pathArg=...) at src/patch.cpp:79
#13 0x00007ff624372d68 in main (argc=2, argv=0x14fd80) at adapters/standalone.cpp:259
Continuing.
warning: HEAP[Rack.exe]: 
warning: Invalid address specified to RtlFreeHeap( 0000000000580000, 000000000A8D7E50 )

Backtrace from log from my test:

[6.937 fatal adapters/standalone.cpp:49 fatalSignalHandler] Fatal signal 11. Stack trace:
34:  0x0
33:  0x0
32: _C_specific_handler 0x7ffef72c7f60
31: _chkstk 0x7ffef8f127a0
30: RtlRaiseException 0x7ffef8ec20d0
29: KiUserExceptionDispatcher 0x7ffef8f113a0
28: memset 0x7ffef8f14600
27: RtlFreeHeap 0x7ffef8e94760
26: free_base 0x7ffef651f040
25: ZN9VCOWidgetC1EP3VCO 0x7ffebe4a9380
24: ZZN4rack11createModelI3VCO9VCOWidgetEEPNS_6plugin5ModelENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEN6TModel18createModuleWidgetEPNS_6engine6ModuleE 0x7ffebe4bf880
23: ZN4rack3app7browser8ModelBox4drawERKNS_6widget6Widget8DrawArgsE 0x7ffebc8bd0c0
22: ZN4rack6widget6Widget9drawChildEPS1_RKNS1_8DrawArgsEi 0x7ffebc26bc80
21: ZN4rack6widget6Widget4drawERKNS1_8DrawArgsE 0x7ffebc26bdf0
20: ZN4rack6widget6Widget9drawChildEPS1_RKNS1_8DrawArgsEi 0x7ffebc26bc80
19: ZN4rack6widget6Widget4drawERKNS1_8DrawArgsE 0x7ffebc26bdf0
18: ZN4rack6widget6Widget9drawChildEPS1_RKNS1_8DrawArgsEi 0x7ffebc26bc80
17: ZN4rack6widget6Widget4drawERKNS1_8DrawArgsE 0x7ffebc26bdf0
16: ZN4rack6widget6Widget9drawChildEPS1_RKNS1_8DrawArgsEi 0x7ffebc26bc80
15: ZN4rack6widget6Widget4drawERKNS1_8DrawArgsE 0x7ffebc26bdf0
14: ZN4rack2ui12ScrollWidget4drawERKNS_6widget6Widget8DrawArgsE 0x7ffebc2625a0
13: ZN4rack6widget6Widget9drawChildEPS1_RKNS1_8DrawArgsEi 0x7ffebc26bc80
12: ZN4rack6widget6Widget4drawERKNS1_8DrawArgsE 0x7ffebc26bdf0
11: ZN4rack6widget6Widget9drawChildEPS1_RKNS1_8DrawArgsEi 0x7ffebc26bc80
10: ZN4rack6widget6Widget4drawERKNS1_8DrawArgsE 0x7ffebc26bdf0
9: ZN4rack6widget6Widget9drawChildEPS1_RKNS1_8DrawArgsEi 0x7ffebc26bc80
8: ZN4rack6widget6Widget4drawERKNS1_8DrawArgsE 0x7ffebc26bdf0
7: ZN4rack6window6Window4stepEv 0x7ffebc26e770
6: ZN4rack6window6Window3runEv 0x7ffebc26f1a0
5: ZN4rack6window6Window3runEv 0x7ffebc26f1a0
4: ZN4rack6window6Window3runEv 0x7ffebc26f1a0
3: ZN4rack6window6Window3runEv 0x7ffebc26f1a0
2: ZN4rack6window6Window3runEv 0x7ffebc26f1a0
1: BaseThreadInitThunk 0x7ffef8a07360
0: RtlUserThreadStart 0x7ffef8ebcc70

cschol avatar Mar 30 '25 21:03 cschol

If you don’t mind a total guess:

  • the browser shows module widgets with a null module
  • I wonder if a code path calls delete on null (which is valid)
  • and GCC 14 on Windows has a runtime bug or missing feature

But to emphasize this is just a total guess.
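
To illustrate the "delete null is valid" point: in standard C++, both delete on a null pointer and std::free(nullptr) are defined to do nothing, so a conforming runtime should not fault on either. The sketch below uses a hypothetical Widget type that is not a Rack class.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in type for illustration; not taken from Rack.
struct Widget {
    int value = 0;
};

// Exercises both null-release paths and returns true if they completed.
bool release_null_pointers() {
    Widget* widget = nullptr;
    delete widget;               // deleting a null pointer is a no-op
    assert(widget == nullptr);   // the pointer itself is left unchanged
    std::free(nullptr);          // free(NULL) is likewise defined to do nothing
    return true;
}
```

Calling release_null_pointers() should simply return true on any conforming toolchain. A crash here would point at the runtime rather than at plugin code.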

baconpaul avatar Mar 31 '25 16:03 baconpaul

* and gcc 14 on windows has a runtime bug or missing feature

I compile my local Windows builds with GCC 14 (14.2.0 to be precise; but I've upgraded with every release from 14.1.0, the first in the series from MSYS, I believe) and I have never had that problem, until I tried it with the toolchain.

Bloodbat avatar Mar 31 '25 16:03 Bloodbat

Ok! So GCC 14 on Windows, building for Windows, works.

From reading your stack trace and cschol's, my first thought was that the runtime can’t validate the memory being deleted, hence my guess.

Does your GCC 14 build use the same libc as the cross build in the toolchain? That’s my next thought.

Again, these are total guesses. If it’s more useful for me to shut up, just tell me!

baconpaul avatar Mar 31 '25 21:03 baconpaul

I can't say for sure, but it's unlikely: my local builds use MSYS's February MinGW; the toolchain uses a January commit of crosstool-ng.

Maybe... just maybe, updating crosstool-ng to a stable version will fix the problem... I'll give it a shot and report.

Bloodbat avatar Mar 31 '25 22:03 Bloodbat

Maybe... just maybe, updating crosstool-ng to a stable version will fix the problem... I'll give it a shot and report.

I updated to the recently released 1.27 version of crosstool-ng yesterday, and unfortunately it did not fix the issue.

cschol avatar Mar 31 '25 23:03 cschol

I don't know if this is helpful or useful, but the previous crosstool-ng version, 1.26.0, with GCC 13.2, builds a working plugin.

Bloodbat avatar Apr 01 '25 08:04 Bloodbat

This is likely due to Rack creating a std::string, giving it to a plugin (when a plugin calls asset::plugin() or Module::getPatchStorageDirectory() or something), and the plugin deleting it with a different libstdc++ implementation.

Plugins are compiled with -static-libstdc++ on Windows, so I could just remove this flag, which would make plugins link to Rack's libstdc++. This is not ideal, because you can't in general compile for one libstdc++ version and then link to a different one. But you also can't in general pass C++ objects (such as std::string) between different libstdc++ versions. So there is no perfect solution for allowing different libstdc++ ABIs.

Rack 3 will solve this problem by only exposing a C ABI, so plugins will work for decades. But to solve this issue for Rack 2, I think I'll make newly compiled plugins dynamically link to Rack's libstdc++. I've tested this and it seems to build plugins that work fine with the last few Rack production builds.
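
As a rough sketch of why a C ABI avoids this class of problem: if only plain C types cross the boundary, each side allocates and frees its own std::string with its own libstdc++. The function names and the placeholder path below are hypothetical and are not Rack 2 or Rack 3 API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical host export (NOT a real Rack function). The std::string never
// leaves the host; the caller receives a copy in a buffer it owns.
extern "C" std::size_t host_copy_plugin_dir(char* out, std::size_t capacity) {
    static const std::string dir = "plugins/Example";  // placeholder value
    if (out != nullptr && capacity > 0) {
        // Copy as much as fits, always leaving room for the terminator.
        std::size_t n = dir.size() < capacity - 1 ? dir.size() : capacity - 1;
        std::memcpy(out, dir.data(), n);
        out[n] = '\0';
    }
    return dir.size();  // required length, excluding the terminator
}

// Plugin-side wrapper: the plugin constructs and later destroys its own
// std::string, so no heap memory changes owner across the module boundary.
std::string plugin_dir() {
    std::size_t needed = host_copy_plugin_dir(nullptr, 0);  // query the size
    std::string result(needed + 1, '\0');
    std::size_t written = host_copy_plugin_dir(&result[0], result.size());
    assert(written == needed);
    result.resize(needed);  // drop the terminator slot
    return result;
}
```

The two-call pattern (query the size, then fill a caller-owned buffer) is one common way to do this. Another is for the host to also export a matching free function, so memory is always released by the module that allocated it.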

AndrewBelt avatar Apr 03 '25 04:04 AndrewBelt

Thank you for the insights :)

Looking forward to the Rack 3 implementation.

In the meantime, would the removal of the -static-libstdc++ flag be transparent for developers and, most importantly, users? Should developers take special provisions to accommodate the change?

Bloodbat avatar Apr 03 '25 06:04 Bloodbat

But to solve this issue for Rack 2, I think I'll make newly compiled plugins dynamically link to Rack's libstdc++. I've tested this and it seems to build plugins that work fine with the last few Rack production builds.

I've tested this with my plugin by filtering -static-libstdc++ from LDFLAGS, and it fails to load in the official release build of Rack 2.6.3 from the website. I tested it both with local builds (which worked with the self-built Rack they were built against) and with the toolchain via CI; both failed to load with code 127:

[0.234 info src/plugin.cpp:133 loadPlugin] Loading plugin from C:/Users/Chronos/AppData/Local/Rack2/plugins-win-x64/OuroborosModules
[0.235 warn src/plugin.cpp:203 loadPlugin] Could not load plugin C:/Users/Chronos/AppData/Local/Rack2/plugins-win-x64/OuroborosModules: Failed to load library C:/Users/Chronos/AppData/Local/Rack2/plugins-win-x64/OuroborosModules/plugin.dll: code 127

(For the record, I hacked together a CMake-less Makefile for my plugin to be sure the load failure wasn't caused by the non-standard compilation method)
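
Assuming the "code 127" in the log is the raw Win32 GetLastError() value from the failed library load (an assumption about how Rack reports it), it corresponds to ERROR_PROC_NOT_FOUND: the DLL was found, but a function it imports is missing from one of its dependencies. A small lookup helper for the codes most often seen in this situation, with a hypothetical function name:

```cpp
#include <cassert>
#include <string>

// Maps a few well-known Win32 library-load error codes to a short explanation.
// The numeric values are the documented Win32 system error codes.
std::string describe_load_error(int code) {
    assert(code >= 0);  // Win32 error codes are unsigned (DWORD) values
    switch (code) {
        case 126:
            return "ERROR_MOD_NOT_FOUND: the DLL or one of its dependencies is missing";
        case 127:
            return "ERROR_PROC_NOT_FOUND: an imported function is missing from a dependency";
        case 193:
            return "ERROR_BAD_EXE_FORMAT: architecture mismatch or invalid image";
        default:
            return "unrecognized code " + std::to_string(code);
    }
}
```

For example, describe_load_error(127) returns the ERROR_PROC_NOT_FOUND text, which fits a plugin that now imports libstdc++ symbols dynamically.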

Doom2fan avatar Apr 12 '25 22:04 Doom2fan

@Doom2fan Can you post your plugin source code that fails to load when the -static-libstdc++ flag is removed?

AndrewBelt avatar Apr 16 '25 04:04 AndrewBelt

@Doom2fan Can you post your plugin source code that fails to load when the -static-libstdc++ flag is removed?

https://github.com/Doom2fan/OuroborosModules/tree/MakeOnly This is the plugin, using a plain Makefile. Adding

LDFLAGS := $(filter-out -static-libstdc++,$(LDFLAGS))

at the end makes the builds not load in the official 2.6.3 build for me.

Doom2fan avatar Apr 16 '25 05:04 Doom2fan

Curiously, I can't reproduce it with local builds anymore, only the toolchain ones... The original report here is for the toolchain, though, so maybe reproducing it locally was a fluke. I've got a CMake-based toolchain build without -static-libstdc++ here: https://github.com/Doom2fan/OuroborosModules/actions/runs/14413980936 And a Makefile-based one here: https://github.com/Doom2fan/OuroborosModules/actions/runs/14485426368

For completeness, I got a log from WinDbg with loader traces from the latter build: (The error is identical between the two, though)

23b0:27c0 @ 144198312 - LdrLoadDll - ENTER: DLL name: C:/Users/Chronos/AppData/Local/Rack2/plugins-win-x64/OuroborosModules/plugin.dll
23b0:27c0 @ 144198312 - LdrpLoadDllInternal - ENTER: DLL name: C:\Users\Chronos\AppData\Local\Rack2\plugins-win-x64\OuroborosModules\plugin.dll
23b0:27c0 @ 144198312 - LdrpResolveDllName - ENTER: DLL name: C:\Users\Chronos\AppData\Local\Rack2\plugins-win-x64\OuroborosModules\plugin.dll
23b0:27c0 @ 144198312 - LdrpResolveDllName - RETURN: Status: 0x00000000
23b0:27c0 @ 144198312 - LdrpMinimalMapModule - ENTER: DLL name: C:\Users\Chronos\AppData\Local\Rack2\plugins-win-x64\OuroborosModules\plugin.dll
ModLoad: 00007ff8`e58d0000 00007ff8`e5a45000   C:\Users\Chronos\AppData\Local\Rack2\plugins-win-x64\OuroborosModules\plugin.dll
23b0:27c0 @ 144198312 - LdrpMinimalMapModule - RETURN: Status: 0x00000000
23b0:27c0 @ 144198312 - LdrpFindDllActivationContext - INFO: Probing for the manifest of DLL "C:\Users\Chronos\AppData\Local\Rack2\plugins-win-x64\OuroborosModules\plugin.dll" failed with status 0xc0000089
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-convert-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-environment-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-filesystem-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-heap-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-locale-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-math-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-private-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-runtime-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-stdio-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-string-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpPreprocessDllName - INFO: DLL api-ms-win-crt-time-l1-1-0.dll was redirected to C:\Windows\SYSTEM32\ucrtbase.dll by API set
23b0:27c0 @ 144198312 - LdrpNameToOrdinal - WARNING: Procedure "_ZSt21ios_base_library_initv" could not be located in DLL at base 0x00007FF973950000.
23b0:27c0 @ 144198312 - LdrpReportError - ERROR: Locating export "_ZSt21ios_base_library_initv" for DLL "C:\Users\Chronos\AppData\Local\Rack2\plugins-win-x64\OuroborosModules\plugin.dll" failed with status: 0xc0000139.
(23b0.27c0): Unknown exception - code c0000139 (first chance)
23b0:27c0 @ 144198328 - LdrpGenericExceptionFilter - ERROR: Function LdrpSnapModule raised exception 0xc0000139
    Exception record: .exr 000000D2F0B0E370
    Context record: .cxr 000000D2F0B0DE80
23b0:27c0 @ 144198328 - LdrpProcessWork - ERROR: Unable to load DLL: "C:\Users\Chronos\AppData\Local\Rack2\plugins-win-x64\OuroborosModules\plugin.dll", Parent Module: "(null)", Status: 0xc0000139
23b0:27c0 @ 144198328 - LdrpLoadDllInternal - RETURN: Status: 0xc0000139
23b0:27c0 @ 144198328 - LdrLoadDll - RETURN: Status: 0xc0000139
(23b0.27c0): Unknown exception - code 20474343 (first chance)
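
The status 0xc0000139 in the trace is STATUS_ENTRYPOINT_NOT_FOUND, and the export it could not locate is a mangled C++ name. A minimal sketch for decoding such names with the abi::__cxa_demangle function that GCC and Clang provide in <cxxabi.h>; the demangle wrapper itself is a hypothetical helper:

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangles an Itanium C++ ABI symbol name.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // __cxa_demangle allocates the buffer with malloc
    return result;
}
```

With this helper, demangle("_ZSt21ios_base_library_initv") yields std::ios_base_library_init(), so the plugin imports a libstdc++ function that the libstdc++-6.dll loaded by Rack does not export.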

Doom2fan avatar Apr 16 '25 05:04 Doom2fan

I built a version of my test plugin for Windows without the -static-libstdc++ flag using Docker and, while Rack no longer crashes when opening the browser, the test plugin fails to load with code 127:

[0.064 info src/plugin.cpp:133 loadPlugin] Loading plugin from D:/msys64/home/Bloodbat/rack/plugins/SanguineTest
[0.065 warn src/plugin.cpp:203 loadPlugin] Could not load plugin D:/msys64/home/Bloodbat/rack/plugins/SanguineTest: Failed to load library D:/msys64/home/Bloodbat/rack/plugins/SanguineTest/plugin.dll: code 127

This happens both with the official Rack 2.6.3 build and my local one.

Bloodbat avatar Apr 16 '25 07:04 Bloodbat

I've noticed that the toolchain's local/x86_64-w64-mingw32/x86_64-w64-mingw32/sysroot/lib/libstdc++-6.dll (20MB) has a bunch of std::ios* functions like ios_base_library_init(), but none of them are present in libstdc++-6.dll from https://packages.msys2.org/packages/mingw-w64-x86_64-gcc-libs (2.3MB) which is bundled with Rack. Perhaps all I need to do is include a more complete libstdc++.

AndrewBelt avatar Apr 16 '25 09:04 AndrewBelt

If I'm understanding the GCC libstdc++ documentation correctly, the issue might be one of backwards incompatibility. From what I understood, programs and libraries built against an earlier version of libstdc++ should be ABI-compatible with a newer one as long as the major version is the same, though not vice versa. So theoretically, updating Rack itself to the newer version would fix it... But I imagine doing so might instead cause problems with plugins compiled for the older library with the different std::string implementation that caused the original issue reported here.

Doom2fan avatar Apr 16 '25 16:04 Doom2fan

My local build is compiled using GCC 14.2.0-3; looking at the official build's strings, Rack is built using GCC 14.2.0-2. Toolchain-built plugins crash both; plugins built using my local MSYS GCC 14.2.0-3 crash neither (nor did they crash the old 2.5.2). The toolchain uses its own crosstool-NG-built GCC, and I can't shake the feeling that the problem lies with crosstool-NG.

As I mentioned above, downgrading crosstool-NG to the one with GCC 13.2 produces plugins that crash neither my local build nor VCV's official one.

Bloodbat avatar Apr 16 '25 17:04 Bloodbat

I believe this is now fixed with https://github.com/VCVRack/rack-plugin-toolchain/commit/811a44c4af4621ddd339d60dd3ae8da9d5cb555d by downgrading mingw-w64 to v10. Several ABI changes were made in v12, and possibly v11. https://www.mingw-w64.org/changelog/

I am also reverting the change from statically linking libstdc++ to shared on Windows. I think as long as the libstdc++ ABI doesn't change (which is unlikely for C++11 implementations), two different libstdc++ versions can pass objects (such as std::string and std::map) between each other with no problem. So I think Rack 2 will permanently use mingw-w64 v10. Rack 3 will likely release in 1-2 years, and since it will use a C ABI, plugin developers will be able to statically link whatever libstdc++ they want and use whatever C/C++/Rust/Go/etc compiler version they want, as long as it supports the cdecl calling convention and meets a minimum recommended OS version.

I'll rebuild and test several plugins today.

AndrewBelt avatar Apr 17 '25 22:04 AndrewBelt

I just tested building a plugin with the fix, and I'm happy to report Rack did not crash :)

Thanks :)

Bloodbat avatar Apr 18 '25 01:04 Bloodbat

Toolchain appears to be fixed with mingw-w64 10.0.0. Plugins do not crash or fail to load on Windows.

AndrewBelt avatar Apr 18 '25 19:04 AndrewBelt