New WASI thread doesn't start running, when main thread waits
I'm trying to figure out how to start multiple threads and syncronize them with a mutex that uses atomics and wait/notify.
I made a minimal example that does not work:
Main thread spawns a new thread, then waits for u32 at offset 0 to change from 0 to something else.
New thread executes unreachable, to see if it runs at all.
Currently, the wasmtime process runs infinitely, and doesn't reach unreachable. Its memory consumption is always increasing.
;; wasmtime --wasi threads=y wasi_threads_small.wat
(module $wasi_threads_small
(type (;0;) (func (param i32) (result i32)))
(type (;1;) (func))
(type (;2;) (func (param i32 i32)))
(import "wasi" "thread-spawn" (func $thread_spawn (type 0)))
(func $_start (type 1)
i32.const 0
call $thread_spawn
drop
i32.const 0
i32.const 0
i64.const -1
memory.atomic.wait32
unreachable)
(func $wasi_thread_start (type 2) (param i32 i32)
unreachable ;; this is not getting triggered
)
(memory (;0;) 17 1024 shared)
(export "_start" (func $_start))
(export "wasi_thread_start" (func $wasi_thread_start))
(start $_start))
I'm testing on Windows 10, with latest version wasmtime-cli 21.0.1 (cedf9aa0f 2024-05-22)
The issue is the (start $_start) you wrote. This will cause the _start function to be called every time the module is initialized before anything else runs. wasi-threads initializes a completely new instance of the module for every spawned thread. Sharing nothing but the linear memory it imports. As such every thread will try to spawn a new thread and wait indefinitively before wasi_thread_start can be reached. Eventually this will cause you to run out of memory. The solution is to remove the (start $_start). Wasi will already call an exported function called _start on the main thread and the main thread only.
Thanks, that was important bit of information.
The rest of my problems seem to come from lack of __stack_pointer setup in the new thread when using D compiler. I need to replicate this code in my executable somehow: https://github.com/WebAssembly/wasi-threads/blob/main/test/testsuite/wasi_thread_spawn.S. Is clang required to compile this .s file?
Am I correct that stack grows to lower addresses and that globals are instanced per thread?
Is clang required to compile this .s file?
Yes, this file uses uses a syntax which only clang uses.
Am I correct that stack grows to lower addresses
Yes, with LLVM the stack grows down. If you are writing your own language which compiles to wasm which doesn't need to interoperate with C, you are free to make the stack grow up if you want though.
and that globals are instanced per thread?
Yes, wasm globals are instanced per thread too.
I have progress:
- Installed fresh clang from https://github.com/llvm/llvm-project/releases/tag/llvmorg-18.1.6 via
LLVM-18.1.6-win64.exe - Compiled the asm file with
clang wasi_thread_spawn.S -c --target=wasm32-wasi -o wasi_thread_spawn.wasm - To link I pass
wasi_thread_spawn.wasmas a linker argument via-Lwasi_thread_spawn.wasm - Now I can pass new stack pointer to
thread-spawn, and thewasi_thread_startfunction sets the__stack_pointer - After that
__wasi_thread_start_Cis called, which is defined in the D code.
Here where the next problem is:
- Max memory size is set to 64MiB with
-L--max-memory=67108864 - Memory looks like
(memory (;0;) 17 1024 shared) - Initial memory size is
0x110000 - I increase it by 1MiB + 1 page to
0x220000in the main thread - New thread is spawned, stack pointer is set to
0x210000 - Invalid memory acces in the new thread occurs:
Error: error while executing at wasm backtrace:
0: 0x2ad0 - testsuite.wasm!__wasi_thread_start_C
1: 0x2b6e - testsuite.wasm!wasi_thread_start
Caused by:
0: memory fault at wasm address 0x20fff0 in linear memory of size 0x110000
1: wasm trap: out of bounds memory access
Which makes me think that new thread is not seeing increased memory size for some reason.
I figured it out:
- First I tried an example program once more
- I found that the difference was in the memory definition, memory is both imported and exported at the same time in the example:
(memory (export "memory") (import "foo" "bar") 1 1 shared) - My code only had export, no import:
(memory (;0;) 34 1024 shared) (export "memory" (memory 0)) - So I passed
--import-memorywasm-ldlinker flag to import memory. - Passing
--import-memorydisables exporting, so I had to pass--export-memoryexplicitly too. - Now my program contains:
(import "env" "memory" (memory (;0;) 34 1024 shared)) (export "memory" (memory 0))
Now everything works as expected. I didn't find where documentation mentions the neccesity of marking memory as imported in addition to exporting and sharing.
I didn't find where documentation mentions the neccesity of marking memory as imported in addition to exporting and sharing.
It is documented that imported objects (including memories) will be shared between threads: https://github.com/WebAssembly/wasi-threads/blob/6b4e2e50a3929d9ebc3b1a54e36ab6f0c0ebc677/README.md?plain=1#L141-L142 It is not explicitly stated that you have to import the memory to behave the way most people would expect.
Was able to remove this extra .wasm file from compilation by using inline assembly:
// When wasi::thread-spawn is called, the WASM runtime calls wasi_thread_start in a new thread
// It sets __stack_pointer of a newly spawned thread and calls __wasi_thread_start_user
// It is important to mark this as naked, otherwise compiler may insert
// a read from __stack_pointer before it gets initialized
@naked export extern(C) void wasi_thread_start(i32 tid, Thread* thread) {
import ldc.llvmasm;
__asm("
local.get 1 # Thread* thread
i32.load 0#offset # thread.stackPtr
global.set __stack_pointer # __stack_pointer = thread.stackPtr
local.get 0 # tid
local.get 1 # start_arg
call __wasi_thread_start_user
return", "");
// compiler adds unreachable after call
// add return to avoid hitting it
}
Other than that piece of code I needed to add 5 extra flags in total to make threads work in D: Compiler flags:
-
-mattr=+bulk-memory,+atomics
Linker flags:
-
--shared-memory -
--import-memory -
--export-memory -
--max-memory=67108864
I will close this issue, since everything works. Thanks for your help.