wasmtime icon indicating copy to clipboard operation
wasmtime copied to clipboard

New WASI thread doesn't start running, when main thread waits

Open MrSmith33 opened this issue 1 year ago • 6 comments

I'm trying to figure out how to start multiple threads and syncronize them with a mutex that uses atomics and wait/notify.

I made a minimal example that does not work: Main thread spawns a new thread, then waits for u32 at offset 0 to change from 0 to something else.
New thread executes unreachable, to see if it runs at all.
Currently, the wasmtime process runs infinitely, and doesn't reach unreachable. Its memory consumption is always increasing.

;; wasmtime --wasi threads=y wasi_threads_small.wat
(module $wasi_threads_small
  (type (;0;) (func (param i32) (result i32)))
  (type (;1;) (func))
  (type (;2;) (func (param i32 i32)))
  (import "wasi" "thread-spawn" (func $thread_spawn (type 0)))
  (func $_start (type 1)
    i32.const 0
    call $thread_spawn
    drop

    i32.const 0
    i32.const 0
    i64.const -1
    memory.atomic.wait32

    unreachable)
  (func $wasi_thread_start (type 2) (param i32 i32)
    unreachable ;; this is not getting triggered
  )
  (memory (;0;) 17 1024 shared)
  (export "_start" (func $_start))
  (export "wasi_thread_start" (func $wasi_thread_start))
  (start $_start))

I'm testing on Windows 10, with latest version wasmtime-cli 21.0.1 (cedf9aa0f 2024-05-22)

MrSmith33 avatar May 28 '24 16:05 MrSmith33

The issue is the (start $_start) you wrote. This will cause the _start function to be called every time the module is initialized before anything else runs. wasi-threads initializes a completely new instance of the module for every spawned thread. Sharing nothing but the linear memory it imports. As such every thread will try to spawn a new thread and wait indefinitively before wasi_thread_start can be reached. Eventually this will cause you to run out of memory. The solution is to remove the (start $_start). Wasi will already call an exported function called _start on the main thread and the main thread only.

bjorn3 avatar May 28 '24 16:05 bjorn3

Thanks, that was important bit of information.

The rest of my problems seem to come from lack of __stack_pointer setup in the new thread when using D compiler. I need to replicate this code in my executable somehow: https://github.com/WebAssembly/wasi-threads/blob/main/test/testsuite/wasi_thread_spawn.S. Is clang required to compile this .s file?

Am I correct that stack grows to lower addresses and that globals are instanced per thread?

MrSmith33 avatar May 28 '24 19:05 MrSmith33

Is clang required to compile this .s file?

Yes, this file uses uses a syntax which only clang uses.

Am I correct that stack grows to lower addresses

Yes, with LLVM the stack grows down. If you are writing your own language which compiles to wasm which doesn't need to interoperate with C, you are free to make the stack grow up if you want though.

and that globals are instanced per thread?

Yes, wasm globals are instanced per thread too.

bjorn3 avatar May 28 '24 19:05 bjorn3

I have progress:

  • Installed fresh clang from https://github.com/llvm/llvm-project/releases/tag/llvmorg-18.1.6 via LLVM-18.1.6-win64.exe
  • Compiled the asm file with clang wasi_thread_spawn.S -c --target=wasm32-wasi -o wasi_thread_spawn.wasm
  • To link I pass wasi_thread_spawn.wasm as a linker argument via -Lwasi_thread_spawn.wasm
  • Now I can pass new stack pointer to thread-spawn, and the wasi_thread_start function sets the __stack_pointer
  • After that __wasi_thread_start_C is called, which is defined in the D code.

Here where the next problem is:

  • Max memory size is set to 64MiB with -L--max-memory=67108864
  • Memory looks like (memory (;0;) 17 1024 shared)
  • Initial memory size is 0x110000
  • I increase it by 1MiB + 1 page to 0x220000 in the main thread
  • New thread is spawned, stack pointer is set to 0x210000
  • Invalid memory acces in the new thread occurs:
Error: error while executing at wasm backtrace:
    0: 0x2ad0 - testsuite.wasm!__wasi_thread_start_C
    1: 0x2b6e - testsuite.wasm!wasi_thread_start
Caused by:
    0: memory fault at wasm address 0x20fff0 in linear memory of size 0x110000
    1: wasm trap: out of bounds memory access

Which makes me think that new thread is not seeing increased memory size for some reason.

MrSmith33 avatar May 29 '24 11:05 MrSmith33

I figured it out:

  • First I tried an example program once more
  • I found that the difference was in the memory definition, memory is both imported and exported at the same time in the example:
    (memory (export "memory") (import "foo" "bar") 1 1 shared)
    
  • My code only had export, no import:
    (memory (;0;) 34 1024 shared)
    (export "memory" (memory 0))
    
  • So I passed --import-memory wasm-ld linker flag to import memory.
  • Passing --import-memory disables exporting, so I had to pass --export-memory explicitly too.
  • Now my program contains:
    (import "env" "memory" (memory (;0;) 34 1024 shared))
    (export "memory" (memory 0))
    

Now everything works as expected. I didn't find where documentation mentions the neccesity of marking memory as imported in addition to exporting and sharing.

MrSmith33 avatar May 29 '24 17:05 MrSmith33

I didn't find where documentation mentions the neccesity of marking memory as imported in addition to exporting and sharing.

It is documented that imported objects (including memories) will be shared between threads: https://github.com/WebAssembly/wasi-threads/blob/6b4e2e50a3929d9ebc3b1a54e36ab6f0c0ebc677/README.md?plain=1#L141-L142 It is not explicitly stated that you have to import the memory to behave the way most people would expect.

bjorn3 avatar May 29 '24 18:05 bjorn3

Was able to remove this extra .wasm file from compilation by using inline assembly:

// When wasi::thread-spawn is called, the WASM runtime calls wasi_thread_start in a new thread
// It sets __stack_pointer of a newly spawned thread and calls __wasi_thread_start_user
// It is important to mark this as naked, otherwise compiler may insert
// a read from __stack_pointer before it gets initialized
@naked export extern(C) void wasi_thread_start(i32 tid, Thread* thread) {
	import ldc.llvmasm;
	__asm("
		local.get 1                # Thread* thread
		i32.load  0#offset         # thread.stackPtr
		global.set __stack_pointer # __stack_pointer = thread.stackPtr
		local.get 0                # tid
		local.get 1                # start_arg
		call __wasi_thread_start_user
		return", "");
		// compiler adds unreachable after call
		// add return to avoid hitting it
}

Other than that piece of code I needed to add 5 extra flags in total to make threads work in D: Compiler flags:

  • -mattr=+bulk-memory,+atomics

Linker flags:

  • --shared-memory
  • --import-memory
  • --export-memory
  • --max-memory=67108864

I will close this issue, since everything works. Thanks for your help.

MrSmith33 avatar May 31 '24 16:05 MrSmith33