Support WebAssembly Reference Types

Open leroycep opened this issue 4 years ago • 11 comments

WebAssembly Reference Types are supported in most WebAssembly runtimes at the moment, and they make it easier to interoperate with the host runtime.

On the Discord: https://discord.com/channels/605571803288698900/

Stephen Solka#3548 Does zig's wasm target support reference objects? https://github.com/WebAssembly/reference-types/blob/master/proposals/reference-types/Overview.md I tried to figure it out by searching the code base for the code to declare these types externref. I hit this commit that upstreamed this external linker https://github.com/ziglang/zig/commit/f56ae69edd8c96a5f6525f20bf0a22704a826f00 landing in 0.9.0. It's not clear if this is exposed at the language level to be used by people using zig for wasm. I'm trying to figure out the "right way" to pass JS objects to zig wasm.

Stephen Solka#3548 This is rust's bindgen ref types implementation https://rustwasm.github.io/wasm-bindgen/reference/reference-types.html

Later in the thread: https://discord.com/channels/605571803288698900/922695973623443466/927281011618873407

@Luukdegram Hmm, I'm afraid there's no such thing yet. A lot of the stuff is currently in my head, as I have to implement it for the wasm backend anyway. As LLVM does support this, we could support this once the llvm backend of the selfhosted compiler is finished, which is targeted for 0.10.0. We will have to implement the wasm-specific address spaces though, so that will probably be after 0.10.0.

A roadmap in general is probably a good idea, but the selfhosted compiler is in such a high-speed development stage right now, I'd prefer to wait a bit until we have a more solid base. I'll add this point to my personal TODO 😉

As mentioned by Luuk, this feature will need address spaces to be implemented (see #653).

For now, we can pass references as integer handles that index into an array or a hashmap on the host side.
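As a rough illustration of the integer-handle approach, the host side might look like the sketch below. All names here (`handles`, `toHandle`, `fromHandle`, `makeImports`) are made up for this example, and entries are never removed, so it leaks by design:

```javascript
// Hypothetical host-side handle table; none of these names come from Zig or any library.
const handles = new Map();
let nextHandle = 1; // 0 is kept free so it can mean "no object"

// Store a host object and return the integer that wasm code will carry around.
function toHandle(obj) {
  const h = nextHandle++;
  handles.set(h, obj);
  return h;
}

// Resolve an integer coming back from wasm into the host object it stands for.
function fromHandle(h) {
  if (!handles.has(h)) throw new RangeError(`invalid handle ${h}`);
  return handles.get(h);
}

// Build the import object around anything with a WebGL-style createShader method.
function makeImports(gl) {
  return {
    gl: { createShader(type) { return toHandle(gl.createShader(type)); } },
    console: { log(h) { console.log(fromHandle(h)); } },
  };
}
```

On the Zig side, the import would then be declared with a plain integer instead of a reference, for example `extern "gl" fn createShader(t: u32) u32;`.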

leroycep avatar Jan 02 '22 22:01 leroycep

@leroycep @kubkon how do you expect this to look in Zig?

const Externref = *opaque{}; // like this?
const Externref = anytype; // or this?
// or another syntax?

I tried this and it's not working right now. Even though externref exists in wasm.zig, it doesn't exist in codegen.zig/air.zig/etc., so basically it's not used right now. However, it exists in LLVM, and adding reference types support looks pretty easy.

Here is a minimal example that should print a WebGLShader object in the browser console (right now it prints 0 if you run npm run build && npm start).

//gl.zig

pub const Externref = *opaque{};

pub const GLenum = u32;
pub const WebGLShader = Externref;
pub const DOMString = [*]const u8;
pub const FRAGMENT_SHADER: GLenum = 0x8B30; // used by main.zig
pub const VERTEX_SHADER: GLenum = 0x8B31;

pub extern "gl" fn createShader(t: GLenum) WebGLShader;

//console.zig

const gl = @import("gl.zig");

pub extern "console" fn log(_: gl.WebGLShader) void;
pub extern "console" fn logF(_: f32) void;
pub extern "console" fn logI(_: c_int) void;

//main.zig

const console = @import("console.zig");
const gl = @import("gl.zig");

export fn main() i32 {

  console.logI(123);
  console.log(gl.createShader(gl.FRAGMENT_SHADER));

  return 0;
}

<!--index.html-->

<!DOCTYPE html>
<html>
<head>
  <link rel="icon" href="data:;base64,iVBORw0KGgo=">
  <title>Test</title>
</head>
<body>
  <canvas id="c"></canvas>
  <script type="module" src="main.js"></script>
</body>
</html>

//main.js

const canvas = document.getElementById('c');
const gl = canvas.getContext('webgl');

const imports = {
  console: {
    log(r) { return console.log(r) },
    logI(i) { return console.log(i) }
  },
  gl: { createShader(t) { return gl.createShader(t); }}
}

WebAssembly.instantiateStreaming(fetch('../main.wasm'), imports).then(obj => {
  const wasm = obj.instance.exports;
  wasm.main();
})

//package.json

{
  "scripts": {
    "build": "zig build-lib main.zig -target wasm32-freestanding -dynamic -OReleaseSmall",
    "start": "npx servez"
  },
  "devDependencies": {
    "servez": "^1.12.1"
  }
}

munrocket avatar Feb 03 '22 17:02 munrocket

The plan is to use address spaces (issue #653) for WASM externrefs. That would look something like this:

// gl.zig

// `.webref` is just a random name I chose, not likely to be the actual thing
pub extern "gl" fn createShader(t: GLenum) *addrspace(.webref) WebGLShader;
const WebGLShader = opaque{
    pub extern "gl" fn shaderSource(this: *addrspace(.webref) WebGLShader, source: [*]const u8, sourceLen: usize) void;
};

// main.zig

const gl = @import("gl.zig");

const SHADER_SOURCE =
    \\ very clever fragment shader here
;

export fn main() i32 {
  const shader = gl.createShader(gl.FRAGMENT_SHADER);
  shader.shaderSource(SHADER_SOURCE.ptr, SHADER_SOURCE.len);

  return 0;
}

The address space proposal hasn't been finalized, as far as I can tell, so it will probably end up looking a bit different from this.

leroycep avatar Feb 04 '22 01:02 leroycep

Ah I see, you're right @leroycep. It seems that we need addrspace to store externref in global variables, but they are not supported right now (#4866). Anyway, maybe it's possible to store it in a table? Here are different examples with reference types.

global.wat table_get.wat table_set.wat
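For anyone who wants to see the table idea from the host side, here is a small sketch using the standard `WebAssembly.Table` constructor with `externref` elements. It assumes a runtime with reference types enabled, and the `shader` object is just a stand-in for a real host object:

```javascript
// A table whose elements are opaque host references rather than functions.
const table = new WebAssembly.Table({ element: "externref", initial: 2 });

// Stand-in for a real host object such as a WebGLShader.
const shader = { kind: "shader" };

// The host can place any JS value in a slot...
table.set(0, shader);

// ...and read back the very same reference, with no integer lookup in between.
const roundTrips = table.get(0) === shader;

// Tables can also grow at runtime, which a single externref global cannot do.
table.grow(1);
const lengthAfterGrow = table.length;
```

A module would reach the same slots with table.get and table.set, as in the .wat examples.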

WDYT?

munrocket avatar Feb 04 '22 15:02 munrocket

@Luukdegram what do you think?

Think of what, exactly? I see a lot of noise here, but no concrete idea of how you want to solve this. There's a lot to consider to fully support this use case:

  • Linking with C libraries - When building Zig code with Wasm as a target, you may want to link with existing C libraries. This means we must generate object files that support such use cases. Globals of type opaque{} (or anyopaque for that matter) will generate a symbol for the Data section. However, externref symbols belong in the table section. This means that we cannot re-use the exact same syntax for both use cases: for wasm the symbols are typed, and the linker will reject them when they resolve to incompatible types.
  • To generate the correct object files, we need to tell LLVM how to emit those. Currently, we do not tell LLVM at all that we want an externref, rather than a data symbol.
  • As mentioned above, we cannot re-use the same syntax, so a decision must be made on how we want to represent this use case using Zig's syntax. Some quick examples could be:
    • Using address spaces: extern var foo: *anyopaque (.externref);
    • When defining the library name as non-C such as: extern "MyWasmEnvironment" var foo = opaque{};

For the LLVM backend, we can then emit whatever it wants, and do our own thing in the wasm backend, as long as both generate the semantically correct behavior we want.

Don't get me wrong. I fully support this use case and would like to see this supported in Zig, but it's not as simple as you seem to portray. I don't think we should rush support for this and should carefully consider all options. Personally, it isn't high on my TODO list right now, as stage2 is far along and I'd like to support Wasm's MVP in the wasm backend before considering the additional proposals and features.

Also, note that I'm not part of the core team. While I can and will provide my input to the core team, I'm in no position to make a decision on this.

Luukdegram avatar Feb 04 '22 19:02 Luukdegram

Sorry, I just tried to fix it by myself (I was a little bit naive here) and also attached an example that somebody can use as a reference test for an implementation.

Use case: I want to make a web engine like three.js, which is why I need a fully compatible WebAPI for audio, graphics (including new backends), and mouse events. I will do it with a code generator that can be reused later for other APIs in other Zig projects. Linking with C libraries is not a top priority for me, because right now the ecosystem and tooling are more important. For example, we also need to manually create glue for fetch/setTimeout/requestAnimationFrame/performance.now().

So the reasons why I am considering Reference Types in zig:

  • with them, the JS glue becomes much smaller and the whole application faster, because each call will be almost a native call to the browser.
  • reference types are already supported by all browsers: https://webassembly.org/roadmap/

For those who implement the glue the old way, it will be about 6x more work, and the result will become legacy later.

I fully support this use case and would like to see this supported in Zig, but it's not as simple as you seem to portray.

@Luukdegram thank you for the detailed response, you are 100% right that I rushed here. But if someone creates an experimental version, even one with a memory leak, it will be helpful, because building the ecosystem is somewhat orthogonal work.

munrocket avatar Feb 06 '22 00:02 munrocket

In the meantime, the workaround is to pass an insecure i32 pointer, which is a lookup key in JS land.

Pyrolistical avatar Sep 16 '22 20:09 Pyrolistical

👍 and while not documented anywhere as a common practice (AFAICT), this is the way a lot of things do it, regardless of whether the host is JS or not. For example, if it is a "context" object, there would be a context ID as an i32, and the host makes sure this isn't actually mapped to memory, but rather to a lookup table. That way, if some code manipulates it unsafely, it fails to crash anything.

It is still insecure insofar as someone can possibly guess another session's ID if they are in the same module instance, but then again wasm modules are not safe for concurrent use, and removing a context (clearing the key and the memory) before adding one back to the pool can prevent leaks.
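A minimal sketch of that "clear before returning to the pool" idea, assuming a hypothetical `HandlePool` class on the host (the name and methods are invented for this example). Note that it does nothing about ID guessing: a reused ID resolves to whatever object currently occupies the slot.

```javascript
// Hypothetical pool of context slots; index 0 is never handed out so it can mean "none".
class HandlePool {
  constructor() {
    this.slots = [undefined]; // slot 0 is reserved
    this.free = [];           // IDs that were released and can be reused
  }

  // Store an object (anything except undefined) and return its ID.
  acquire(obj) {
    const id = this.free.length > 0 ? this.free.pop() : this.slots.length;
    this.slots[id] = obj;
    return id;
  }

  // Look up an ID; cleared or never-issued IDs are rejected instead of crashing later.
  get(id) {
    const obj = this.slots[id];
    if (obj === undefined) throw new RangeError(`stale or unknown handle ${id}`);
    return obj;
  }

  // Clear the slot first, then make the ID available again.
  release(id) {
    this.get(id); // throws on double release
    this.slots[id] = undefined;
    this.free.push(id);
  }
}
```

Because the slot is cleared on release, a stale ID cannot reach the old object, only a newer one that happens to reuse the slot.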

Take the above with a grain of salt, because I don't work in wasm security; these are just things I noticed in how things work outside JS.

codefromthecrypt avatar Sep 17 '22 01:09 codefromthecrypt

I've got a few questions about this ticket:

  1. Is address space fully implemented? It's parsed and fed into LLVM as far as I can tell. Considering that #653 is still open, I'm uncertain if it is complete.
  2. Is someone already working on it?
  3. Does my general battle plan seem correct?
    1. Rename std.wasm.Valtype to NumericType.
    2. Create std.wasm.ValueType as a tagged union of std.wasm.RefType and the above (leaving the possibility for VectorType in the future).
    3. Replace all uses of Valtype with the above union.
    4. Add all Reference Instructions^1 to src/arch/wasm/Mir.zig.
    5. Add .host to std.builtin.AddressSpace.
    6. In src/arch/wasm/CodeGen.zig, convert *addrspace(.host) anyopaque to ValueType{.RefType = .externref} anywhere it might be found (function params, instructions).
  4. Where should I be looking in the stage1 compiler in order to make these changes?

(Please don't take this as a commitment to actually implement it. This isn't my day job, and my attention span for hobby work tends to be short.)

gcoakes avatar Oct 09 '22 01:10 gcoakes

Footnotes

  1. https://webassembly.github.io/spec/core/syntax/instructions.html#reference-instructions

Before answering your questions, I'd like to bring to your attention that no decision has been made yet with regard to the syntax or whether it's even possible to integrate the external reference feature into Zig at all. Such a decision isn't very straightforward as there are many cases to consider before this can be accepted. e.g. what should be the behavior when someone tries to @ptrCast such a type? Therefore, implementing this right now is not possible and is also the reason why the work hasn't been started yet. However, I'll still happily answer your questions:

  1. Address spaces are not fully implemented yet. However, some work has been done to support certain use cases.
  2. No; see my remark above.
  3. Your plan seems to target the native WebAssembly backend. This backend will only be used for debug mode in the future. It's also incomplete right now, which means it isn't being used outside of implementing the backend. Instead, this should probably be implemented in the LLVM backend first. The battle plan does have the basics correct for the native WebAssembly backend, but is still missing many edge cases, such as updating other instructions to use is_null when the type is an external reference.
  4. It is not worthwhile to implement this in the stage1 compiler. Stage2 is the new default, and stage1 will be removed in the future. Address spaces also aren't implemented at all within the stage1 compiler.

Luukdegram avatar Oct 10 '22 13:10 Luukdegram

Thank you for your response.

this should probably be implemented in the LLVM backend first

Took me an hour or so to realize this... I implemented a naive version of the native changes and was very confused until I noticed one little line where it switched to the LLVM backend.

I have some notes I've taken since my last comment that I don't want to go to waste. @Luukdegram, though I suspect I'm just telling you things you already know, I hope someone will find them useful:

  • externref support isn't even complete in LLVM. There's some stuff merged (some kind of llvm specific assembly language for wasm), but most is wrapped up in: https://reviews.llvm.org/D122215 It is mostly for clang support, but it also adds the wasm_externref address space.
  • externref doesn't strictly map to a pointer. If compiler support is to be added, it would likely need either a whole new primitive (clang appears to be doing this) or limitations on instructions used with specific address spaces. It's not possible to compare them or store them in linear memory; they can only live in locals, on the operand stack, in globals, or in tables. For example, the following wat is invalid:
(module
	(func (export "eq_ref") (param externref externref) (result i32)
		local.get 0
		local.get 1
		eq
	)
)
  • Rust's wasm-bindgen tool has support, but actually does this by post-processing the wasm file. Instead of the compiler natively understanding externrefs, it just uses them at the "fringes" of the module.
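To make the "fringes" point concrete, here is a hand-assembled module (my own bytes, not produced by Zig, LLVM, or wasm-bindgen) with a single exported function that takes an externref and returns it untouched. It's only a sketch and assumes a runtime with reference types enabled:

```javascript
// Binary encoding of:
//   (module (func (export "id") (param externref) (result externref) local.get 0))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x6f, 0x01, 0x6f, // type section: (externref) -> externref
  0x03, 0x02, 0x01, 0x00,                         // function section: one function of type 0
  0x07, 0x06, 0x01, 0x02, 0x69, 0x64, 0x00, 0x00, // export section: "id" = function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x20, 0x00, 0x0b, // code section: local.get 0, end
]);

// Synchronous compilation is fine for a module this small.
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const { id } = instance.exports;

// Stand-in for a host object such as a WebGLShader.
const hostObject = { kind: "shader" };

// The object crosses the boundary in both directions without any handle table.
const passedThrough = id(hostObject) === hostObject;
```

The module can hand the reference back or pass it on to imports, but it cannot look inside it or write it to linear memory.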

gcoakes avatar Oct 10 '22 17:10 gcoakes

An RFC to support Reference Types in Clang was just published.

As well as an implementation.

jedisct1 avatar Nov 28 '22 19:11 jedisct1

This brings us closer to the possibility of implementing ref types:

Update our baseline and generic CPU models for WebAssembly #21818

Afirium avatar Mar 24 '25 11:03 Afirium

any timeline or update to this topic? :)

K0IN avatar Jun 06 '25 14:06 K0IN