wit-bindgen Export names and the component model

As I am currently implementing a tool to create WebAssembly components based on interfaces defined in wit, I've run into an issue about how best to map interface functions to the functions exported by a wit-bindgen-generated core module.

For example, let's say I have this interface:

greet: function(name: string) -> string

If I use wit_bindgen_rust::export! on this interface definition, I'll get an export simply named greet for a function in the resulting core module.

As components may import or export multiple interfaces (i.e. instances) and those interfaces may also have the same symbolic name for a function, the component tooling needs to know how to go from function "greet" on "interface version 1.2.3" to the appropriate export on the core module to lift.

One option is to have wit-bindgen munge the export name in a format that conveys what interface the export is coming from and then the component tooling can look for the expected munged name in the core module to lift.

Another solution would be to stick some information in a custom section in the core module that the tooling respects (although this solution doesn't solve the conflict of exporting more than one interface with a function of the same name).

Similarly, for imports, we need a way to set the import's module name independent of what might be a legal identifier for the purpose of code generation (i.e. the tooling will likely want to capture a unique version string in the import's module name and that string shouldn't participate in the generated code).

Mar 10 '22 18:03 peterhuene

Related to this, Luke and I talked about having a concept of a "default" interface for components, where certain exported functions from the core module are lifted and then directly exported as component functions without being bundled together and synthesized as an instance for export (i.e. as an "interface").

I'm not quite sure how to surface that concept just yet in the component tooling, nor how we'll best want to handle imports/exports of multiple conflicting interfaces in wit-bindgen.

Mar 10 '22 18:03 peterhuene

Historically I've thought that a wasm module would export one *.wit interface and import many, where exports line up one-to-one with names in *.wit, the first import level is a named *.wit, and the second import level is the function within a *.wit. I haven't thought too closely about the possible pitfalls with a scheme like this, but I think ideally we'd avoid having to smuggle and encode/decode information if we can.

Mar 10 '22 19:03 alexcrichton

For now I can have the component tool limit it to one exported interface and just lift each exported core function as a direct export from the component, but this feels, to me at least, limiting in terms of composability.

I think there will eventually be a strong desire to have wit-bindgen be able to create components that implement more than one interface.

Mar 10 '22 21:03 peterhuene

Ah yeah that makes sense. Would it be possible to do something like this: if the module implements two interfaces, everything works out ok as long as the names don't collide, and if they do collide, some sort of "link map" needs to be provided?
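To make the collision check concrete, here is a minimal sketch of how tooling might find the names that would need a link map entry. The function name find_collisions and the interface names in the usage note are hypothetical; none of this is an existing wit-bindgen or wit-component API.

```rust
use std::collections::BTreeMap;

/// Returns every function name declared by more than one interface,
/// mapped to the interfaces that declare it.
/// Hypothetical helper, not part of wit-bindgen or wit-component.
fn find_collisions(interfaces: &[(&str, Vec<&str>)]) -> BTreeMap<String, Vec<String>> {
    let mut owners: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (interface, functions) in interfaces {
        for function in functions {
            owners
                .entry(function.to_string())
                .or_default()
                .push(interface.to_string());
        }
    }
    // Keep only names claimed by two or more interfaces; these are the
    // ones that would need an entry in a "link map".
    owners.retain(|_, claimed_by| claimed_by.len() > 1);
    owners
}
```

Given two made-up interfaces, greeter with functions greet and shutdown, and logger with function greet, the result would contain a single entry for greet listing both interfaces. An empty result would mean no link map is needed.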

Mar 14 '22 14:03 alexcrichton

So I'm at the point in wit-component where I'd like to see us figure out how exporting multiple interfaces from a component is going to work for wit-bindgen.

For composability, I would imagine that it'd be quite common to implement interfaces defined by third-parties while also wanting to implement an interface (even multiple?) defined by the component's author. It would give the most flexibility at link-time if a single component can be wired up in different ways depending on the interfaces imported by other components.

Let me start by explaining how wit-component is currently designed, to see if that aligns with wit-bindgen's goals and, more generally, with how we think this would best work in the component model proposal itself. The proposal is quite flexible and there are multiple ways we can ultimately make this work.

wit-component accepts two command line options relating to exports: a singular --interface option which denotes the default exported interface of the component and a multi-value --export option which takes a name-interface pair.

The --interface option is what the component directly exports: every function on the default interface is exported as a function from the component. This means that if an importer of the component needs an implementation of this interface, it can be given an instance of the component itself to satisfy the import.

The --export options result in the component exporting instances (exported with the interface's name) that, in turn, export the interface's functions. This means that if an importer of the component needs an implementation of one of these interfaces, it can be given the instance exported from the component's instance to satisfy the import.

By having the component export instances for the interfaces, the component itself describes the mapping of the interface's set of high-level functions to the inner core module's low-level implementation of those functions. It also solves one potential problem regarding conflicting function names between exported interfaces: each top-level exported interface name is required to be unique and the interface's function names are a second-level export name so there isn't the possibility of conflicting function names between exported interfaces.
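As a rough illustration of why the two-level naming avoids conflicts, here is a simplified model of a component's export namespace. The ComponentExports type and its method are hypothetical and only mirror the description above; they are not how wit-component is actually implemented.

```rust
use std::collections::BTreeMap;

/// A simplified, hypothetical model of a component's export namespace.
#[derive(Debug, Default)]
struct ComponentExports {
    /// Functions of the default interface, exported directly by the component.
    functions: Vec<String>,
    /// Instance exports keyed by interface name; each holds that interface's functions.
    instances: BTreeMap<String, Vec<String>>,
}

impl ComponentExports {
    /// Adds an interface as an instance export. This fails only when the
    /// top-level name is already taken, either by another instance or by a
    /// directly exported function. Function names inside the instance are
    /// never compared against other instances.
    fn export_instance(&mut self, name: &str, functions: &[&str]) -> Result<(), String> {
        if self.instances.contains_key(name) || self.functions.iter().any(|f| f == name) {
            return Err(format!("top-level export `{}` is already defined", name));
        }
        self.instances.insert(
            name.to_string(),
            functions.iter().map(|f| f.to_string()).collect(),
        );
        Ok(())
    }
}
```

In this model, two instances may both contain a function named greet, and the default interface may export greet as well. Only a reused top-level name is rejected.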

This scheme does have the drawback that there could be a conflict with a default interface's function name directly exported by the component and the name of one of the other exported interfaces, but in practice that's very unlikely to happen as I suspect we'll need exported interface names to be uniquely-qualified in a way that is incompatible with legal function identifiers in wit (e.g. an interface named foobar:1.2.3:c0264650ae85).

The other approach we can take is to "flat" export everything from the component: every function from every exported interface is directly exported from the component. Assuming there aren't any conflicting names, this approach has the advantage that importers of any interfaces exported by the component can be satisfied with the component's instance itself.

However, exporting interfaces with conflicting function names will most certainly happen, especially when implementing interfaces defined by third-parties. As export names must be unique, this necessitates a renaming of at least one of the conflicting functions. Renaming inherently complicates linking as linkers need to determine what exported functions come from what interfaces and there needs to be some other auxiliary information linkers can consume to figure out that exported function foo-renamed-because-of-a-conflict is really function foo from interface foobar:1.2.3:c0264650ae85 so that they can wire up an instance with the correct function exports to pass as an import to a consuming component.
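To show the kind of auxiliary information a linker would need under the flat approach, here is a minimal sketch. The function flatten_exports and its numeric-suffix renaming rule are assumptions made purely for illustration; no such rule exists in the tooling today.

```rust
use std::collections::BTreeMap;

/// Flattens every (interface, function) pair into a single export namespace.
/// Returns the auxiliary table a linker would have to consume:
/// flat export name -> (interface, original function name).
/// Hypothetical helper; the renaming rule below is made up.
fn flatten_exports(interfaces: &[(&str, Vec<&str>)]) -> BTreeMap<String, (String, String)> {
    let mut table = BTreeMap::new();
    for (interface, functions) in interfaces {
        for function in functions {
            let mut flat = function.to_string();
            // Assumed renaming rule: on conflict, append "-2", "-3", ...
            // until the flat name is unique.
            let mut n = 2;
            while table.contains_key(&flat) {
                flat = format!("{}-{}", function, n);
                n += 1;
            }
            table.insert(flat, (interface.to_string(), function.to_string()));
        }
    }
    table
}
```

With foo exported by both foobar:1.2.3 and a second made-up interface, the second one ends up as foo-2, and the table is the only place that records where it came from. That lookup is the extra linking burden described above.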

I posit that regardless of the approach, we'll likely want wit-bindgen to consistently mangle the export names from the core module so that they are unique to prevent conflicts (e.g. it could be as simple as interface#function); right now it's not possible to work around the "symbol foo is already defined" error when using multiple export! macros with interfaces that contain a conflicting function name. The core module's export names are ultimately an implementation detail, so we should be free to use whatever scheme we want between wit-bindgen and wit-component.
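A minimal sketch of the interface#function idea, with both directions so the component tooling can recover the interface from a core export name. The function names mangle and demangle are hypothetical, and the sketch assumes function names never contain a # character.

```rust
/// Builds the core module export name for a function of an interface.
/// Hypothetical helper illustrating the `interface#function` scheme.
fn mangle(interface: &str, function: &str) -> String {
    format!("{}#{}", interface, function)
}

/// Splits a core export name back into (interface, function).
/// Splits at the last `#`, assuming function names never contain `#`.
/// Returns None for an unmangled name such as a plain `greet`.
fn demangle(export: &str) -> Option<(&str, &str)> {
    export.rsplit_once('#')
}
```

Two interfaces that both define foo then produce distinct core exports, so multiple export! invocations would no longer clash on the symbol name.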

I'll stop here as this comment is already Tolstoyesque. Thoughts on the above or alternative approaches for consideration?

Mar 29 '22 19:03 peterhuene

Everything you mention makes sense to me, and I agree with the conclusion that the core wasm should do something simple like interface#function or whatever as exported function names.

More broadly, wit-bindgen as-is I think is at a bit of a dead-end and needs to reorient itself to get onto the right trajectory to get integrated into wit-component. For example, the "IR" of wit-bindgen right now is the *.wit file and that's what's used to generate bindings in both the guest and the host, but instead the real "IR" is the actual component itself (which is synthesizable but not always equivalent to a *.wit file). What I'm mostly getting at is that it's not 100% clear to me how wit-bindgen-rust would go precisely from where it is today to a component without some refactorings (not necessarily major ones, it just doesn't slot in cleanly).

I think wit-bindgen wants to be pretty closely related to wit-component, possibly even necessitating a renaming of wit-bindgen to wit-component for the various runtime libraries. At least for Rust authors I would expect that a macro invocation declares "I'm exporting this interface" and then your build tool automatically produces the correct thing. That then also works if there are multiple macro invocations, so the one wasm binary exports multiple interfaces and the tool manages that appropriately (either via one flat export, multiple instance exports, however the tool is configured).

That's all a bit of a fancy way of saying that I don't think the precise output of wit-bindgen is really all that important. All that matters is the interface that users implement and the build tooling to actually generate the component. Everything in the middle about ABI details and whatnot I think we should consider ourselves free to change at any time for whatever purpose we deem necessary.

Mar 29 '22 19:03 alexcrichton

I think that this is solved nowadays with the concept of worlds, which are soon to be added to wit-bindgen. The high-level idea is that a world will describe all exports, both top-level and nested, and that will be used to project onto a core wasm module with a canonical naming scheme (probably just the #-delimited names as is implemented right now).

This isn't 100% implemented everywhere, but progress is being made and the world has changed quite a bit since this was opened, so I'm going to close this as done-enough and just pending more work on worlds.

Oct 25 '22 14:10 alexcrichton