component-model Proposal: Update WIT syntax/semantics for packages/dependencies/use and alter binary format for externnames

I've been talking with @lukewagner and @guybedford recently about #177 and how best to tackle it in components and WIT. I'm opening this issue to propose changes to both WIT and the component model binary format that solve the issue at hand and a few neighboring concerns as well. As a heads up, what I'm proposing here is a breaking change relative to the currently implemented and accepted WIT syntax, and it's an addition to the binary format (while also ignoring some preexisting binary syntax).

Motivation and Existing Problems

The main motivation of these changes is that if you have a world such as:

// today's syntax
world {
    import foo: self.foo
}

then any dependency of foo will be "inlined" into the world as well:

// today's syntax
world {
    import foo-dep1: self.foo-dep1
    import foo-dep2: self.foo-dep2
    import foo: self.foo
}

This interacts poorly with transitive dependencies. For example, the proxy world for WASI depends on a types interface in the wasi-http package, so the world ends up having import types: ..., which is quite a "bland" name without any context in it.

One way to fix this is to consider an "id" more often than a kebab-name. For example the "id" of the relevant interface is wasi:http/types which is a much more contextually relevant name. The problem with this, however, is that WIT doesn't have a great means by which to discover its own package name or namespace. For example when developing the wasi-http package itself you'd simply have:

// wit/types.wit

default interface types {
  // ...
}

So there's no context available to say either wasi or http. This is sort of solved with the deps folder that the wit-parser crate implements, along the lines of:

// wit/deps/http/types.wit

default interface types {
  // ...
}

where here the parser at least knows the package name is http. Still, however, it doesn't know anything about wasi.

Another problem with the deps/* folder syntax is that packages are named after their directory names, meaning that there's no way to have two packages of the same name, for example two packages from different registries or two versions of the same package from the same registry.

In general many of the above problems are somewhat solvable in isolation, but I'm hoping to solve them all at once with updates to the WIT syntax, binary format, and translation to wasm. One final point worth addressing here, or at least planning for, is introducing the concept of versioning for future use. This isn't intended to be comprehensive just yet, but should hopefully help set up some foundations.

Changes to WIT

There are a number of syntactical changes to WIT described here and this attempts to go through them in such an order to build up to a final picture at the end.

Package names in files

The first change I'd like to propose to WIT is a requirement that all *.wit files will start at the top with a package ... statement. For example you could write:

package my-package

This solves the "what's the name of my package" problem from above because the name is now explicitly declared inline. This is additionally where we can carry other semantic information such as versioning. An example for wasi-http might be:

package wasi:http@0.1

// ...

Here wasi: is listed as a "namespace", the name of the package is http, and the version is specified as 0.1.

Valid package names will be:

kebab-name - I'm just doing something
kebab-name@version - I'm doing something slightly more serious
kebab-namespace:kebab-name - I'm developing for a registry
kebab-namespace:kebab-name@version - I'm very serious at this point (most WASI repos will probably look like this)

For now the version could probably be any semver (e.g. up to 1.0-alpha1+test or something like that).
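The four forms above can be recognized with a single pattern. The following is a minimal Python sketch, not wit-parser's actual implementation: the `KEBAB` rule is a simplified assumption (lowercase words joined by single hyphens), `parse_package_name` and `PackageName` are hypothetical names, and the version is kept as an opaque string rather than validated as semver.

```python
import re
from typing import NamedTuple, Optional

# Simplified assumption: lowercase alphanumeric words joined by single hyphens.
KEBAB = r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*"

# Optional "namespace:", required name, optional "@version".
PACKAGE_RE = re.compile(
    rf"^(?:(?P<namespace>{KEBAB}):)?(?P<name>{KEBAB})(?:@(?P<version>\S+))?$"
)


class PackageName(NamedTuple):
    namespace: Optional[str]
    name: str
    version: Optional[str]  # opaque string; not parsed as semver here


def parse_package_name(text: str) -> PackageName:
    """Split a package header value into namespace, name and version."""
    m = PACKAGE_RE.match(text)
    if m is None:
        raise ValueError(f"invalid package name: {text!r}")
    return PackageName(m["namespace"], m["name"], m["version"])
```

For instance, `parse_package_name("wasi:http@0.1")` yields the namespace `wasi`, the name `http`, and the version string `0.1`.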

The package ... header would be required at the top of all *.wit files that make up a package. Every package ... directive would also be required to match all the others, including the version.

Package Organization

WIT today is structured so that a Package has a set of Documents, where each Document is a file (typically *.wit). Each Document can have a set of interfaces and worlds, where at most one world and one interface can be flagged as default. I'm proposing to change this to:

A Package is a collection of interfaces and worlds.
Packages can consist of multiple *.wit files but the interfaces/worlds in a package are unioned together. This means you can't define interface foo in two files in the same package. Packages can always be represented with one file.
The default keyword goes away entirely.

These changes are intended to assist with being able to auto-assign a unique ID to any interface within a package. They will also affect use paths described below. When combined with the above change to package ... headers some examples of inferred IDs are:

// wit/types.wit

package wasi:http

interface types { // id = "wasi:http/types"
  // ...
}

// wit/proxy.wit

package wasi:http

world proxy { // id = "wasi:http/proxy"
  // ...
}

The "ID" is inferred to be the package name without the version, plus a slash, plus the name of the world or interface. The version of the package additionally becomes the version of the interface.
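As a rough illustration of that inference rule, here is a small Python sketch. The function name `infer_id` is hypothetical, and the package header is assumed to be a plain string of the form `name` or `name@version`.

```python
from typing import Optional, Tuple


def infer_id(package: str, item: str) -> Tuple[str, Optional[str]]:
    """Return (ID, version) for an interface or world named `item`.

    The ID is the package name without its version, a slash, then the
    item name. The package version (if any) carries over to the item.
    """
    name, sep, version = package.partition("@")
    return f"{name}/{item}", (version if sep else None)
```

So an interface `types` in `package wasi:http` gets the ID `wasi:http/types` with no version, matching the comments in the example above.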

The rationale for removing default is more clear with the changes to use below, and otherwise all WIT files being merged together means that for all interface and world items there's a unique ID within the package.

Changes to `use`

Today the use statement looks like:

interface foo {
  use self.my-interface.{...}                 // import from same-file interface
  use pkg.other-document.my-interface.{...}   // import from other file, same package interface
  use my-dependency.their-interface.{...}     // foreign dependency
}

With the updates to package structure above these forms largely no longer make sense. Instead this proposes making IDs more first-class in WIT syntax. The use statement will still be somewhat similar grammatically where it will look like:

use-statement ::=  'use' interface '.' '{' names '}'

where the major change is how an interface is specified. To explain that here's a few examples. First is importing from another interface in the same file:

interface foo {
  // ..
}

interface bar {
  use foo.{...}
}

Here the interface is just the bare name foo as that's what's in scope. If foo were in a separate file then it needs to be explicitly imported in the outer scope to be used.

// wit/foo.wit
interface foo {
  // ..
}

// wit/bar.wit
use foo

interface bar {
  use foo.{...}
}

Here a bare use foo at the top level pulls the interface foo into scope. Because the name is just foo, that means "find an interface named foo elsewhere in this package". In this situation it's found in wit/foo.wit.

Importing from a foreign dependency now happens through the ID of that dependency. For example, with a deps structure, it would look like:

// wit/deps/the-dependency/foo.wit
package the-dependency

interface foo {
  // ..
}

// wit/foo.wit
use the-dependency/foo

interface bar {
  use foo.{...}
}

Here the use the-dependency/foo statement will look for a package with the name the-dependency which is found within the deps folder in this case. Here the name the-dependency matches the package declaration. For wasi-http this would look like:

// wit/deps/http/types.wit
package wasi:[email protected]

interface types {
  // ..
}

// wit/foo.wit
use wasi:http/types

interface bar {
  use types.{...}
}

Here it can be seen that wasi: namespacing syntax is allowed as well. Furthermore it can be seen that the name of the folder holding the wasi-http package, http, does not have to exactly match the package specifier wasi:[email protected], it can be named anything.

This leads us to an example where you can depend on two packages of the same name from different registries:

// wit/deps/http1/types.wit
package wasi:[email protected]

interface types {
  // ..
}

// wit/deps/http2/types.wit
package bytecodealliance:[email protected]

interface types {
  // ..
}

// wit/foo.wit
use wasi:http/types as wasi-types
use bytecodealliance:http/types as ba-types

interface bar {
  use wasi-types.{...}
  use ba-types.{...}
}

The as syntax will be allowed to rename dependencies locally within a file's context to avoid name collisions. The inferred ID of the wasi-types and ba-types interfaces will still be unique as they're derived from unique package IDs.

Finally this can also be used to import multiple versions of the same package:

// wit/deps/http1/types.wit
package wasi:[email protected]

interface types {
  // ..
}

// wit/deps/http2/types.wit
package wasi:[email protected]

interface types {
  // ..
}

// wit/foo.wit
use wasi:http/[email protected] as types1
use wasi:http/[email protected] as types2

interface bar {
  use types1.{...}
  use types2.{...}
}

The @version syntax is allowed in use statements to disambiguate which version is being imported if multiple exist. It's an error for there to be two candidates without a version specified, for example. If only one version is available it's inferred to be that version (as is the case for all above examples).
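The disambiguation rule can be sketched as follows. This is only an illustration in Python under stated assumptions: `resolve_package` is a hypothetical helper, the version strings `1.0` and `2.0` in the usage note are made up, and versions are compared by exact string equality rather than by any semver logic.

```python
from typing import List, Optional, Tuple

# (package name, version string or None), e.g. as discovered under deps/
Package = Tuple[str, Optional[str]]


def resolve_package(available: List[Package], name: str,
                    version: Optional[str] = None) -> Package:
    """Pick the package a `use` path refers to.

    With an explicit version only an exact string match is accepted.
    Without one, exactly one candidate must exist.
    """
    candidates = [
        (n, v) for (n, v) in available
        if n == name and (version is None or v == version)
    ]
    if not candidates:
        raise LookupError(f"no package named {name!r} matches")
    if len(candidates) > 1:
        raise LookupError(f"multiple versions of {name!r}; add @version")
    return candidates[0]
```

With two hypothetical versions of `wasi:http` available, `resolve_package(deps, "wasi:http", "2.0")` succeeds while `resolve_package(deps, "wasi:http")` raises, mirroring the error case described above.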

Finally, all of the above can be slightly more "sugary" as well. The top-level use can also be used inline within worlds and interfaces. For example this:

use wasi:http/types

interface foo {
  use types.{...}
}

is equivalent to:

interface foo {
  use wasi:http/types.{...}
}

This is provided when something is only needed in one location to avoid the top level use and assigning a name to it. Note, though, that this can get sort of sigil-heavy with use wasi:http/[email protected].{...} so it's not intended to be super widespread. That being said these two are also equivalent:

// wit/foo.wit
interface foo {
  // ..
}

// wit/bar.wit
use foo

interface bar {
  use foo.{...}
}

is equivalent to:

// wit/foo.wit
interface foo {
  // ..
}

// wit/bar.wit
interface bar {
  use foo.{...}
}

so the use foo statement to import package-sibling interfaces is not required.

Changes to `world`

Currently imports and exports to a world are of the form:

world foo {
    import kebab-name: self.the-interface
}

This suffers from the downside mentioned originally where each interface must be assigned a kebab-name. The new syntax for worlds will look like:

world-import ::= 'import' ( kebab-name ':' )? interface

This means that the kebab-name is now optional. It'll be required for unnamed interfaces such as import foo: interface { ... } but otherwise it's now possible to do:

// wit/proxy.wit
package wasi:http

use types

world proxy {
  import types
}

Here there's no kebab-name. This means that the "name" of the import will be the ID of the interface, which in this case is wasi:http/types (as imported from a sibling wit/types.wit). A kebab-name can be explicitly listed, but often won't be required any more.

For example a Fastly-specific world might look like:

// wit/world.wit
package fastly:compute

world app {
  import wasi:http/types
  import wasi:clocks/monotonic-clock
  import wasi:random/random
  // .. more WASI imports ..

  import fastly:http/upstream
  import fastly:http/caching

  // ...
}

Here no kebab-names are necessary and everything, including their transitive dependencies, will be assigned names based on IDs. Transitive dependencies will always be inferred to be named by their identifier to avoid clashes between transitive dependencies.
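To make the transitive-dependency naming concrete, here is a small Python sketch of how a world's import list might be elaborated. It is an assumption-laden illustration rather than the actual elaboration algorithm: `elaborate_imports` is a hypothetical helper, and the dependency edges used in the usage note are invented for the example.

```python
from typing import Dict, List, Set


def elaborate_imports(imports: List[str],
                      deps: Dict[str, List[str]]) -> List[str]:
    """Expand a world's imports with their transitive dependencies.

    Every entry is keyed by interface ID, so two dependencies that both
    happen to be called `types` in different packages never clash.
    Dependencies are listed before the interfaces that use them.
    """
    ordered: List[str] = []
    seen: Set[str] = set()

    def visit(iface: str) -> None:
        if iface in seen:
            return
        seen.add(iface)
        for dep in deps.get(iface, []):
            visit(dep)
        ordered.append(iface)

    for iface in imports:
        visit(iface)
    return ordered
```

For example, if `wasi:http/types` hypothetically used `wasi:io/streams`, a world importing only `wasi:http/types` would elaborate to both IDs, with `wasi:io/streams` first.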

Changes to the binary format

Currently the binary format has:

externname  ::= n:<name> ea:<externattrs>
externattrs ::= 0x00
              | 0x01 url:<URL>

This proposes changing this to:

externname  ::= n:<name> 0x00                 => kebab-name n
              | n:<name> 0x01 url:<URL>       => kebab-name n (ignore the url)
              | n:<name> 0x02 v:<version>     => where n =~ /(kebab:)?kebab/kebab/

version ::= 0x00                              => no version
          | 0x01 semver:<string>              => version `semver`

Here the 0x00 production is still interpreted as "here's a kebab name for the thing". The 0x01 production is reinterpreted to ignore the URL for now (for lack of a better idea of what to do with it). This could perhaps get removed during a final 1.0 break. The 0x02 production is the ID-based form of import which refers to "I'm importing an interface". I know @lukewagner is also interested in perhaps an 0x03 form of import meaning "I'm importing a specific component implementation" with perhaps more options too, but I don't think that affects this proposal in particular.
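A byte-level sketch of the 0x00 and 0x02 productions may help when reading the grammar. This Python snippet is illustrative only: the function names are hypothetical, and it assumes `<name>` and `<string>` are encoded as in the core wasm binary format, i.e. an unsigned LEB128 byte length followed by UTF-8 bytes.

```python
from typing import Optional


def leb128_u32(n: int) -> bytes:
    """Unsigned LEB128 encoding, as used for lengths in wasm binaries."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string(s: str) -> bytes:
    data = s.encode("utf-8")
    return leb128_u32(len(data)) + data


def encode_externname(name: str, *, is_id: bool = False,
                      version: Optional[str] = None) -> bytes:
    """Encode the proposed externname: 0x00 (kebab-name) or 0x02 (ID)."""
    if not is_id:
        if version is not None:
            raise ValueError("plain kebab-names carry no version")
        return encode_string(name) + b"\x00"
    # version ::= 0x00 (none) | 0x01 semver:<string>
    v = b"\x00" if version is None else b"\x01" + encode_string(version)
    return encode_string(name) + b"\x02" + v
```

The 0x01 (URL) production is omitted here since the proposal only reads and ignores it.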

A text format for these could perhaps look like:

(import "kebab-name" (...))                           ;; just a name
(import "my-package/types" (version) (...))           ;; no version
(import "wasi:http/types" (version "1.0.0") (...))    ;; everything

I think that's everything I wanted to cover for this, and I'm interested in hearing feedback from others using WIT who have thoughts on syntax, bikeshedding, the breakage involved here, etc.

May 03 '23 22:05 alexcrichton

Really nice job writing up this proposal! This all sounds pretty great to me.

One small question I had where I might just be missing the whole picture is: if it's optional to use use to refer to other interfaces in the same package, then it seems like it'll lead to irregular use and so maybe we should either make it mandatory or disallowed.

Secondly: a small nit just on the Wat syntax at the end: to disambiguate interface-imports from future implementation-imports (which will include versions but also other optional sub-fields like version ranges, content-hashes and (real) URLs), maybe the Wat syntax is:

(import (interface "wasi:http/types" (version "0.2")) (...))

so that we can unambiguously add (import (implementation ...) (...)) later and also easily indicate where to use the grammar of interface IDs vs just kebab-names. Does that make sense?

May 03 '23 23:05 lukewagner

This looks like a great step in the right direction to me. Feedback roughly in order:

The first change I'd like to propose to WIT is a requirement that all *.wit files will start at the top with a package ... statement.

If we have packages with multiple .wit files, is the expectation that every single file would have a package signifier at the top? My main concern with that would be redundancy. Bumping the version might mean changing the version in many different files.

I'd be interested to hear what you had in mind. Could we perhaps have just one WIT file that gets to represent the entire package or something similar? Perhaps a validation rule can just ensure only one of the WIT files has this package signifier?

Packages can consist of multiple *.wit files but the interfaces/worlds in a package are unioned together. This means you can't define interface foo in two files in the same package. Packages can always be represented with one file.

This seems like a nice simplification for now. Is the assumption then that duplicate names would just throw? Are all WIT files effectively globbed in the package directory and then deduped upfront, or are they only pulled in when use'd? In line with Luke's comment, it seems it should probably be one or the other?

Here wasi: is listed as a "namespace", the name of the package is http, and the version is specified as 0.1.

If semver is supported, I assume you mean semver + range shortened subsets then? That seems fine and well-defined to me, although a nit is that I wouldn't expect 1.0-alpha1+test to be valid in this context, or was that a typo for 1.0.0-alpha1+test?

Finally this can also be used to import multiple versions of the same package:

// wit/deps/http1/types.wit
package wasi:[email protected]

interface types {
  // ..
}

// wit/deps/http2/types.wit
package wasi:[email protected]

interface types {
  // ..
}

// wit/foo.wit
use wasi:http/[email protected] as types1
use wasi:http/[email protected] as types2

interface bar {
  use types1.{...}
  use types2.{...}
}

When use applies with renaming, I assume the final output component have no traces left of types1 and types2 and instead just reference imports to wasi:http/[email protected] directly? That is, if another package then imports from the above package (say it had package foo at the top), then it would inherit imports to wasi:http/[email protected] and wasi:http/[email protected] right? And the names used for those inherited imports wouldn't matter from a conflict perspective?

From a transitive perspective the other issue here is that for a given interface, it can have multiple versioned constraints pointing to it from its importer component packages, so the final combined component would either need to have multiple imports for the different constraints (in case they have different resolutions) or it would need to have some kind of constraint intersecting (eg @1.2.3 and 1.2 union as @1.2.3). There's then in that case the issue of what if you wanted to customize the resolutions for each component differently than the intersection.

Overall, being able to go from an interface to its package, version constraints and internal ID is very much needed at this point and I like the overall shape of what is being proposed. It would be great to work through the version management details a little more, with regards to how it remains flexible to the standard expected features of being able to control resolutions with lockfiles and custom resolutions while still getting a good default story, avoiding the various transitive knots.

May 04 '23 01:05 guybedford

Seems like a great proposal. My suggestion is to make dependency management part of it instead of having a separate configuration file (https://github.com/bytecodealliance/wit-deps). Currently WIT adds complexity with another configuration type, so I think it would be better to eliminate extra configuration files. For example the syntax could be:

use foo {
  local: ../foo.wit
}

Or:

use foo {
  git: https://github.com/example/foo,
  path: wit/foo.wit
}

In my opinion, having everything in one place is easier to read and understand.

May 04 '23 21:05 omersadika

then it seems like it'll lead to irregular use and so maybe we should either make it mandatory or disallowed.

A good point! One reason we may want to keep it around though is that if you have your own interface types and you're additionally importing wasi:http/types then somehow that'll need disambiguating. If use types were disallowed at the top level then you'd be forced to rename wasi:http/types, which instead seems like it should be up to the developer?

I do think it's reasonable to largely not mention the use foo syntax for sibling interfaces and instead just document that it works. Then you can discover it in something like reference documentation, realize that's how it works, and probably continue to ignore it from then on.

... and also easily indicate where to use the grammar of interface IDs vs just kebab-names. Does that make sense?

Makes sense to me!

If we have packages with multiple .wit files, is the expectation that every single file would have a package signifier at the top?

@guybedford and I talked about this and the conclusion was that to handle this wit-parser will require at least one *.wit file in a package to have a package ... declaration, but it won't require it on all of them. If multiple declarations are specified then they must all match. That way conventionally packages could, for example, have wit/main.wit which has a package ... at the top and nothing else needs it at the top.

This seems like a nice simplification for now. Is the assumption then that duplicate names would just throw?

Yes the expectation is that duplication is a parse-error problem. Parsing will always parse all the *.wit files and then error-out if there's a duplicate interface or world amongst them.
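The two rules just described (at least one matching package declaration, and no duplicate items across files) can be sketched like this in Python. It is not wit-parser's code: `merge_package` and its input shape are hypothetical, and it assumes interfaces and worlds share a single namespace within a package.

```python
from typing import Dict, List, Optional, Tuple

# path -> (package declaration or None, names of interfaces/worlds defined)
ParsedFiles = Dict[str, Tuple[Optional[str], List[str]]]


def merge_package(files: ParsedFiles) -> Tuple[str, Dict[str, str]]:
    """Union all files of one package directory.

    Returns the package declaration and a map from item name to the
    file defining it. Raises ValueError on mismatched declarations,
    on a missing declaration, or on a duplicate item name.
    """
    package: Optional[str] = None
    items: Dict[str, str] = {}
    for path in sorted(files):
        decl, names = files[path]
        if decl is not None:
            if package is not None and decl != package:
                raise ValueError(f"{path}: package {decl!r} != {package!r}")
            package = decl
        for name in names:
            if name in items:
                raise ValueError(
                    f"duplicate {name!r} in {path} and {items[name]}")
            items[name] = path
    if package is None:
        raise ValueError("no file declares a package")
    return package, items
```

Under this sketch a conventional layout with the declaration only in `wit/main.wit` merges cleanly, while two files both defining `interface types` are rejected.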

If semver is supported, I assume you mean semver + range shortened subsets then?

Guy and I talked a bit about this, but this is definitely a point where the proposal here is a bit shaky. The current intention is that the versions here are glossed over during parsing, perhaps with light validation but not an actual semver resolver/querier/etc. My hope is that everything to do with semver "for real" can be in a separate tool like warg and the parsing phase can be sort of a "dumb black box" which only does string comparisons of versions. I'm not sure how well this will play out though.

I assume the final output component have no traces left of types1 and types2 and instead just reference imports to wasi:http/[email protected] directly? ... And the names used for those inherited imports wouldn't matter from a conflict perspective?

Correct

My suggestion is to add the dependency manager as part of that instead of having a separate configuration file

Personally, I don't think that this is necessarily the way to go at this time. I think there's more nuance in how this is handled than appears on the surface. My hope is that WIT can be managed in a way similar to Cargo dependencies. The Rust compiler, for example, knows nothing of crate versions. The WIT syntax is sort of like the Rust compiler in that regard: it supports versioned things but doesn't have its own intrinsic knowledge of them. Instead an external tool like Cargo is the one managing versions and everything like that (including version ranges, unifying within one semver-compatible range, etc). My hope is that warg serves this purpose for WIT and the wit-parser phase of the syntax doesn't have to be too intertwined.

Now that doesn't mean that this sort of information can't be written down in the source. If there's a defined way for wit-parser to find the result of resolution that seems fine to me too. In the absence of having this already designed, however, it's not something I'd like to take on designing at this point.

I talked with @guybedford a good deal yesterday about all this, and these are some of the ideas and results we had:

Exports and `externname`

The current changes to the binary format I outlined above are not sufficient. At the very least, instantiation's with arguments and alias would also need to be updated with an externname field. Before doing this, however, one thing Guy proposed was: what if externname was instead only present on imports and with arguments? Chiefly, exports would not use externname and would instead always use kebab-name.

To me this makes sense to do, since IDs seem mostly relevant for imports and less so for exports. This is actually already codified with world elaboration, where if an exported interface transitively relies on an interface that's not otherwise exported then it's implicitly imported. This means that IDs are necessary to avoid conflicts due to implicit imports, but exports can always be explicitly listed with kebab-names in worlds without fear of conflict (as no exports are ever injected).

One concern I would have, however, is the round-trippable-ness of WIT where if you have a world of the form:

world foo {
    export my-handler: wasi:http/handler
}

If at the binary level my-handler is the kebab-name how is wasi:http/handler inferred? An idea for this is something along the lines of:

(component
    (import (interface "wasi:http/handler") (type $t (instance ...)))
    
    (export "my-handler" (instance $my_handler) (instance (type $t)))
)

where here the ID wasi:http/handler is registered with a type import. With type ascription it's now asserted that the my-handler export has the expected type here.

So the rough idea here is along the lines of:

Use externname only in imports and with instantiation arguments
Require kebab-names for exports in WIT (and components)
Encode exported interfaces as imports-of-types which use type ascription.

One unknown I realize writing this up is:

world foo {
    import wasi:http/handler
    export foo: wasi:http/handler
}

That both needs to import the type for wasi:http/handler and an actual concrete instance, so I don't know how to disambiguate that.

Annotating the type of an instantiated world

One concern that came up with @guybedford as well is that there's no easy way to import an interface and then describe a world that exports that interface inherently. For example this world:

world foo {
    import wasi:http/handler
    export foo: wasi:http/handler
}

both imports and exports wasi:http/handler, so in theory this could be wired up to itself. The instance of a component implementing this world, however, does not itself satisfy the import. Instead this world describes an exported instance which satisfies the import. One hypothetical feature could be:

world foo {
    import wasi:http/handler
    implements wasi:http/handler
}

(or something like that) where an instantiation of a component implementing this world can be directly wired up to the import. We didn't reach any particular concrete conclusions about this other than that it might be nice to have at some point (but @guybedford correct me here if I'm wrong).

May 05 '23 15:05 alexcrichton

Hey @alexcrichton, this is a great proposal and reasonably solves many problems that we are facing today with wit packages! Here is my feedback:

Versions

I like the idea of a top-level declaration of package like package wasi:[email protected] but I think we should de-couple wit syntax from versions and put the semver of a wit-package outside of the .wit files. By removing versions from .wit files, we can enable - quick iteration of versions without modifying unchanged .wit files - and off-load the version management logic to external package manager

I like how Rust source files don't care about their version or the versions of their dependencies, because that knowledge is completely off-loaded to cargo. I wish there could be manifest files for a wit package, similar to Cargo.toml and Cargo.lock, that declare a wit package, its metadata, its deps and resolved deps.

Another problem with the deps/* folder syntax is that packages are named after their directory names meaning that there's no way to have two packages of the same name.

Adding a manifest file like Cargo.toml would help solve the above problem, as the manifest file can specify the same package twice with different names and different resolution logic.

WIT files in a package

the conclusion was that to handle this wit-parser will require at least one *.wit file in a package to have a package ... declaration, but it won't require it on all of them.

One of the consequences of not requiring every *.wit file in a package to declare package ... at the top of the file is that the concept of a package is now tied to a filesystem directory. This is because, in a folder, if one file has a package declaration but other files don't, wit-parser would treat the entire folder as a single package. Maybe this is what we want, but it often causes confusion. Does the wit package automatically include all the wit files in the sub-directories as well? Can we have a sub-directory be a different wit package?

My main concern with that would be redundancy. Bumping the version might mean changing the version in many different files.

If we remove the version from the package declaration, then the package name could stay relatively static. Maybe this could resolve the concern of redundancy by @guybedford ?

Interfaces and Worlds

I want to bring up something that hasn't been implemented yet but is relevant to this discussion. I proposed union of worlds a few months ago and I am currently working on implementing the include syntax for worlds.

This proposal extends the path syntax to bring in not only interfaces from files, wit packages and deps, but also worlds. But this also introduces ambiguities. There are two problems:

The proposed use syntax can be declared at the top level of the file to pull "interfaces" into scope, allowing interfaces and worlds to use a shorter path. So it effectively functions as a way to shorten the path when used at the top level. This leads to the question of whether or not it can be used to pull "worlds" into scope. For example, can a top-level declaration of use wasi:keyvalue/world:1.0.1 allow a world to include world?
Since non-top-level use only works with interfaces, not worlds, it may make it less clear of what an id of a wit package is referring to. For example, does wasi:http/[email protected] refer to a world or an interface?

May 05 '23 21:05 Mossaka

Thanks for the feedback!

With respect to versions-in-files, I don't disagree with you. One of the constraints @lukewagner and I were talking about though (sorry forgot to fit this in above) is that if you're presented with a binary component that has two same-ID interface imports that only differ by version then the WIT extraction process needs to present this information somehow. Basically if components allow for differing IDs purely by version then when WIT is extracted the WIT needs to reflect that. We couldn't think of a better idea than putting the version in the WIT file. That being said I very much agree with you that the version is sort of a higher-level concern than parsing. That's why I was trying to keep it as sort of a black-box which is only used for exact string matches at the parsing layer.

For package .. declarations, I'll admit that I don't have much experience with languages that use this pattern. The example I can think of is Go where files in the same directory all start with package foo (or at least so I think). The intention for WIT is that subdirectories are not traversed, ever. There's only one way to organize a package and its files within the same directory. Two packages can't be in the same directory either. Do you know of pitfalls with this sort of strategy though? (another possibility would be to require package ... at the top of all files but only require the version on one of them, or otherwise allow the version to appear on any of them)

As for interfaces and worlds I would expect that, yes, worlds could be imported with all the same syntax. Whether the resolved "thing" is an interface or world I figured would be a resolution-level concern (basically a parse error). Do you think that there's going to be a problem with this though?

May 08 '23 14:05 alexcrichton

Thanks for your reply. Most of what you said makes a ton of sense to me. I need to do more research with respect to the package .. declarations and will get back to you later. But to the third point -

Do you think that there's going to be a problem with this though?

One concern I have is that it introduces a slight inconsistency. For example, assuming that include has been implemented, users can now do

// wit/bar.wit
use foo

interface bar {
  ...
}

This gives users the sense that use as a keyword can be applied to both interfaces and worlds. This is true for a top-level use statement, but it breaks when used inside of a world.

use another-world // ok

world bar {
  use another-world // error
}

Well, we know that the correct way to include a world in world bar is to use include. The main point is that the two uses, one at the top level and one within interfaces and worlds, are two different things with different semantics, but we are calling them by the same name.

Although I am probably suggesting separating these into two different keywords, I don't have a good name to replace the top-level use off the top of my head. import is a popular one, but it's already reserved.

Alternatively, since the include proposal is still WIP, we can think about whether it's feasible and makes sense to just use the use keyword to union worlds.

May 08 '23 18:05 Mossaka

If I understand right, with or without include this is still an issue? With this proposal you could use foo to import an interface at the top level and then within an interface you do use foo.{...}, or in other words a bare use foo is disallowed from within an interface or a world. This means that use, the same keyword, is acting differently in two locations?

One possibility is to scrap top-level use anyway as it's somewhat "just" sugar since the full forms are allowed within interfaces/worlds. I agree it's a bit odd to use the same keyword for different things.

May 08 '23 19:05 alexcrichton

Ok I talked some more with Guy and Luke today, and here's some conclusions we reached:

Strings import names

We decided that one rule of thumb we'll stick to for exports and imports is that there's a "canonical string" by which they can be referred to. This means that while we may add more structured forms or more structured strings there, it will always be possible to have a string key.

This means that, for example, (with "..." and (alias export $C "..." don't need to change, as the strings will still work there. What will change, however, is the import and export statements, namely:

(component
  (import "foo" (func)) ;; today's syntax
  (import (interface "wasi:foo/[email protected]") (instance ...)) ;; proposed syntax
  
  (export (interface "...") ...) ;; proposed syntax
)

Specifically just imports and exports will get the externname changes. Textually they'll be separated with (interface ...) and in the binary format we'll switch to using prefixed payloads:

externname  ::= 0x00 n:<name>  // kebab-name
              | 0x01 n:<name>  // interface

where the syntax for a string accepted for an interface is:

interface ::= ( namespace:<kebab-name> ':' )? package ( '@' version )?
package ::= pkg:<kebab-name> '/' interface:<kebab-name>
version ::= [1-9][0-9]* '.' [0-9]+
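To make the grammar and the prefixed payload concrete, here is a rough Python sketch, not how any existing tool implements it. It assumes a simplified kebab-name (lowercase words only), assumes the name payload is a LEB128 length followed by UTF-8 bytes like other names in the binary format, and uses made-up helper names (parse_interface_id, encode_externname) and a placeholder ID (wasi:http/types).

```python
import re
from typing import NamedTuple, Optional

# Simplified kebab-name: lowercase words joined by '-'. This is an assumption
# for the sketch; the real definition also allows upper-case acronym words.
KEBAB = r"[a-z][a-z0-9]*(?:-[a-z][a-z0-9]*)*"

# interface ::= ( namespace ':' )? pkg '/' interface ( '@' version )?
# version   ::= [1-9][0-9]* '.' [0-9]+
INTERFACE_RE = re.compile(
    rf"(?:(?P<namespace>{KEBAB}):)?"
    rf"(?P<pkg>{KEBAB})/(?P<iface>{KEBAB})"
    r"(?:@(?P<version>[1-9][0-9]*\.[0-9]+))?"
)


class InterfaceId(NamedTuple):
    namespace: Optional[str]
    package: str
    interface: str
    version: Optional[str]


def parse_interface_id(s: str) -> InterfaceId:
    """Split an interface string into its parts, or raise ValueError."""
    m = INTERFACE_RE.fullmatch(s)
    if m is None:
        raise ValueError(f"not a valid interface ID: {s!r}")
    return InterfaceId(m["namespace"], m["pkg"], m["iface"], m["version"])


def leb128_u32(n: int) -> bytes:
    """Unsigned LEB128 encoding of a non-negative integer."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_externname(name: str, is_interface: bool) -> bytes:
    """Prefix byte (0x00 kebab-name, 0x01 interface), then length + UTF-8."""
    payload = name.encode("utf-8")
    prefix = b"\x01" if is_interface else b"\x00"
    return prefix + leb128_u32(len(payload)) + payload
```

With this sketch, a bare "foo" would be encoded with the 0x00 prefix and an ID such as "wasi:http/types" with the 0x01 prefix. Note that the version rule as written rejects a major version of 0.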

Kebab names and IDs in WIT

Given the above binary format, it's not actually possible to have both a kebab-name and an ID for an import or export. This means that this will be disallowed:

world foo {
    import wasi-http: wasi:http/[email protected] // ERROR: both a kebab-name and interface specified
}

however this will still work:

world foo {
    import my-interface: interface { /* ... */ }
}

Basically, kebab-names will only be used for locally defined things rather than for interfaces referenced by ID, as a sort of "one-off developer mode". Automatic injection of dependencies will never insert kebab-names, only IDs.
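As a rough illustration of that rule (a sketch only, with a hypothetical helper name import_key and a placeholder ID wasi:http/types), a resolver could key each world import by exactly one of the two forms and reject the combination:

```python
from typing import Optional, Tuple


def import_key(kebab_name: Optional[str],
               interface_id: Optional[str]) -> Tuple[str, str]:
    """Return (kind, string) for a world import, where kind is
    'kebab-name' or 'interface'. Exactly one of the two must be given."""
    if kebab_name is not None and interface_id is not None:
        # e.g. `import wasi-http: <some ID>` in the example above
        raise ValueError("both a kebab-name and interface specified")
    if interface_id is not None:
        # Import of an interface by ID; no kebab-name is attached.
        return ("interface", interface_id)
    if kebab_name is not None:
        # Locally defined item, e.g. `import my-interface: interface { ... }`
        return ("kebab-name", kebab_name)
    raise ValueError("import needs either a kebab-name or an interface ID")
```

The returned pair lines up with the two externname cases: a kebab-name key for inline definitions and an interface key for everything referenced by ID.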

Ok that was actually a shorter comment than I thought it was going to be! (correct me if I missed anything @lukewagner or @guybedford)

May 09 '23 22:05 alexcrichton

Thanks again for the clear write-up!

One small tweak I'd suggest is that, given that we're saying that "one-off developer mode" is meant to use kebab names, not <interface> names, and thus with <interface> names we're in a registry-ish setting, perhaps the namespace should be non-optional, at least until we have concrete use cases where we definitely don't want one.

The only other thing I'd like to clarify is that, since today name is defined (in the text and binary format) to be an annotated kebab-name:

the interface case of externname would contain an <interface>, not a <name>
with and alias would need to be generalized in-place from <name> to <string> (i.e., a plain UTF-8 string, just like <core:name>) to be able to match either <name> or <interface>

May 10 '23 00:05 lukewagner

Makes sense to me to require namespace for now yeah, we can always relax it later if need be. Otherwise yes you're right on <interface> vs <name> vs <string>, sorry I was a bit sloppy in my notation above :)

May 10 '23 14:05 alexcrichton

How does the namespace play into the directory names under wit/deps? I can't seem to get anything to resolve if the top-level wit file in wit refers to a wit file under deps.

Is it wit/deps/package-name, wit/deps/namespace/package-name, or wit/deps/namespace:package-name? Are namespaces other than wasi supported?

Mar 21 '25 18:03 djoyce-ts

Hey @djoyce-ts, this might be a better question to ask on Zulip. Could you make a thread there and include the tools you're running and what you're seeing? Sometimes the problem is referring to a top-level wit file (e.g. wit/component.wit) instead of the folder wit/ when using certain bindgen/resolution-related tooling.

Mar 22 '25 02:03 vados-cosmonic
