Proposed implementation of WASI libc filesystem on Windows
Motivation
We would like to have a working implementation of all the WASI libc filesystem functions on Windows. This is currently not possible due to the use of various POSIX file functions which are not available on Windows or are not fully POSIX compliant.
Requirements
The solution outlined in this proposal is not necessarily specific to POSIX filesystem functions (it could be equally applied to the POSIX time functions for example) but we will only concern ourselves with filesystem functions. The solution can be easily extended to other POSIX functions in the future.
Proposal
The core issue is that the WASI libc implementation assumes the underlying platform is a POSIX platform. The codebase already has a method of wrapping platform-specific APIs: a ‘wrapper’ function is declared in an interface header file and then implemented on a per-platform basis. For example, os_mutex_lock is declared in core/shared/platform/include/platform_api_vmcore.h and is implemented using platform-specific APIs for each platform under core/shared/platform (see Windows implementation and POSIX implementation). This approach avoids scattering #ifdefs through the code wherever a platform-specific API is needed.
The proposal here is to adopt the same approach for the POSIX filesystem functions: wrap them in platform-agnostic interface functions which are implemented on a per-platform basis. The WASI libc implementation will then no longer reference any POSIX file functions directly, but rather wrapper functions which call the underlying POSIX function on POSIX platforms and the equivalent Windows APIs on Windows. We are only interested in Windows here, but this solution applies to any non-POSIX-compliant platform. The benefit is that the majority of the WASI libc implementation can stay the same, since most of the code is already platform-agnostic.
This approach is very similar to that taken by WasmEdge, which has both Windows and Linux classes to wrap any platform-specific details: Windows, Linux. The implementation is in C++ but it will still be a useful reference when implementing POSIX file functionality for Windows (the Linux implementation mostly passes parameters through to the underlying POSIX functions, as expected).
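As a rough illustration of the wrapper pattern described above, the sketch below declares one interface function and gives it a POSIX and a Windows body. The name os_fsync, the os_file type and the error mapping are assumptions made for this example, and the WASI error type is a simplified stand-in. In the real tree each body would live in its own platform directory, so the #if here only exists to keep the sketch in one file.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>

/* Simplified stand-ins for the WASI error type and a few of its values. */
typedef uint16_t __wasi_errno_t;
#define __WASI_ESUCCESS 0
#define __WASI_EBADF 8
#define __WASI_EIO 29

#if defined(_WIN32)
#include <windows.h>
typedef HANDLE os_file; /* hypothetical handle type for this sketch */
#else
#include <unistd.h>
typedef int os_file; /* hypothetical handle type for this sketch */
#endif

/* Interface: this declaration is all that platform-agnostic code sees. */
__wasi_errno_t os_fsync(os_file file);

#if defined(_WIN32)
/* Windows body: FlushFileBuffers is the closest equivalent of fsync.
 * A real port would translate GetLastError() more precisely. */
__wasi_errno_t os_fsync(os_file file)
{
    return FlushFileBuffers(file) ? __WASI_ESUCCESS : __WASI_EIO;
}
#else
/* POSIX body: pass straight through and translate errno. */
__wasi_errno_t os_fsync(os_file file)
{
    if (fsync(file) == 0)
        return __WASI_ESUCCESS;
    return errno == EBADF ? __WASI_EBADF : __WASI_EIO;
}
#endif
```

The caller only ever sees os_fsync and a WASI error code, so it compiles unchanged on either platform.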
Implementation
Wrapper Function Naming
To make it easy to understand what a wrapper function does without digging into a specific implementation, we can name them according to their equivalent POSIX function but with an additional os_ prefix. e.g. preadv would become os_preadv.
Abstracting Platform Handles
Aside from relying on the existence of various POSIX functions/data structures, we also assume that a platform handle (a.k.a. file descriptor on POSIX) is of an int type here which will not be true in general. In order to support Windows and any other non-POSIX platforms, we will need to define this type on a per platform basis.
Instead of int, we can use a type os_handle, which would be typedef’ed to int on POSIX. On Windows, we would define it as follows:
typedef struct windows_handle {
    HANDLE handle;
    __wasi_fdflags_t fdflags;
} os_handle;
We need to store the fd flags separately from the raw Windows HANDLE since there is no way of retrieving/storing these flags on a Windows HANDLE natively.
The POSIX wrapper functions would then accept/return this os_handle type where appropriate.
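To make the per-platform handle type concrete, here is a minimal sketch of both definitions side by side, with one wrapper that accepts the handle. The os_close function and its error mapping are assumptions for illustration only, and the WASI types are simplified stand-ins rather than the real header definitions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>

/* Simplified stand-ins for the WASI types used by the handle. */
typedef uint16_t __wasi_errno_t;
typedef uint16_t __wasi_fdflags_t;
#define __WASI_ESUCCESS 0
#define __WASI_EBADF 8
#define __WASI_EIO 29

#if defined(_WIN32)
#include <windows.h>
/* Windows: the raw HANDLE plus the fd flags it cannot store itself. */
typedef struct windows_handle {
    HANDLE handle;
    __wasi_fdflags_t fdflags;
} os_handle;
#else
#include <unistd.h>
/* POSIX: the handle is just the file descriptor. */
typedef int os_handle;
#endif

/* Hypothetical wrapper: callers pass an os_handle and never touch
 * the platform representation directly. */
__wasi_errno_t os_close(os_handle handle)
{
#if defined(_WIN32)
    return CloseHandle(handle.handle) ? __WASI_ESUCCESS : __WASI_EBADF;
#else
    if (close(handle) == 0)
        return __WASI_ESUCCESS;
    return errno == EBADF ? __WASI_EBADF : __WASI_EIO;
#endif
}
```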
Wrapper Function Signature
We can keep the signature of each wrapper function the same as the equivalent POSIX function, with the exception of the following changes:
- Explicitly return a WASI error (__wasi_errno_t) from each function rather than using errno. errno is not set by Windows APIs and while we could set it ourselves, it is not a very clean way to handle errors and its usage as far as possible should be restricted to the POSIX implementation. Where the original POSIX function returned a value (that is not already an error code), we can instead use an additional out parameter. Each platform will then be responsible for converting their native error code into the appropriate WASI error code.
- Where the equivalent POSIX function would expect a data structure/type defined in the POSIX specification, we would instead use the corresponding WASI data structure, which is necessarily platform-agnostic. e.g.
  - ssize_t writev(int fildes, const struct iovec *iov, int iovcnt); would be wrapped in the following function: __wasi_errno_t os_writev(os_handle fildes, const struct __wasi_iovec_t *iov, int iovcnt, ssize_t *bytes_written);
  - This would include POSIX flags e.g. O_DIRECTORY, O_APPEND. For all the functions we need to implement, there are corresponding WASI flags e.g. __WASI_O_DIRECTORY, __WASI_FDFLAG_APPEND.
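The signature rules above might look like this for the POSIX side of os_writev. This is only a sketch: the WASI types are simplified stand-ins, and convert_errno and SKETCH_MAX_IOV are hypothetical names introduced here (a real implementation would map far more error codes and would not cap the vector count this way).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Simplified stand-ins for the WASI types and a few error values. */
typedef uint16_t __wasi_errno_t;
#define __WASI_ESUCCESS 0
#define __WASI_EBADF 8
#define __WASI_EINVAL 28
#define __WASI_EIO 29

typedef struct __wasi_iovec_t {
    void *buf;
    size_t buf_len;
} __wasi_iovec_t;

typedef int os_handle; /* POSIX definition of the platform handle */

#define SKETCH_MAX_IOV 16 /* hypothetical cap; keeps the sketch allocation-free */

/* Hypothetical helper: translate a native errno into a WASI error. */
static __wasi_errno_t convert_errno(int error)
{
    switch (error) {
        case EBADF:
            return __WASI_EBADF;
        case EINVAL:
            return __WASI_EINVAL;
        default:
            return __WASI_EIO;
    }
}

__wasi_errno_t os_writev(os_handle fildes, const struct __wasi_iovec_t *iov,
                         int iovcnt, ssize_t *bytes_written)
{
    struct iovec native[SKETCH_MAX_IOV];

    if (iovcnt < 0 || iovcnt > SKETCH_MAX_IOV)
        return __WASI_EINVAL;

    /* Copy the WASI vectors into the native POSIX representation. */
    for (int i = 0; i < iovcnt; i++) {
        native[i].iov_base = iov[i].buf;
        native[i].iov_len = iov[i].buf_len;
    }

    ssize_t n = writev(fildes, native, iovcnt);
    if (n < 0)
        return convert_errno(errno);

    /* The byte count travels through the out parameter, not the return value. */
    *bytes_written = n;
    return __WASI_ESUCCESS;
}
```

The return value carries only the error, so the caller never needs to look at errno.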
Wrapper Function Declaration
The most suitable place to declare these functions seems to be core/shared/platform/include/platform_api_extension.h.
Standard I/O Streams
In addition to wrapping the standard POSIX filesystem functions, we will need a way to obtain handles to stdin, stdout and stderr (if they are available at all on the platform). Currently, we default these to the values 0, 1 and 2 respectively which would only be correct on POSIX systems. Therefore, we can introduce an additional wrapper function
- os_handle os_get_std_handle(os_std_handle std_handle);
together with constants to represent the stdin/stdout/stderr devices:
- OS_STD_OUTPUT_HANDLE
- OS_STD_INPUT_HANDLE
- OS_STD_ERROR_HANDLE
defining them as appropriate on POSIX and Windows.
N.B. on POSIX, os_get_std_handle would just pass through the std handle without modification.
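One possible shape for this function on both platforms is sketched below. The choice of int and DWORD for os_std_handle and the mapping of the constants onto STDIN_FILENO and STD_INPUT_HANDLE (and their siblings) are assumptions, not something the proposal fixes; the fdflags type is a simplified stand-in.

```c
#if defined(_WIN32)
#include <stdint.h>
#include <windows.h>

typedef uint16_t __wasi_fdflags_t; /* simplified stand-in */
typedef struct windows_handle {
    HANDLE handle;
    __wasi_fdflags_t fdflags;
} os_handle;

/* Assumption: reuse the Win32 identifiers accepted by GetStdHandle. */
typedef DWORD os_std_handle;
#define OS_STD_INPUT_HANDLE STD_INPUT_HANDLE
#define OS_STD_OUTPUT_HANDLE STD_OUTPUT_HANDLE
#define OS_STD_ERROR_HANDLE STD_ERROR_HANDLE

os_handle os_get_std_handle(os_std_handle std_handle)
{
    /* GetStdHandle may return NULL or INVALID_HANDLE_VALUE when the
     * stream is unavailable; callers would need to check for that. */
    os_handle result = { GetStdHandle(std_handle), 0 };
    return result;
}
#else
#include <unistd.h>

typedef int os_handle;

/* Assumption: on POSIX the constants are the descriptors themselves. */
typedef int os_std_handle;
#define OS_STD_INPUT_HANDLE STDIN_FILENO
#define OS_STD_OUTPUT_HANDLE STDOUT_FILENO
#define OS_STD_ERROR_HANDLE STDERR_FILENO

os_handle os_get_std_handle(os_std_handle std_handle)
{
    /* Pass the descriptor through without modification. */
    return std_handle;
}
#endif
```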
Directory Handles
We rely on the existence of POSIX directory handles (DIR*) here. Similar to Abstracting Platform Handles, we will need to define this directory handle type as os_directory_handle on a per-platform basis. On POSIX, it would be defined as DIR whereas on Windows, we will need to emulate DIR. We can use this port of dirent.h to Windows as a reference for the implementation of the specific structs and directory functions.
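For the POSIX side, the directory handle abstraction could be sketched roughly as follows. It assumes os_directory_handle is a pointer to DIR, and the wrapper names os_fdopendir, os_readdir and os_closedir, the convert_errno helper and the count_entries usage function are all hypothetical; the WASI error type is a simplified stand-in.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Simplified stand-ins for the WASI error type and a few of its values. */
typedef uint16_t __wasi_errno_t;
#define __WASI_ESUCCESS 0
#define __WASI_EBADF 8
#define __WASI_EIO 29
#define __WASI_ENOTDIR 54

/* POSIX definitions; a Windows port would supply its own emulated type. */
typedef int os_handle;
typedef DIR *os_directory_handle; /* assumption: a pointer to DIR */

/* Hypothetical helper: translate a native errno into a WASI error. */
static __wasi_errno_t convert_errno(int error)
{
    switch (error) {
        case EBADF:
            return __WASI_EBADF;
        case ENOTDIR:
            return __WASI_ENOTDIR;
        default:
            return __WASI_EIO;
    }
}

/* On success the directory stream takes ownership of the handle. */
__wasi_errno_t os_fdopendir(os_handle handle, os_directory_handle *out_dir)
{
    DIR *dir = fdopendir(handle);
    if (dir == NULL)
        return convert_errno(errno);
    *out_dir = dir;
    return __WASI_ESUCCESS;
}

/* *out_name is set to NULL once the end of the directory is reached. */
__wasi_errno_t os_readdir(os_directory_handle dir, const char **out_name)
{
    errno = 0; /* readdir signals errors only through errno */
    struct dirent *entry = readdir(dir);
    if (entry == NULL && errno != 0)
        return convert_errno(errno);
    *out_name = entry != NULL ? entry->d_name : NULL;
    return __WASI_ESUCCESS;
}

__wasi_errno_t os_closedir(os_directory_handle dir)
{
    return closedir(dir) == 0 ? __WASI_ESUCCESS : convert_errno(errno);
}

/* Usage: count the entries of a directory, or return -1 if it cannot be opened. */
int count_entries(const char *path)
{
    os_handle fd = open(path, O_RDONLY | O_DIRECTORY);
    os_directory_handle dir;
    const char *name;
    int count = 0;

    if (fd < 0)
        return -1;
    if (os_fdopendir(fd, &dir) != __WASI_ESUCCESS) {
        close(fd);
        return -1;
    }
    while (os_readdir(dir, &name) == __WASI_ESUCCESS && name != NULL)
        count++;
    os_closedir(dir);
    return count;
}
```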
Wrapping fcntl
fcntl is a slightly special case since its behavior is very generic depending on the command provided. However, we only use it with the commands F_GETFL and F_SETFL to get/set flags on a file descriptor/handle. To avoid implementing a lot of unnecessary functionality, it would be simpler to replace its usage with two wrapper functions
- __wasi_errno_t os_handle_get_fdflags(os_handle handle, __wasi_fdflags_t *out_flags);
- __wasi_errno_t os_handle_set_fdflags(os_handle handle, __wasi_fdflags_t flags);
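On POSIX, these two functions could be thin layers over fcntl, along the lines of the sketch below. The WASI types and flag values are simplified stand-ins, and only the append and non-blocking flags are translated to keep the example short. On Windows the same two functions would presumably just read and write the fdflags field stored in os_handle.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the WASI types and values used here. */
typedef uint16_t __wasi_errno_t;
typedef uint16_t __wasi_fdflags_t;
#define __WASI_ESUCCESS 0
#define __WASI_EBADF 8
#define __WASI_EINVAL 28
#define __WASI_FDFLAG_APPEND 0x0001
#define __WASI_FDFLAG_NONBLOCK 0x0004

typedef int os_handle; /* POSIX definition of the platform handle */

__wasi_errno_t os_handle_get_fdflags(os_handle handle, __wasi_fdflags_t *out_flags)
{
    assert(out_flags != NULL);

    int native = fcntl(handle, F_GETFL);
    if (native < 0)
        return errno == EBADF ? __WASI_EBADF : __WASI_EINVAL;

    /* Translate the native status flags into WASI fd flags. */
    *out_flags = 0;
    if (native & O_APPEND)
        *out_flags |= __WASI_FDFLAG_APPEND;
    if (native & O_NONBLOCK)
        *out_flags |= __WASI_FDFLAG_NONBLOCK;
    return __WASI_ESUCCESS;
}

__wasi_errno_t os_handle_set_fdflags(os_handle handle, __wasi_fdflags_t flags)
{
    /* Translate the WASI fd flags into native status flags. */
    int native = 0;
    if (flags & __WASI_FDFLAG_APPEND)
        native |= O_APPEND;
    if (flags & __WASI_FDFLAG_NONBLOCK)
        native |= O_NONBLOCK;

    if (fcntl(handle, F_SETFL, native) < 0)
        return errno == EBADF ? __WASI_EBADF : __WASI_EINVAL;
    return __WASI_ESUCCESS;
}
```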
Testing
wasi-testsuite already contains many tests for the available filesystem functions but there are some gaps due to not having migrated over all the tests from the separate runtimes. To fill the gaps, we can move over the missing test cases from the previous wasmtime tests. See WASI filesystem functions test status for the test status of each filesystem function in wasi-testsuite and wasmtime. fd_fdstat_set_rights is the only function which is not tested at all in either the wasmtime or wasi-testsuite tests, so we will need to write a test for this function separately and add it to wasi-testsuite.
Currently the WASI tests are not run on Windows in CI but work is in progress already to enable them.
Alternatives
Implementing the POSIX interface directly
Instead of wrapping the POSIX interface, we could implement it directly. This would mean almost no changes to the core business logic and would avoid some boilerplate in the POSIX implementation of the wrapper functions (which will mostly pass through various parameters without modification). However, there are a few disadvantages:
- It will add some boilerplate to non-POSIX platforms since we will need to implement POSIX types (e.g. iovec).
- We would need to take care on platforms which support a subset of POSIX functionality (like Windows) to ensure some degree of compatibility between our own POSIX implementation and the POSIX functions provided by the native platform.
- We have less freedom to implement a simpler interface. Using wrapper functions allows us to simplify the POSIX interface where necessary since often we wouldn’t need the generality of the original function. e.g. see Wrapping fcntl. If we directly implement the POSIX interface, we could either
- a) implement everything properly according to the specification and add unnecessary code/complexity
- b) not implement it according to the specification and potentially cause confusion to developers when the function does not behave according to POSIX standards.
Instead of directly implementing all of the POSIX interface, we could also implement just [dirent.h](https://pubs.opengroup.org/onlinepubs/7908799/xsh/dirent.h.html) by itself. This is definitely possible as evidenced by this Windows port. However, it probably could not be a standalone implementation since [fdopendir](https://pubs.opengroup.org/onlinepubs/9699919799/functions/fdopendir.html) at least would need to interact with our internally defined platform handle types.
Abstracting platform-specific APIs at the WASI interface level
At the other end of the spectrum, we could abstract the use of some POSIX functions away at a higher level, by implementing some of the WASI functions themselves (declared here) per platform. The main advantage of this approach would be less complexity since it involves one less ‘layer’. For example, the WASI function fd_sync is itself a thin wrapper on top of [fsync](https://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html). The issue with this approach is that all of these WASI functions need to look up the host handle from the WASI fd as a minimum before invoking the POSIX function. Therefore we would need to:
- a) duplicate this business logic across platforms which is definitely not ideal since it is platform-agnostic anyway.
- b) somehow extract this piece of logic out of the WASI functions but it would involve a more significant/disruptive refactor of the codebase - if it is possible at all.
Tasks
- [x] Fill WASI testing gaps
- [x] Abstract POSIX functions/types from WASI libc implementation
- [x] Implement POSIX filesystem functions on Windows
Appendix
Currently used POSIX filesystem functions
WASI filesystem functions test status
| Function | WASI test suite status | wasmtime tests status | Action |
|---|---|---|---|
| fd_advise | E | E | D |
| fd_allocate | I | E | C |
| fd_close | I | I | C |
| fd_datasync | N | I | C |
| fd_fdstat_get | I | I | C |
| fd_fdstat_set_flags | E | E | D |
| fd_fdstat_set_rights | N | N | W |
| fd_filestat_get | E | E | D |
| fd_filestat_set_size | N | E | C |
| fd_filestat_set_times | N | E | C |
| fd_pread | N | E | C |
| fd_prestat_dir_name | I | I | C |
| fd_prestat_get | I | I | C |
| fd_pwrite | N | E | C |
| fd_read | I | E | C |
| fd_readdir | E | E | D |
| fd_renumber | N | E | C |
| fd_seek | E | E | C |
| fd_sync | N | I | C |
| fd_tell | N | E | C |
| fd_write | E | E | C |
| path_create_directory | I | I | C |
| path_filestat_get | I | E | C |
| path_filestat_set_times | N | E | C |
| path_link | N | E | C |
| path_open | I | E | C |
| path_readlink | N | E | C |
| path_remove_directory | I | I | C |
| path_rename | N | E | C |
| path_symlink | I | E | C |
| path_unlink_file | I | I | C |
Test status:
- I = indirectly tested i.e. function is used in other tests but there is no explicit test for that function
- E = explicitly tested, there are one or more tests dedicated to testing that function
- N = not tested at all
Action:
- C = copy relevant tests from wasmtime to WASI test suite
- W = write new test to fill testing gap
- D = testing is sufficient (whether indirect/explicit) so no action is required
> The issue with this approach is that all of these WASI functions need to look up the host handle from the WASI fd as a minimum before invoking the POSIX function.

it's necessary for other approaches too, isn't it?

> it's necessary for other approaches too, isn't it?

Yes, we always have to look up the host handle from the WASI fd, regardless of the approach. But if we implement the WASI functions for each platform, this logic would be duplicated for each platform since it is platform-agnostic.
All the work has been merged to main. The only WASI function that remains to be implemented on Windows is poll_oneoff.