component-model icon indicating copy to clipboard operation
component-model copied to clipboard

Recursive Types

Open theduke opened this issue 3 years ago • 13 comments

There was some discussion on recursive types in https://github.com/WebAssembly/interface-types/issues/137.

My takeaway was that recursion would be hard to specify with adapter functions.

With adapters punted to a post-MVP phase , would it be sensible to re-consider allowing recursion?

There were arguments for and against in the linked issue, but I'd like to point out again that having to fall back to resources to represent nested data structures is quite a big limitation and would require falling back to other serialization formats like JSON/msgpack/Protobuf/... quite often to get decent performance.

theduke avatar Jun 28 '22 09:06 theduke

Adapter functions were extra-extra hard, but, even just talking through the implementation of canonical adapters (especially with the coercive subtyping), I believe that recursive types would make the implementation a lot more complicated. I do actually expect that we'll want recursive types before too long for the reasons you're describing; it's just not yet clear that recursive types are necessary for viability in the MVP sense. That being said, I'm happy to keep this issue open and collect feedback over the course of the proposal's standardization; maybe it ends up needing to be a carefully-considered late addition.

lukewagner avatar Jun 28 '22 17:06 lukewagner

I'd like to point out again that having to fall back to resources to represent nested data structures is quite a big limitation and would require falling back to other serialization formats like JSON/msgpack/Protobuf/... quite often to get decent performance.

It's probably worth mentioning that there's no way to represent trees without recursive types, and trees are pretty much ubiquitous in programming. For example, imagine a DOM or AST.

My fear is that if we release the MVP without a solid plan for dealing with recursive types, it'll become much harder to implement them later on.

Michael-F-Bryan avatar Oct 26 '22 15:10 Michael-F-Bryan

You're that these datatypes are ubiquitous, but the two examples you gave sound like examples where we wouldn't want to pass the entire tree as an eagerly-copied value in a function parameter or result. For the DOM we'd probably want to use resource types/handles in the roughly same manner that the browser/JS do, and for an AST, it seems like you'd want various iteration/cursor/streaming/etc schemes, that depended on the data structure and intended operations.

lukewagner avatar Oct 28 '22 17:10 lukewagner

Are there any concerns around Denial of Service with passing recursive types across a boundary? It seems like you could spend a lot of time in the adapter (or even infinite if something malicious happens?) that you'd want to bound in some way.

esoterra avatar May 09 '23 14:05 esoterra

Great question! Because even though the abstract values produced by lifting a recursive type are acyclic (trees), the hazard is if the lifted core memory somehow encodes a cycle/iloop (which should trap of course; but the question is how that trap is specified to happen and implemented). I suppose one option is for the Canonical ABI to say that recursive-type values are laid out contiguously in linear memory (including lists involved in the recursion), but I can imagine several other alternatives, so this is something that would have to be thought about more.

Another hazard is if the implementation used stack-based recursion to implement the fused lift+lower and stack-overflowed; we'd need to either fix a maximum dept (probably bad idea) or mandate that implementations needed to use iteration (which is more complicated, but seems doable).

lukewagner avatar May 09 '23 23:05 lukewagner

The flattened representation of a recursive type also doesn't inherently have a fixed type or size unless the recursion is through a list.

e.g. in this simple tree type, which has a flattened size dependent on the depth of the tree (in Rust terms it's !Sized)

variant node {
   branch(branch-node),
   leaf(i32)
}

record branch-node {
   left: node,
   right: node
}

It seems like the Canonical ABI would need to choose where to insert pointer-indirection to make it flattenable. Either at the top making the whole thing flatten into a pointer into memory or at branch-node's left and right fields allowing more to be flattened.

esoterra avatar May 10 '23 14:05 esoterra

I've run into this issue of not supporting recursive types and reading through the comments above it is not clear to me if there is a way to work around this. If anyone has some links to examples/documentation where this is explained please let me know.

For example, we have the following type:

  variant value-pattern {                                                       
    null,                                                                       
    %string(string),                                                            
    integer(s64),                                                               
    decimal(float64),                                                           
    boolean(bool),                                                              
    %list(list<value-pattern>),                                              
    octets(list<u8>),                                                           
  }

This above is intended to mimic the following Rust enum:

  pub enum ValuePattern {                                                            
      Null,                                                                          
      String(Arc<str>),                                                              
      Integer(i64),                                                                  
      Decimal(f64),                                                                  
      Boolean(bool),                                                                 
      List(Vec<Arc<ValuePattern>>),                                                  
      Octets(Vec<u8>),                                                               
  } 

This is just one example of a number of places in our code base where we have similar constructs.

It would be really nice to have support for recursive types for the MVP. I realize that I don't have a complete picture of the difficulties of implementing this but I want to express our wish to have this supported.

danbev avatar May 22 '23 14:05 danbev

You can't represent your ValuePattern directly in WIT because there are no recursive types. There are workarounds using indices into a list as indirection which I've used for trees, and you can also serialize into bytes or strings, but neither solution is very ergonomic.

Here's an example of the index-indirection pattern:

variant value-item {                                                       
   null,
   %string(string),
   integer(s64),
   decimal(float64),
   boolean(bool),
   %list(list<u32>), // list of indices of values
   octets(list<u8>),
}

type value-tree = list<value-item>;

Additionally, while recursive types are useful for some things, in many cases where you could send a JSON-like recursive value across the component boundary the actual data isn't even recursively typed if modeled concretely and it can be represented using records and lists instead (e.g. it represents a list of customers as opposed to a parse tree). On the other hand, if your structure is very large, you may want to represent it as a resource type (which are arriving pre-MVP) anyway so you aren't copying lots of data between components frequently and instead pass a handle to it around.

So while this is something we need eventually, there are workarounds and alternate options available and it's very complicated & time consuming to implement for reasons like those mentioned in this issue, so it won't be available for quite some time.

esoterra avatar May 22 '23 15:05 esoterra

@Kylebrown9 Thank you for the explanation and the example! I'll take a shot at implementing it this way.

danbev avatar May 22 '23 16:05 danbev