webassemblyjs icon indicating copy to clipboard operation
webassemblyjs copied to clipboard

wasm-edit -- Valid edits can throw with default decode options.

Open TimHambourger opened this issue 7 years ago • 5 comments

Description

Because of the assertNotIdentifier calls in wasm-gen/lib/encoder/index.js, editing, say, a module export name can throw with the default decode options.

Steps to repro

Start with input WASM corresponding to the WAT

(module
  (func $add (param $lhs i32) (param $rhs i32) (result i32)
    get_local $lhs
    get_local $rhs
    i32.add)
  (export "add" (func $add))
)

(from MDN's examples).

Then edit the export name like

import { edit } from '@webassemblyjs/wasm-edit';

const bin; // ArrayBuffer containing .wasm contents

edit(bin, {
    ModuleExport(path) {
        path.node.name += '!!';
    }
});

This throws with "Unsupported node Identifier" due to the assertion at

function encodeModuleExport(n) {
  var out = [];
  assertNotIdentifierNode(n.descr.id);
  // etc.
}

If you instead decode with ignoreCustomNameSection: true then the edit succeeds:

import { decode } from '@webassemblyjs/wasm-parser';
import { editWithAST } from '@webassemblyjs/wasm-edit';

const bin; // ArrayBuffer containing .wasm contents

const ast = decode(bin, {
    ignoreCustomNameSection: true
});

editWithAST(ast, bin, {
    ModuleExport(path) {
        path.node.name += '!!';
    }
});

At the relevant assertion, n.node.descr now has type 'NumberLiteral' instead of 'Identifier'.

Expected behavior

That valid edits succeed with default decode options. The current behavior feels restrictive.

Other Cases?

This behavior with export names seems to repro reliably across input .wasm files. I also found a possible repro with import names, but that one only affected some of the files I tested and might be due to a bug with decode. I'll file that separately.

TimHambourger avatar Jul 03 '18 04:07 TimHambourger

Thanks for the detailed report. As you've no doubt worked out this issue is due to the presence of a custom name section within your wasm binary.

The binary encoded form of wasm doesn't retain function / local names, instead these are all referenced by index. When generating a wasm binary you can optionally supply this information in the form of a custom section. With wat2wasm this is done via the debug-names option:

wat2wasm test.wat -o test.wasm --debug-names

The default behaviour of @webassemblyjs/wasm-parser is to read this custom name section (if present) and restore the names in within the AST.

Unfortunately the @webassemblyjs/wasm-gen doesn't support the 'inverse' of this.

The wasm-gen module should probably be updated to do the following:

  1. If functions / locals within the AST are 'named', convert these to indices and write the wasm binary - this would solve your immediate problem.
  2. Optionally (although perhaps by default to mirror wasm-parse) write these to a custom name section

I might give this a go.

ColinEberhardt avatar Jul 03 '18 08:07 ColinEberhardt

Hi @TimHambourger, thanks for your bug reports (and the great test suite, do you mind if we use it?).

This issue is due to a design bug. Here's the definition of a ModuleExport: https://github.com/xtuc/webassemblyjs/blob/master/packages/ast/src/types/nodes.js#L289-L301, where name is the exported identifier and descr.id is the reference to the exported element.

  • wasm-gen asserts that descr.id is an index because that's mapping the wasm format.
  • wasm-parse can resolve names and will replace descr.id by the internal name.
  • optionally, for wast-printer, it's more readable if descr.id is an identifier.

Here are the fixes I have in mind:

  • wasm-gen: while it's not really its scope to resolve names we could use https://github.com/xtuc/webassemblyjs/tree/master/packages/helper-module-context which should be able to resolve them to their indices (i'm :-1: on this) (see https://github.com/xtuc/webassemblyjs/blob/master/packages/helper-module-context/src/index.js#L164).
  • wasm-parser: as we saw before, both (index and identifier) are important informations. I would suggest to simply add both to your AST. (i'm :+1: on this).

xtuc avatar Jul 04 '18 08:07 xtuc

wasm-parser: as we saw before, both (index and identifier) are important informations. I would suggest to simply add both to your AST. (i'm 👍 on this).

Just to clarify, it sounds like you are proposing that the transformation that uses the custom name section to update index-based references to named references should not be destructive?

i.e. remove Identifier from the union type here https://github.com/xtuc/webassemblyjs/blob/master/packages/ast/src/types/basic.js#L37 and instead make it an optional property on relevant nodes so that the index is retained?

That gets a 👍 from me!

ColinEberhardt avatar Jul 04 '18 18:07 ColinEberhardt

Yes, that's right, for example a ModuleExport would have an index and optionnally a name. We can then use the information we need depending on what we do.

xtuc avatar Jul 04 '18 19:07 xtuc

Just to clarify, it sounds like you are proposing that the transformation that uses the custom name section to update index-based references to named references should not be destructive? ... That gets a 👍 from me!

Agreed, this sounds like a great approach! @xtuc, yes, feel free to use any of my tests as you see fit.

TimHambourger avatar Jul 07 '18 05:07 TimHambourger