node-capnp

How to parse multiple objects

Open lmammino opened this issue 8 years ago • 5 comments

Is it possible to parse a buffer that contains multiple objects?

After 30 minutes of investigation, I couldn't figure out a way to do it.

lmammino avatar Jan 19 '18 22:01 lmammino

I found a solution eventually. Not sure it's the ideal one, but it seems to be working.

The idea is to keep iterating over the data buffer as long as it holds at least one complete message:

buffer.length >= capnp.expectedSizeFromPrefix(buffer)

On every iteration, the buffer is sliced to drop the first expectedSizeFromPrefix bytes from the beginning.

This is a streaming implementation of this approach.

const { Transform } = require('stream')
const capnp = require('capnp')

class CapnpReaderStream extends Transform {
  constructor (schema, options = {}) {
    // Emit parsed objects; copy the options instead of mutating the caller's object
    super({ ...options, objectMode: true })
    this.schema = schema
    this.buffer = null // bytes received but not yet parsed
  }

  _transform (chunk, encoding, callback) {
    // Append the new chunk to any bytes left over from previous chunks
    if (!this.buffer) {
      this.buffer = chunk
    } else {
      this.buffer = Buffer.concat([this.buffer, chunk])
    }

    // Emit every complete message currently held in the buffer
    let expectedSize = capnp.expectedSizeFromPrefix(this.buffer)
    let data = null
    while (this.buffer.length >= expectedSize) {
      data = capnp.parse(this.schema, this.buffer)
      this.buffer = this.buffer.slice(expectedSize)
      expectedSize = capnp.expectedSizeFromPrefix(this.buffer)
      this.push(data)
    }

    callback()
  }
}

module.exports = CapnpReaderStream

Please @kentonv, let me know if this approach makes sense or if there are better approaches.

lmammino avatar Jan 23 '18 05:01 lmammino
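
For a buffer that is already fully in memory, the same loop can be written without a stream. The following is only a minimal sketch: parseAll is a hypothetical helper name, the capnp module is passed in as an argument so the function can be tried with a stub, and it assumes (as the stream code above does) that expectedSizeFromPrefix returns a size in bytes.

```javascript
// Hypothetical helper, not part of node-capnp.
// `capnp` is expected to provide expectedSizeFromPrefix() and parse(),
// used the same way as in the stream implementation above.
function parseAll (capnp, schema, buffer) {
  const messages = []
  let rest = buffer
  while (rest.length > 0) {
    // Assumption: the returned size is in bytes
    const expectedSize = capnp.expectedSizeFromPrefix(rest)
    // Stop on an incomplete trailing message (or a nonsensical size)
    if (expectedSize <= 0 || rest.length < expectedSize) break
    messages.push(capnp.parse(schema, rest))
    rest = rest.subarray(expectedSize)
  }
  // `rest` holds any bytes that did not form a complete message
  return { messages, rest }
}
```

A caller would pass the real module, for example parseAll(require('capnp'), schema, buffer), and check whether rest is empty to detect a truncated input.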

Turned this into an npm module https://github.com/lmammino/capnp-stream/

I will probably add a writable equivalent.

Looking forward to @kentonv's feedback.

lmammino avatar Jan 23 '18 17:01 lmammino

Hmm, I guess that's probably the best way to do it given the current interface.

In C++ you would read a stream by constructing a new InputStreamMessageReader or StreamFdMessageReader for each message. Or, FlatArrayMessageReader has a method to get the rest of the buffer after the message end. But these aren't exposed in node-capnp as far as I recall.

I'd be happy to accept a PR implementing a new variant of parse() that also returns the endpoint of the message, or returns the slice of the buffer after the message end.

kentonv avatar Jan 24 '18 00:01 kentonv

Thanks, @kentonv. This approach doesn't seem to work when using the packed option. The expectedSizeFromPrefix call seems to hang forever. Any advice on how to solve this?

Regarding the PR, unfortunately, I don't have experience in porting C/C++ modules to Node.js.

lmammino avatar Jan 24 '18 01:01 lmammino

expectedSizeFromPrefix should never hang -- that seems like a bug. However, it definitely won't work on packed messages. Making a version of this that works for packed messages would require new C++ code that unfortunately will be a bit hairy to implement. Feel free to file an issue but I'm not sure when I'll be able to work on it... :/

kentonv avatar Jan 24 '18 01:01 kentonv
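
For background on why a size can be derived from the prefix of an unpacked message: the standard Cap'n Proto stream framing begins with a little-endian segment table (segment count minus one, then each segment's size in 8-byte words, padded to a word boundary). The sketch below computes the total byte length from that table. It is an illustration of the documented framing, not node-capnp's implementation, and unpackedMessageSize is a hypothetical name. With packed encoding the bytes on the wire are compressed, so, as far as I understand it, this calculation no longer tells you where the message ends in the input.

```javascript
// Hypothetical helper: total size in bytes of one unpacked, stream-framed
// Cap'n Proto message, or null if the prefix is too short to decide.
function unpackedMessageSize (prefix) {
  if (prefix.length < 4) return null
  // First 4 bytes: number of segments minus one (little-endian)
  const segmentCount = prefix.readUInt32LE(0) + 1
  const tableBytes = 4 + 4 * segmentCount
  if (prefix.length < tableBytes) return null
  let words = 0
  for (let i = 0; i < segmentCount; i++) {
    // Each entry is one segment's size, counted in 8-byte words
    words += prefix.readUInt32LE(4 + 4 * i)
  }
  // The segment table is padded up to the next 8-byte boundary
  const headerBytes = Math.ceil(tableBytes / 8) * 8
  return headerBytes + words * 8
}
```

Returning null for a short prefix keeps the "need more data" case distinct from a real size, which is the situation a streaming reader has to handle between chunks.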