Suggested delimiter for use in streams

Open aaronpeterson opened this issue 7 years ago • 1 comments

What would be the recommendation for specifying a delimiter when writing one binary object to a file at a time for the purpose of later processing as a node.js stream?

Can I simply write the binary data with fs.writeFile using the 'binary' encoding, followed by a trailing require('os').EOL? The end goal is to read the file as a stream and decode each object in a transform, using something like maxogden/binary-split or myndzi/binary-split-streams2.

aaronpeterson avatar Apr 19 '18 08:04 aaronpeterson

Hi @aaronpeterson ,

In general, no byte sequence is safe to use as a delimiter, since any arbitrary byte sequence (including the bytes of os.EOL) can appear in the middle of an encoded value. For example:

let Type = require('js-binary').Type,
    schema = new Type('Buffer')
schema.encode(Buffer.from('0123456789', 'hex')) // <Buffer 05 01 23 45 67 89>

If you want to write multiple values into a file, I can give you 3 valid approaches:

  1. Wrap each encoded buffer in a frame, prefixing it with its size, so that you can read the values back one at a time. Example sync code:
const fs = require('fs')

// `fd` is a file descriptor obtained from fs.openSync()
function writeFrame(fd, buffer) {
    let header = Buffer.alloc(4)
    header.writeUInt32BE(buffer.length, 0)
    fs.writeSync(fd, Buffer.concat([header, buffer]))
}

// Reads the next frame from the current file position
function readFrame(fd) {
    let header = Buffer.alloc(4)
    fs.readSync(fd, header, 0, 4, null)
    let length = header.readUInt32BE(0)
    let buffer = Buffer.alloc(length)
    fs.readSync(fd, buffer, 0, length, null)
    return buffer
}
  2. Choose one byte (or byte sequence) as the delimiter, but escape any matches that happen inside the encoded binary on write, and unescape them on read.

  3. Do not use any delimiter and rely on the lower-level Type#read() method. This works because Type#read() only reads up to the end of one record, so you can call it multiple times to get multiple values:

// Suppose `buffer` holds one or more complete encoded values
// and `schema` is a Type instance
let ReadState = require('js-binary').ReadState,
    rs = new ReadState(buffer)

while (!rs.hasEnded()) {
    console.log(schema.read(rs))
}
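
Since the question was about streams, here is a rough sketch of how approach 1 could be adapted to a Node.js Transform stream. The names `frame` and `FrameDecoder` are made up for this illustration and are not part of js-binary. It assumes every value was written with a 4-byte big-endian length prefix, as in `writeFrame` above.

```javascript
const { Transform } = require('stream')

// Hypothetical helper: prefix an encoded value with its 4-byte big-endian length
function frame(buffer) {
    const header = Buffer.alloc(4)
    header.writeUInt32BE(buffer.length, 0)
    return Buffer.concat([header, buffer])
}

// Hypothetical Transform: re-assembles frames from arbitrarily sized chunks
// and emits one Buffer per frame (object mode keeps frame boundaries intact)
class FrameDecoder extends Transform {
    constructor() {
        super({ readableObjectMode: true })
        this.pending = Buffer.alloc(0)
    }

    _transform(chunk, encoding, callback) {
        this.pending = Buffer.concat([this.pending, chunk])
        // Emit every complete frame currently buffered
        while (this.pending.length >= 4) {
            const length = this.pending.readUInt32BE(0)
            if (this.pending.length < 4 + length) break
            this.push(this.pending.subarray(4, 4 + length))
            this.pending = this.pending.subarray(4 + length)
        }
        callback()
    }

    _flush(callback) {
        // Leftover bytes mean the input ended in the middle of a frame
        callback(this.pending.length ? new Error('Truncated frame at end of stream') : null)
    }
}
```

With this, something like `fs.createReadStream(path).pipe(new FrameDecoder())` should emit one buffer per written value, and each one can then be passed to `schema.decode()`. Re-concatenating on every chunk is simple but not the most efficient choice for very large frames.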

Hope it works for you!

sitegui avatar Apr 24 '18 02:04 sitegui