Suggested delimiter for use in streams
What would be the recommendation for specifying a delimiter when writing one binary object to a file at a time for the purpose of later processing as a node.js stream?
Can I simply write the binary data with fs.writeFile using the 'binary' encoding, followed by a trailing require('os').EOL? The end goal is to read the file as a stream and decode it in a transform with something like maxogden/binary-split or myndzi/binary-split-streams2.
Hi @aaronpeterson,
In general, there is no valid delimiter, since any arbitrary byte sequence can happen in the middle of some value. For example:
let Type = require('js-binary').Type,
schema = new Type('Buffer')
schema.encode(Buffer.from('0123456789', 'hex')) // <Buffer 05 01 23 45 67 89>
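To make the collision concrete, here is a minimal sketch that uses plain Node.js buffers instead of js-binary. The record contents and the choice of 0x0a ('\n') as delimiter are assumptions made up for illustration. Two records are written, but a naive split on the delimiter yields three pieces, because the first record happens to contain the delimiter byte as data.

```javascript
// Two hypothetical binary records; the first one contains 0x0a ('\n') as data
const records = [
  Buffer.from([0x01, 0x0a, 0x02]),
  Buffer.from([0x03, 0x04])
]
const delimiter = 0x0a

// Write each record followed by the delimiter byte
const file = Buffer.concat(
  records.map(record => Buffer.concat([record, Buffer.from([delimiter])]))
)

// Naively split the buffer on every occurrence of the delimiter byte
function splitOnByte(buffer, byte) {
  const parts = []
  let start = 0
  let index
  while ((index = buffer.indexOf(byte, start)) !== -1) {
    parts.push(buffer.subarray(start, index))
    start = index + 1
  }
  return parts
}

const parts = splitOnByte(file, delimiter)
console.log(records.length, parts.length) // 2 records written, 3 pieces read back
```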
If you want to write multiple values into a file, I can give you 3 valid approaches:
- Frame each encoded buffer by prefixing it with its size, so that you can read the values back one at a time. Example sync code:
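const fs = require('fs')

// Write one frame: a 4-byte big-endian length followed by the payload
function writeFrame(fd, buffer) {
  let header = Buffer.alloc(4)
  header.writeUInt32BE(buffer.length, 0)
  fs.writeSync(fd, Buffer.concat([header, buffer]))
}

// Read the next frame from the current file position and return its payload
function readFrame(fd) {
  let header = Buffer.alloc(4)
  fs.readSync(fd, header, 0, 4, null)
  let length = header.readUInt32BE(0)
  let buffer = Buffer.alloc(length)
  fs.readSync(fd, buffer, 0, length, null)
  return buffer
}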
- Choose one byte (or byte sequence) as the delimiter, but escape any occurrences of it inside the encoded binary on write and unescape them on read.
- Do not use any delimiter and rely on the lower-level Type#read() function. This works because Type#read() only reads up to the end of one record, so you can call it repeatedly to get multiple values:
// Suppose `buffer` has potentially multiple (full) encoded values
// `schema` is a Type instance
let ReadState = require('js-binary').ReadState,
rs = new ReadState(buffer)
while (!rs.hasEnded()) {
console.log(schema.read(rs))
}
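Since the goal in the question is to process the file as a stream, the length-prefixed framing from the first approach can also be decoded incrementally. Below is a minimal sketch, assuming the same 4-byte big-endian length header that writeFrame produces. The names `encodeFrame` and `FrameDecoder` are hypothetical and not part of js-binary. The transform buffers incoming chunks and emits one payload per complete frame, regardless of where the chunk boundaries fall.

```javascript
const { Transform } = require('stream')

// Hypothetical helper: prefix a payload with its length (4 bytes, big-endian)
function encodeFrame(payload) {
  const header = Buffer.alloc(4)
  header.writeUInt32BE(payload.length, 0)
  return Buffer.concat([header, payload])
}

// Hypothetical Transform: reassembles frames from arbitrarily sized chunks
// and emits one payload Buffer per frame
class FrameDecoder extends Transform {
  constructor() {
    // Object mode on the readable side keeps each payload as a separate item
    super({ readableObjectMode: true })
    this.pending = Buffer.alloc(0)
  }

  _transform(chunk, encoding, callback) {
    this.pending = Buffer.concat([this.pending, chunk])
    // Emit every complete frame that is currently buffered
    while (this.pending.length >= 4) {
      const length = this.pending.readUInt32BE(0)
      if (this.pending.length < 4 + length) break // wait for more data
      this.push(this.pending.subarray(4, 4 + length))
      this.pending = this.pending.subarray(4 + length)
    }
    callback()
  }

  _flush(callback) {
    // Leftover bytes mean the input ended in the middle of a frame
    callback(this.pending.length > 0 ? new Error('Truncated frame') : null)
  }
}
```

One possible use is to pipe a file read stream into a `FrameDecoder` and pass each emitted payload to `schema.decode()`. Because the length is read before the payload, no byte value needs escaping.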
Hope it works for you!