swift-openapi-generator icon indicating copy to clipboard operation
swift-openapi-generator copied to clipboard

Keep index into buffer to avoid repeatedly scanning the buffer

Open simonjbeaumont opened this issue 2 years ago • 1 comments

(Creating this issue from discussion on PR: https://github.com/apple/swift-openapi-runtime/pull/91#discussion_r1442861113).

There are a number of state machines in the runtime library for transforming async sequences of bytes into async sequences of some other structure, e.g. JSON Lines, SSE, JSON Sequences.

Each of these is searching for a delimiter in the currently buffered data and emitting elements if/when it finds a delimiter.

If no delimiter is present in the buffer, it waits for more bytes. But each time it receives more bytes, it will search all buffered bytes for delimiter, including through the bytes that have already been searched.

In the event of any fragmentation of the elements in the byte stream this will be more work than necessary and, with pathological fragmentation, leads to quadratic algorithm that would otherwise be linear.

On the surface, this could be solved by maintaining an index /cursor in the state machine to keep track of which bytes have already been searched for the delimiter. But the devil will be in the detail.

simonjbeaumont avatar Jan 08 '24 11:01 simonjbeaumont

Would likely make sense to wrap the storage + cursor in a new struct type, and adopt it in these state machines.

czechboy0 avatar Jan 08 '24 14:01 czechboy0