aioitertools

Allow slightly unbalanced consumption in tee

Open dcsommer opened this issue 5 years ago • 1 comment

The implementation of aioitertools.itertools.tee currently has the caveat "all iterators are dependent on the first iterator." This caveat can cause periodic stalls in my pipeline, where the consumers all consume at roughly the same speed with small imbalances.

A small refactor of tee could remove this limitation if the producer were a separately scheduled task (instead of the first consumer). Would you be open to a pull request for a change along these lines?

dcsommer avatar Apr 05 '21 15:04 dcsommer

A PR would be great, but I would like to see a couple of constraints:

  • Some form of configurable limit on how far any one iterator can get ahead of the rest (i.e., a size limit on the queues). Maybe something like tee(itr: ..., n: int = 2, *, limit: int = 100), where limit=0 would be unbounded.
  • The task reading items from the original source should ensure that it yields to the event loop frequently. Just doing await asyncio.sleep(0) after each item would be sufficient.

By default, this would still cause consumption to be limited by the slowest iterator, but no longer dependent on the first iterator specifically.

amyreese avatar Apr 06 '21 02:04 amyreese
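
For readers following the thread, the proposal above can be sketched roughly as follows. This is only an illustration of the idea under discussion, not the aioitertools implementation: the name tee_sketch, the _DONE sentinel, and the lazy start of the producer task are assumptions made for this example, and the limit parameter simply follows the signature suggested in the comment. Error propagation from the source and cancellation of the producer are left out for brevity.

```python
import asyncio
from typing import Any, AsyncIterable, AsyncIterator, Optional, Tuple

_DONE = object()  # hypothetical sentinel marking the end of the source


def tee_sketch(
    itr: AsyncIterable[Any], n: int = 2, *, limit: int = 100
) -> Tuple[AsyncIterator[Any], ...]:
    """Hypothetical tee variant with a separate producer task (sketch only)."""
    # asyncio.Queue treats maxsize <= 0 as unbounded, which matches limit=0.
    queues = [asyncio.Queue(maxsize=limit) for _ in range(n)]
    task: Optional[asyncio.Task] = None

    async def producer() -> None:
        try:
            async for item in itr:
                for q in queues:
                    # Blocks once this consumer is `limit` items behind,
                    # so the slowest consumer still bounds overall progress.
                    await q.put(item)
                # Yield to the event loop after every item.
                await asyncio.sleep(0)
        finally:
            # Signal the end of the stream to every consumer.
            for q in queues:
                await q.put(_DONE)

    async def consumer(q: asyncio.Queue) -> AsyncIterator[Any]:
        nonlocal task
        # Whichever consumer runs first starts the producer; after that,
        # no consumer is special.
        if task is None:
            task = asyncio.create_task(producer())
        while True:
            item = await q.get()
            if item is _DONE:
                return
            yield item

    return tuple(consumer(q) for q in queues)
```

With this shape, any consumer can run up to limit items ahead of the slowest one before the producer blocks, and with limit=0 a single iterator can be drained on its own while the others buffer without bound.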