fix: Respect flush_queue_size when calling Workers.flush
Summary
Currently Workers.flush doesn't respect flush_queue_size which can cause a large number of events to be sent to Amplitude when this method is called. This can then trigger errors such as 413 (Content Too Large). The change here updates the method to send the data in batches according to flush_queue_size in order to prevent this.
This modifies the return type of Workers.flush from Future to List[Future] which I believe only impacts the behavior in Timeline.flush but I could be wrong there.
Checklist
- [x] Does your PR title have the correct title format?
- Does your PR have a breaking change?: Hopefully not
Hi @dsaxton-1password , Flush is not intended to be called frequently. Normally, with buffer_consumer, events are automatically sent periodically based on flush_interval_millis and flush_queue_size. Do you have scenarios where you need to call flush frequently?
Hi @dsaxton-1password , Flush is not intended to be called frequently. Normally, with buffer_consumer, events are automatically sent periodically based on flush_interval_millis and flush_queue_size. Do you have scenarios where you need to call flush frequently?
Not frequently, but the way our event consumer logic works there is a need to periodically "checkpoint" our progress before continuing. To do this we explicitly call Client.flush to ensure all the events are processed before checkpointing.
However, since there is no guarantee that the total queue has a reasonable size when we do this (we could be ingesting events much faster than they have been flushed in the background) we can end up sending an exceedingly large payload that then fails. It's also worth noting that we're already setting both the max flush queue size and the flush interval when we initialize our client but the issue happens nonetheless.
The way I see it there shouldn't be any disadvantage to sending the data in batches when someone calls flush (especially if the batches are the same size as what normally gets sent in the background), but there may be a benefit.
Hi @dsaxton-1password Thank you for providing the additional information and this PR. Changing the API return value type would break compatibility, so I rewrote the fix based on your suggestion. Please help review and validate the PR in your service https://github.com/amplitude/Amplitude-Python/pull/64, thanks!