databricks-sql-python

Download Manager: Stop shutdown in case of empty download tasks Queue

Open varun-edachali-dbx opened this issue 6 months ago • 0 comments

What type of PR is this?

  • [x] Feature

Description

Currently, get_next_downloaded_file assumes that an empty download queue means the manager is finished: it shuts down and returns None. However, this is Thrift-specific behaviour. In SEA, links arrive in batches, so an empty queue at a given moment does not necessarily mean that all required links have been downloaded.

Thus, we change get_next_downloaded_file to block until a file can actually be acquired. It must now be invoked only when we know that there is a file to acquire. The old Thrift CloudFetchQueue made an additional call to this method in next_n_rows once it was done fetching results; that call has been removed to align with the above. The arrow table maintained by the Queue is now typed strictly as pyarrow.Table instead of optionally being None.
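To illustrate the blocking contract described above, here is a minimal sketch, not the connector's actual implementation. ToyDownloadManager, add_file, and the timeout parameter are hypothetical names introduced only for this example; the real manager's internals may differ. The sketch uses a condition variable so that an empty queue makes the caller wait for the next batch rather than shut down and return None.

```python
import threading
from collections import deque
from typing import Deque


class ToyDownloadManager:
    """Hypothetical stand-in for a download manager; not the real class."""

    def __init__(self) -> None:
        self._files: Deque[bytes] = deque()
        self._cond = threading.Condition()

    def add_file(self, data: bytes) -> None:
        # Called whenever a file from a newly received batch of links is ready.
        with self._cond:
            self._files.append(data)
            self._cond.notify()

    def get_next_downloaded_file(self, timeout: float = 5.0) -> bytes:
        # An empty queue no longer means "done": wait until a file arrives.
        # The timeout is an assumption of this sketch, added so that a
        # caller who breaks the contract fails loudly instead of hanging.
        with self._cond:
            ready = self._cond.wait_for(lambda: len(self._files) > 0, timeout=timeout)
            if not ready:
                raise TimeoutError("no downloaded file became available")
            return self._files.popleft()
```

Because the method always returns a file or raises, its return type never needs to be optional, which mirrors the move from an optional table to a strict pyarrow.Table on the queue side. A caller would invoke it only after confirming that more links are expected.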

How is this tested?

  • [x] Unit tests
  • [ ] E2E Tests
  • [ ] Manually
  • [ ] N/A

Related Tickets & Documents

N/A

varun-edachali-dbx, Jul 21 '25 04:07