
Make tempStorageDirectory configuration optional and rely on task dir instead

Open adarshsanjeev opened this issue 1 year ago • 2 comments

Currently, export and durable storage each require a temporary directory to be configured, via druid.export.storage.<connectorType>.tempLocalDir and druid.msq.intermediate.storage.tempDir respectively.

Tasks on the Middle Manager already have a configured temporary directory. This PR uses that task directory as the default when no temporary directory is explicitly configured, reducing the number of configs that a user has to set.

Please note that on tasks, preference is given to the user-configured druid.*.storage.temp*Dir. If that is not configured, the task's temporary directory is used.
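The precedence described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name TempDirResolver, the method resolve, and its parameters are hypothetical, and it assumes the configured value arrives as a nullable string while the task's temporary directory arrives as a File.

```java
import java.io.File;

// Hypothetical helper for illustration only; not part of Druid.
final class TempDirResolver {
    private TempDirResolver() {}

    /**
     * Picks the temporary directory for a storage connector running on a task.
     *
     * @param configuredTempDir value of a user-set temp dir property, or null if unset
     * @param taskTempDir       the task's own temporary directory (assumed to be provided by the caller)
     */
    static File resolve(String configuredTempDir, File taskTempDir) {
        // An explicitly configured directory always wins.
        if (configuredTempDir != null && !configuredTempDir.trim().isEmpty()) {
            return new File(configuredTempDir);
        }
        // Services without a task directory still need the explicit configuration.
        if (taskTempDir == null) {
            throw new IllegalStateException(
                "No temp dir configured and no task temp dir available");
        }
        // Otherwise fall back to the task's temporary directory.
        return taskTempDir;
    }
}
```

With this shape, a service that has no task directory fails fast when the property is missing instead of silently picking a location.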

The Overlord and Brokers also require storage connector configurations (for the durableStorageCleanerOverlordDuty and to fetch results of async queries, respectively), but do not have a default temporary task directory. The configuration is therefore still required for these services.


Release notes

druid.export.storage.<google/s3>.tempLocalDir and druid.msq.intermediate.storage.tempDir are no longer required configurations on tasks. If not configured, the task defaults to using the task temp directory.


This PR has:

  • [ ] been self-reviewed.
    • [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
  • [ ] added documentation for new or modified features or behaviors.
  • [ ] a release note entry in the PR description.
  • [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • [ ] added or updated version, license, or notice information in licenses.yaml
  • [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • [ ] added integration tests.
  • [ ] been tested in a test Druid cluster.

adarshsanjeev avatar Sep 06 '24 04:09 adarshsanjeev

Overlord and brokers also require storage connector configurations (for the durableStorageCleanerOverlordDuty and to fetch results of async queries respectively), but do not have a default temporary task directory

Should Druid rely on java.io.tmpdir in that case? FileUtils has multiple options to create temp directories. InputSourceSampler in the Overlord uses one of those methods to create a temp directory.
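As a rough sketch of what relying on java.io.tmpdir could look like, the snippet below uses only the JDK's Files.createTempDirectory, which creates the directory under the JVM's default temporary location. The class name JvmTempDirFallback and the method createStorageTempDir are hypothetical; this does not show Druid's FileUtils or what InputSourceSampler actually calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Druid.
final class JvmTempDirFallback {
    private JvmTempDirFallback() {}

    /**
     * Creates a fresh directory under the JVM default temp location
     * (the java.io.tmpdir system property), with a name starting with the given prefix.
     */
    static Path createStorageTempDir(String prefix) {
        try {
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            // Wrap so callers are not forced to handle a checked exception.
            throw new UncheckedIOException(e);
        }
    }
}
```

A fallback like this would give the Overlord and Brokers a usable default, at the cost of writing under whatever java.io.tmpdir points to on that host.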

LakshSingla avatar Sep 10 '24 10:09 LakshSingla

@adarshsanjeev I updated the description. Please check.

cryptoe avatar Oct 16 '24 05:10 cryptoe

@cryptoe could you please take a look at the changes I made now to MSQDurableStorageModule?

adarshsanjeev avatar Oct 28 '24 05:10 adarshsanjeev