Add fixture to cleanup dataloader after each test runs

Open oliverholworthy opened this issue 2 years ago • 1 comments

Adds a fixture that cleans up dataloaders after each test runs. If a test using the Merlin Dataloader only partially consumes a dataloader instance, and neither uses it as a context manager (with statement) nor calls stop explicitly, the fixture automatically cleans it up by calling stop on it. Calling stop more than once is safe.

Related:

  • https://github.com/NVIDIA-Merlin/NVTabular/pull/1852
  • https://github.com/NVIDIA-Merlin/NVTabular/pull/1848

Background

When a test leaves a dataloader partially consumed and the test run is combined with something like pytest-cov, which scans subprocesses, the dataloader's background thread can hang around and hold onto resources that conflict with subsequent tests. In NVTabular this caused subsequent operator tests to take significantly longer than usual (>5 hours instead of ~20 mins).

oliverholworthy, Jun 28 '23 16:06