
[BUG] Multiple hub.experimental.dataloader instances deadlock with num_workers > 0

Open hakanardo opened this issue 3 years ago • 2 comments

🐛🐛 Bug Report

If multiple hub.experimental.dataloader instances are used with num_workers > 0, iteration hangs.

⚗️ Current Behavior

The code below hangs when the emulated testing starts. If I set workers = 0 it works fine.

Input Code

  • REPL or Repo link if applicable:
import os
from time import time
from torchvision.transforms import ToTensor
from torchvision.transforms.functional import to_tensor
import hub
import numpy as np
from tqdm import tqdm
import hub.experimental

if not os.path.exists('/tmp/ds.tmp'):
    ds = hub.empty('/tmp/ds.tmp')
    with ds:
        ds.create_tensor('images', htype = 'image', sample_compression = 'jpeg')
        for _ in tqdm(range(10000)):
            ds.append(dict(images=np.random.uniform(0, 255, (224, 224, 3)).astype(np.uint8)))
else:
    ds = hub.load('/tmp/ds.tmp')

i = len(ds) // 10
test_ds = ds[:i]
train_ds = ds[i:]

# workers = 0
workers = 4

def transform(item):
    item['images'] = to_tensor(item['images'])
    return item

train_dataloader = hub.experimental.dataloader(train_ds)\
            .transform(transform)\
            .shuffle()\
            .batch(128, drop_last=True)\
            .pytorch(num_workers=workers)

test_dataloader = hub.experimental.dataloader(test_ds)\
            .transform(transform)\
            .batch(128)\
            .pytorch(num_workers=workers)

train_loss = test_loss = 0
for i, batch in enumerate(train_dataloader):
    train_loss += batch['images'].cuda().sum()
    print("Train", i)
    if i % 10 == 9:
        for i, batch in enumerate(test_dataloader):
            print("  Test", i)
            test_loss += batch['images'].cuda().sum()

print(train_loss, test_loss)
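
A hang like this is easier to catch in an automated run if it surfaces as an error instead of blocking forever. The following is a minimal sketch of one way to do that with the standard library only. `iterate_with_timeout` is a hypothetical helper, not part of hub, and it assumes the loader tolerates being iterated from a background thread, which has not been verified for hub.experimental.dataloader.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the wrapped iterable


def iterate_with_timeout(iterable, timeout_s):
    """Yield items from `iterable`; raise TimeoutError if the next item
    does not arrive within `timeout_s` seconds.

    Hypothetical helper: the iterable is consumed in a daemon thread, so a
    stuck producer is abandoned rather than cleaned up.
    """
    results = queue.Queue(maxsize=1)

    def producer():
        try:
            for item in iterable:
                results.put((item, None))
            results.put((_DONE, None))
        except BaseException as exc:  # forward any failure to the consumer
            results.put((_DONE, exc))

    threading.Thread(target=producer, daemon=True).start()

    while True:
        try:
            item, exc = results.get(timeout=timeout_s)
        except queue.Empty:
            raise TimeoutError(f"no item within {timeout_s} s") from None
        if exc is not None:
            raise exc
        if item is _DONE:
            return
        yield item
```

In the script above, the inner loop could then be written as `for i, batch in enumerate(iterate_with_timeout(test_dataloader, 120)):` so that a stalled test pass raises after two minutes. The 120-second limit is an arbitrary choice.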

Expected behavior/code A clear and concise description of what you expected to happen (or code).

def function_right():
    # Here is the fix
    print("Ok")
    

⚙️ Environment

  • Python version(s):
    • good: [e.g. 3.8]
    • better: [3.8.6 - Clang 12.0.0 (clang-1200.0.32.27)]
  • OS: [e.g. Ubuntu 18.04, OSX 10.13.4, Windows 10]
  • IDE: [Vim, VS-Code, PyCharm]
  • Packages: [ Tensorflow==2.1.2 - latest]

🧰 Possible Solution (optional)

🖼 Additional context/Screenshots (optional)

Add any other context about the problem here. If applicable, add screenshots to help explain.

hakanardo, Sep 20 '22 08:09

Hello @hakanardo, thank you so much for opening this issue. @AbhinavTuli will look into the bug right away! Thanks for making Hub better. :) (cc: @tatevikh @istranic)

mikayelh, Sep 20 '22 08:09

Hey @hakanardo! Thanks for reporting this. I was able to reproduce the issue. Will get back to you with a fix soon.

AbhinavTuli, Sep 21 '22 11:09

Hey @hakanardo! This issue has been resolved in the latest release, deeplake==3.0.12. Closing this for now; it would be great if you could confirm on your end too.

AbhinavTuli, Oct 27 '22 07:10
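
To confirm that an environment already includes the release mentioned above, the installed version can be compared against 3.0.12. This is a minimal sketch using only the standard library. The helper names are hypothetical, and the parsing is deliberately naive: it reads only the leading digits of each dotted component, so a pre-release such as `3.0.12rc1` would count as fixed.

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (3, 0, 12)  # release named in the thread as containing the fix


def parse_release(text):
    """Turn '3.0.12' into (3, 0, 12), keeping only leading digits per part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(installed, fixed_in=FIXED_IN):
    """Return True if `installed` is at or above the fixed release."""
    return parse_release(installed) >= fixed_in


def installed_deeplake_version():
    """Return the installed deeplake version string, or None if absent."""
    try:
        return version("deeplake")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_deeplake_version()
    if found is None:
        print("deeplake is not installed")
    else:
        print(found, "includes the fix" if has_fix(found) else "predates the fix")
```

If the check reports an older version, `pip install -U deeplake` upgrades to the latest release.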