[BUG] dataloader only returns the first element of the sequence

Open wjfwzzc opened this issue 3 years ago • 3 comments

🐛🐛 Bug Report

⚗️ Current Behavior

When I build a dataset with a sequence of images (like frames), everything works fine. But if I transfer it to a PyTorch dataloader, the loader returns only the first element of each sequence.

Input Code: the official "Creating Datasets with Sequences" Colab example

Append the following code to the end:

print(ds[0]["frames"].shape)
print(ds[1]["frames"].shape)
dataloader = ds.pytorch()
data_iter = iter(dataloader)
print(next(data_iter)["frames"].shape)
print(next(data_iter)["frames"].shape)

It returns:

(600, 1080, 1920, 3)
(1050, 1080, 1920, 3)
torch.Size([1, 1080, 1920, 3])
torch.Size([1, 1080, 1920, 3])

But I expected it to return something like:

(600, 1080, 1920, 3)
(1050, 1080, 1920, 3)
torch.Size([1, 600, 1080, 1920, 3])
torch.Size([1, 1050, 1080, 1920, 3])

wjfwzzc avatar Dec 07 '22 02:12 wjfwzzc

Hi there, sorry you've run into this issue. We will look into this shortly!

mikayelh avatar Dec 07 '22 03:12 mikayelh

Hi @wjfwzzc! Thanks a lot for raising the issue. We are aware of it and currently working on a fix. Will update you as soon as it's fixed.

tatevikh avatar Dec 07 '22 03:12 tatevikh

Hey @wjfwzzc. Unfortunately, there's a fundamental issue with supporting sequences in the Python implementation of our dataloader, so we've decided not to support them for now, and we will add appropriate error messages so you don't encounter the issue above.

Sequences are supported in the C++ dataloader (ds.dataloader() - details here), but this dataloader is only available if you use datasets hosted by Activeloop, or if you are on the Growth or Enterprise plan.

istranic avatar Dec 07 '22 12:12 istranic
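
For anyone who cannot use the C++ dataloader, one possible stopgap is to skip ds.pytorch() and iterate over the samples directly, adding the batch axis by hand. The sketch below is not part of deeplake and was not suggested by the maintainers: iter_full_sequences is a hypothetical helper, and fake_ds is stand-in data with tiny frames instead of the 1080x1920 frames from the report. It assumes each sample's "frames" entry can be read as a whole NumPy array (with a real dataset, presumably something like ds[i]["frames"].numpy()). It also gives up dataloader features such as shuffling and worker processes.

```python
import numpy as np


def iter_full_sequences(samples, key="frames"):
    """Yield one whole sequence per step, with a leading batch axis of size 1.

    `samples` is any iterable of dict-like samples (hypothetical stand-in
    for a dataset whose sequence entries can be read as arrays).
    """
    for sample in samples:
        frames = np.asarray(sample[key])        # (num_frames, H, W, C)
        yield {key: frames[np.newaxis, ...]}    # (1, num_frames, H, W, C)


# Stand-in data: two clips of different lengths. Frames are shrunk to 4x6
# so the example stays cheap; the report uses 1080x1920 frames.
fake_ds = [
    {"frames": np.zeros((600, 4, 6, 3), dtype=np.uint8)},
    {"frames": np.zeros((1050, 4, 6, 3), dtype=np.uint8)},
]

shapes = [batch["frames"].shape for batch in iter_full_sequences(fake_ds)]
print(shapes)
```

Each yielded array can be converted with torch.from_numpy if a tensor is needed. Because the two clips have different lengths, they cannot be stacked into a single batch without padding, which is why the sketch keeps the batch size at 1.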