21 comments of Abhinav Tuli

Hey @hakanardo! Thanks for reporting this. I was able to reproduce the issue. Will get back to you with a fix soon.

Hey @hakanardo! This issue has been resolved in the latest release, deeplake==3.0.12. Closing this for now; it would be great if you could confirm on your end too.

Hey @daniel-falk! Thanks for reporting the issue. #1911 should fix it, and we will include it in the next release.

Hey @hakanardo! Thanks for your PR. I looked into the benchmark file you sent but was unable to reproduce your results. The speed seems to be almost the...

Thanks @hakanardo! You might want to use num_workers=0 for deeplake (we're already spinning up threads for fetching and decompression; num_workers spins up separate processes for transformation and collation) as in the...
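A minimal sketch of that setup, assuming deeplake's `ds.pytorch()` wrapper; the dataset path and the `images`/`labels` tensor names are placeholders for whatever you're benchmarking:

```python
import deeplake

# Placeholder path; substitute the dataset you are benchmarking.
ds = deeplake.load("hub://activeloop/cifar10-train")

# num_workers=0: deeplake already fetches and decompresses on its own
# threads, so extra worker processes only help for heavy
# transform/collate work.
dataloader = ds.pytorch(num_workers=0, batch_size=32, shuffle=True)

for batch in dataloader:
    images, labels = batch["images"], batch["labels"]
    break
```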

I think using `multiprocessing.set_start_method("spawn", force=True)` at the top should fix the issue of it getting stuck. There's an issue with forking the Hub3Dataset object that we're aware of.
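A minimal sketch of where that call goes, assuming a standard PyTorch-style training script; the dataset path and loader arguments are placeholders:

```python
import multiprocessing

import deeplake

if __name__ == "__main__":
    # Use "spawn" instead of the default "fork" (on Linux) so worker
    # processes do not inherit the dataset object via fork, which is
    # what causes the hang.
    multiprocessing.set_start_method("spawn", force=True)

    ds = deeplake.load("hub://activeloop/cifar10-train")  # placeholder path
    dataloader = ds.pytorch(num_workers=2, batch_size=32, shuffle=True)

    for batch in dataloader:
        pass  # training step here
```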

Hey @hakanardo! Thanks for the suggestion; I tried it but couldn't really get a performance boost while iterating. Do you think the better performance here is due to pinning...

Got it. Yup, I was using ImageNet. The thing is that the experimental dataloader already spins up threads in C++ to prefetch and decompress data in parallel with training...

Hey @AntreasAntoniou, I'm currently looking into the issue, but the behavior is indeed very weird. Could you try out other datasets, maybe different splits of ImageNet, and let me know...

@McCrearyD I had experimented with the idea of `number of samples per shard = workers * 32 * (number of samples that fit in 16 MB)`, but the thing was...
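For concreteness, a hypothetical sketch of that heuristic; the function and variable names are illustrative only and not part of the deeplake API:

```python
SIXTEEN_MB = 16 * 1024 * 1024


def samples_per_shard(num_workers: int, avg_sample_bytes: int) -> int:
    # number of samples per shard = workers * 32 * (samples that fit in 16 MB)
    samples_per_16mb = max(1, SIXTEEN_MB // avg_sample_bytes)
    return num_workers * 32 * samples_per_16mb


# e.g. 8 workers, ~200 KB per sample -> 8 * 32 * 81 = 20736 samples per shard
print(samples_per_shard(8, 200 * 1024))
```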