21 comments of Abhinav Tuli

Hey @hakanardo! Thanks for reporting this. I was able to reproduce the issue. Will get back to you with a fix soon.

Hey @hakanardo! This issue has been resolved in the latest release, deeplake==3.0.12. Closing this for now; it would be great if you could confirm on your end too.

Hey @daniel-falk! Thanks for reporting the issue. #1911 should fix it, and we will include it in the next release.

Hey @hakanardo! Thanks for your PR. I looked into the benchmark file you sent but was unable to reproduce your results. The speed seems to be almost the...

Thanks @hakanardo! You might want to use num_workers=0 for deeplake (we're already spinning up threads for fetching and decompression; num_workers spins up separate processes for transformation and collation) as in the...
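A minimal sketch of that setup, assuming deeplake's `ds.pytorch()` wrapper; the dataset path and the `images`/`labels` tensor names are placeholders for whatever you're benchmarking:

```python
import deeplake

# Placeholder path; substitute the dataset you are benchmarking.
ds = deeplake.load("hub://activeloop/cifar10-train")

# num_workers=0: deeplake already fetches and decompresses on its own
# threads, so extra worker processes only help for heavy
# transform/collate work.
dataloader = ds.pytorch(num_workers=0, batch_size=32, shuffle=True)

for batch in dataloader:
    images, labels = batch["images"], batch["labels"]
    break
```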

I think using `multiprocessing.set_start_method("spawn", force=True)` at the top should fix the issue of it getting stuck. There's an issue with forking the Hub3Dataset object that we're aware of.
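A minimal sketch of where that call goes, assuming a standard PyTorch-style training script; the dataset path and loader arguments are placeholders:

```python
import multiprocessing

import deeplake

if __name__ == "__main__":
    # Use "spawn" instead of the default "fork" (on Linux) so worker
    # processes do not inherit the dataset object via fork, which is
    # what causes the hang.
    multiprocessing.set_start_method("spawn", force=True)

    ds = deeplake.load("hub://activeloop/cifar10-train")  # placeholder path
    dataloader = ds.pytorch(num_workers=2, batch_size=32, shuffle=True)

    for batch in dataloader:
        pass  # training step here
```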

Hey @hakanardo! Thanks for the suggestion; I tried it but couldn't really get a performance boost while iterating. Do you think the better performance here is due to pinning...

Got it. Yup, I was using ImageNet. The thing is that the experimental dataloader already spins up threads in C++ to prefetch and decompress data in parallel with training...

Hey @AntreasAntoniou, I'm currently looking into the issue, but the behavior is indeed very weird. Could you try out other datasets, maybe different splits of ImageNet, and let me know...

@McCrearyD I had experimented with the idea of `number of samples per shard = workers * 32 * (number of samples that fit in 16 MB)`, but the thing was...
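For concreteness, a hypothetical sketch of that heuristic; the function and variable names are illustrative only and not part of the deeplake API:

```python
SIXTEEN_MB = 16 * 1024 * 1024


def samples_per_shard(num_workers: int, avg_sample_bytes: int) -> int:
    # number of samples per shard = workers * 32 * (samples that fit in 16 MB)
    samples_per_16mb = max(1, SIXTEEN_MB // avg_sample_bytes)
    return num_workers * 32 * samples_per_16mb


# e.g. 8 workers, ~200 KB per sample -> 8 * 32 * 81 = 20736 samples per shard
print(samples_per_shard(8, 200 * 1024))
```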