Sujee Maniyam
Yes, the error is quite obvious 🤣 My suspicion is that it's caused by a race condition between workers trying to clean up downloaded artifacts. Adding: I see this consistently on Google...
Related: #583
The hash is based on file content, treating it as a bunch of bytes. So yes, we do need to read the files, but there is no need to process them (no pdf2pq...
> @sujee how would you treat zip and tar files?

For the first version, I plan to treat zip/tar files as ONE file. So if there are duplicate zip/tar files, dupes will...
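
A minimal sketch of the content-hash idea described above: each file is read as raw bytes and hashed, with no parsing, so a zip or tar archive counts as a single file. The choice of SHA-256, the chunk size, and the helper names `content_hash` and `find_duplicates` are assumptions for illustration, not the transform's actual implementation.

```python
import hashlib


def content_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's raw bytes (hypothetical helper)."""
    digest = hashlib.sha256()  # assumption: the real transform may use another hash
    with open(path, "rb") as f:
        # Read in chunks so large files (including zip/tar) are never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return (duplicate, first_seen) pairs for files with identical content."""
    seen = {}   # digest -> first path seen with that content
    dupes = []
    for path in paths:
        h = content_hash(path)
        if h in seen:
            dupes.append((path, seen[h]))
        else:
            seen[h] = path
    return dupes
```

Because only the bytes are compared, two archives with the same members but different compression settings or timestamps would not be flagged as duplicates under this approach.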
https://github.com/sujee/data-prep-kit/commit/08024dc3b049ca69bf4ffa84352754867dbd3f79 makes the required changes. Related: #585
I have made the necessary changes on my branch. Will submit a PR soon.
> @sujee do you think the error is specific to ededup or does it occur for all ray transforms? thanks

I think this is more Ray-related. Probably need to...
Here is a similar example: https://pypi.org/project/tf-nightly/
@santoshborse @shahrokhDaijavad I will work on this soon.