pfischer-nvidia
We have our own downloading framework, but all we really do is `hashlib.sha256(image_bytes).hexdigest()`. Let me check what img2dataset does and whether we can make it match.
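For reference, here is a minimal sketch of the check, assuming the hash is taken over the raw downloaded bytes with no decoding or re-encoding. The helper names `sha256_hex` and `matches_metadata` and the parameter `expected_hex` are made up for illustration; they are not part of our framework or of img2dataset.

```python
import hashlib


def sha256_hex(image_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the raw, undecoded image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def matches_metadata(image_bytes: bytes, expected_hex: str) -> bool:
    """Compare the computed digest with a hex digest taken from the metadata.

    `expected_hex` is a hypothetical name for whatever the metadata stores;
    surrounding whitespace and letter case are normalized before comparing.
    """
    return sha256_hex(image_bytes) == expected_hex.strip().lower()
```

Since the digest covers every byte, any re-encoding or metadata stripping on the server side would change the hash even if the pixels look identical.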
Just checked. img2dataset [does the same thing](https://github.com/rom1504/img2dataset/blob/46089583c27132b8a345626d1797f2cdf3eb63cf/img2dataset/downloader.py#L318). I downloaded some images with img2dataset, and I'm getting the same hashes as with our code and they don't match the given metadata....
We are blocked by this issue, @gabrielilharco, @rom1504. Please help soon by checking how this was done.
We haven't downloaded CommonPool with hash checking. Can you please explain the thought behind your question? What is the conclusion if it's different / the same? Given the above example,...
@gabrielilharco: I think your finding (small pixel differences) does not fully explain the issue. Please see my original post above, where the image was the exact same image already back...
Sure, I can create a list. But for each sample there should be some explanation why the hash is different and I think we haven't found an explanation for the...
Ok I understand that images on the internet change over time (even only slightly). But my example above is **byte-wise exactly the same as in 2019** and yields the same...