input folder deduplication
Sometimes it is faster to just drop the file again than to find it in the web UI. It would be better to check whether a file with the same hash already exists in the folder and, if so, not copy it again.
If you drag and drop a file that is already in the input folder, it gets copied again. That is never what you want.
You want us to do de-duplication via MD5 checksums to prevent duplicates from being copied in, and load the existing file instead when you try to load a duplicate?
Yes. Searching for files in an external viewer is faster than in the web UI.
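Something like this rough sketch is what I mean (the md5sum and add_unique_to_input helpers are made-up names for illustration, not existing ComfyUI code):

```python
import hashlib
import os
import shutil

def md5sum(path, chunk_size=1024 * 1024):
    # Hash the file in chunks so large files don't have to fit in memory.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def add_unique_to_input(src_path, input_dir):
    # Look for an existing file in input_dir with the same content; if one is
    # found, reuse it instead of copying the dropped file again.
    src_hash = md5sum(src_path)
    for name in os.listdir(input_dir):
        existing = os.path.join(input_dir, name)
        if os.path.isfile(existing) and md5sum(existing) == src_hash:
            return existing
    dest = os.path.join(input_dir, os.path.basename(src_path))
    shutil.copy2(src_path, dest)
    return dest
```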
Was this tested? If you are familiar with hashlib, it is both extremely slow, and it will sometimes hang indefinitely on Windows.
It's primarily because SHA-256 is a 'slower' hash function. A simpler solution would be to use a SHA-1 or MD5 sum, but those are slow too.
Note that the hanging on Windows is not reproducible for me (I use multiple hashlib functions in some of my custom nodes), and the 'hanging' normally depends on file size and CPU power/resources. I say this also because the hashlib components in my own nodes take up to 30 seconds to SHA-256 sum a 12GB model file with 8 cores (12 threads), on a Windows environment.
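For what it's worth, this is roughly how I time it on large model files; the snippet below is only an illustration (timed_sha256 is a made-up helper name, not code from my nodes):

```python
import hashlib
import time

def timed_sha256(path, chunk_size=1024 * 1024):
    # Chunked SHA-256 with a timer, to see how long a large model file takes.
    start = time.perf_counter()
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest(), time.perf_counter() - start

digest, seconds = timed_sha256("model.safetensors")  # path is just an example
print(f"{digest} in {seconds:.1f}s")
```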
Was this tested? If you are familiar with hashlib, it is both extremely slow, and it will sometimes hang indefinitely on Windows.
Yes. It doesn't hash every image, only the image you load, and only if a file with the same name already exists in the inputs folder.
If it were hashing a lot of files it might be worth going with something faster, but it would be a fairly extreme edge case for it to hash even 10 files. Since it's only done when you load an image, not on execution, there is no practical benefit to using a faster function.
It can be swapped out for whatever hash function you want, though.
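Roughly, the check behaves like the sketch below; the names are illustrative rather than the actual ComfyUI functions, and the hash function is just a parameter, so swapping it is trivial:

```python
import hashlib
import os
import shutil

def file_hash(path, hash_func, chunk_size=1024 * 1024):
    # Chunked read so large files don't have to fit in memory.
    h = hash_func()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def save_uploaded_image(src_path, input_dir, hash_func=hashlib.sha256):
    # Hash only when a file with the same name already exists in the inputs folder.
    dest = os.path.join(input_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        if file_hash(src_path, hash_func) == file_hash(dest, hash_func):
            return dest  # identical content: reuse the existing file, no copy
        # Same name but different content: pick a non-clashing name instead.
        base, ext = os.path.splitext(dest)
        i = 1
        while os.path.exists(f"{base} ({i}){ext}"):
            i += 1
        dest = f"{base} ({i}){ext}"
    shutil.copy2(src_path, dest)
    return dest
```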