Sequence db size != result db size
Hi,
My users reported hitting this error on MMseqs2-17 on our HPC beegfs parallel storage. I worked with them to isolate a small and quick way to reproduce it consistently.
We run mmseqs easy-linclust input.faa ccl_clus tmp --min-seq-id 1.0 -c 1.0 --cov-mode 0 --threads 1 -v 3 --remove-tmp-files 1 and this works as expected. As soon as we increase --threads to 2, we hit the "Sequence db size != result db size" error.
I did some digging in the code and suspect that the issue is timing-related and comes from OpenMP scheduling in Util::ompCountLines. I see that a similar case was already worked around in issue #210. Can you comment on this and suggest a workaround or a patch?
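In case it helps narrow this down, here is a minimal sketch of a storage-level sanity check. It compares a serial newline count with a count split across byte ranges and worker threads. This is not the MMseqs2 code and I am not claiming it mirrors how Util::ompCountLines works internally; the function names (count_newlines_in_range, count_lines_serial, count_lines_chunked) are hypothetical and only illustrate the general idea of chunked counting.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def count_newlines_in_range(path, start, end, block=1 << 20):
    """Count newline bytes in the byte range [start, end) of a file."""
    count = 0
    with open(path, "rb") as fh:
        fh.seek(start)
        remaining = end - start
        while remaining > 0:
            data = fh.read(min(block, remaining))
            if not data:  # file is shorter than its reported size
                break
            count += data.count(b"\n")
            remaining -= len(data)
    return count


def count_lines_serial(path):
    """Reference count: one pass over the whole file."""
    return count_newlines_in_range(path, 0, os.path.getsize(path))


def count_lines_chunked(path, workers=2):
    """Split the file into byte ranges and count each range in its own thread."""
    size = os.path.getsize(path)
    step = max(1, -(-size // workers))  # ceiling division, at least 1 byte
    ranges = [(s, min(s + step, size)) for s in range(0, size, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda r: count_newlines_in_range(path, *r), ranges))
```

Both functions should return the same number for the same file. If they disagree for a file under tmp on beegfs but agree for a copy on local disk, that would point at the storage layer rather than at thread scheduling; if they always agree, it does not rule out a race inside MMseqs2 itself.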
That is a very surprising error. Could you upload the input set so I can try to reproduce this locally?
Does this only happen with tmp on beegfs?
So far yes, we only hit this on beegfs.
We also have another scenario that hits this error in foldseek, but I don't know which version of MMseqs2 is embedded in that binary.
Could you please upload the full terminal output of one of the runs that crashes with this issue?