`funannotate clean`: proper tmp files
I had issues with funannotate clean when running several genomes in parallel - both locally and when submitting to an HPC.
This PR fixes both local parallel execution and HPC submission for me; tested twice with 4 genomes in parallel.
Let me elaborate on the context and the actual error messages, since my first comment was likely insufficient :)
I am calling funannotate from a snakemake pipeline, sometimes with 4-5 input genomes.
Snakemake supports both local (cores/CPUs) and HPC parallelization.
When trying to annotate several genomes at once with more than 1 CPU (or by submitting to an HPC queue), I was very consistently getting errors related to the minitmp temporary files, primarily:
- file not found
- ValueError: too few values to unpack (probably caused by a freshly-truncated file from another process)
None of these errors ever happened when running locally with just 1 CPU/core.
This boils down to the minitmp temporary file being generated in the working directory under a name that is not unique per run: with multiple processes on a local machine, or on an HPC with a shared/network filesystem, several processes read and write the same minitmp file, which causes the errors above.
In principle it would have been enough to use NamedTemporaryFile for minitmp alone (qFasta and rFasta are already placed under a uuid-named tmpdir while minitmp is not, so placing minitmp under tmpdir is an alternative solution), but while at it I've decided to use NamedTemporaryFile for the inputs as well.
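For illustration, here is a minimal sketch of the idea using only the standard library. It is not the actual diff of this PR: the helper name `unique_tmp_path` and the `minitmp_` prefix are hypothetical, and the sketch assumes the file is later handed to an external tool by path, which is why `delete=False` is used.

```python
import tempfile


def unique_tmp_path(directory, prefix="minitmp_", suffix=".txt"):
    """Create an empty, uniquely named file in `directory` and return its path.

    Hypothetical helper for illustration only, not a funannotate function.
    """
    # NamedTemporaryFile creates the file with a name that does not already
    # exist in `directory`, so parallel processes never share a path.
    # delete=False keeps the file on disk after the handle is closed, so the
    # path can be passed on to another tool; the caller must remove it later.
    with tempfile.NamedTemporaryFile(
        mode="w", dir=directory, prefix=prefix, suffix=suffix, delete=False
    ) as handle:
        return handle.name
```

Two calls with the same `directory` return two different paths, so concurrent jobs that share a working directory each get their own minitmp file instead of truncating each other's.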
Hope this better explains both the motivation and the fix.
Thanks. Yes, this all sounds fine to me and I will try to test when I get some time. I recall another user having an issue where the temp files were previously saved in the current working directory and they didn't have write permissions there on their system -- so as long as we keep the temp files in the output directory this should be fine. At first glance it seems like you've done that.
I did not touch the tmpdir part(s), only made temporary filenames (especially minitmp) guaranteed-unique. At least on my system tmpdir seems to be created in the working directory by default (version 1.8.13).
As far as I can tell this will work as expected - should we merge it, @nextgenusfs, or run additional tests first?
Fine to merge.