funannotate

snakemake+singularity workflow problem

Open Xueliang24 opened this issue 3 years ago • 7 comments

Are you using the latest release? I am using the latest Singularity .sif image of funannotate.

Describe the bug: I want to use a Snakemake + Singularity workflow to predict fungal genes, so I wrote a script that runs clean, sort, mask and predict. I ran it step by step to test each stage. Whenever the run includes the predict rule, it gets stuck at "Building DAG of jobs...". The script, test.py, follows:

SAMPLES, = glob_wildcards("data/{sample}.fasta")

rule all:
    input:
        expand("result/{sample}/{sample}_predict.log", sample=SAMPLES)

rule clean:
    input:
        "data/{sample}.fasta"
    output:
        "result/{sample}/{sample}_clean.fasta"
    log:
        "result/{sample}/{sample}_clean.log"
    singularity:
        "/data/hanzg/04.funannotate/funannotate.sif"
    shell:
        """
        funannotate clean -i {input} -o {output} >{log} 2>&1
        """

rule sort:
    input:
        "result/{sample}/{sample}_clean.fasta"
    output:
        "result/{sample}/{sample}_sort.fasta"
    log:
        "result/{sample}/{sample}_sort.log"
    singularity:
        "/data/hanzg/04.funannotate/funannotate.sif"
    shell:
        """
        funannotate sort -i {input} -o {output} >{log} 2>&1
        """

rule mask:
    input:
        "result/{sample}/{sample}_sort.fasta"
    output:
        "result/{sample}/{sample}_masked.fasta"
    log:
        "result/{sample}/{sample}_masked.log"
    singularity:
        "/data/hanzg/04.funannotate/funannotate.sif"
    shell:
        """
        funannotate mask -i {input} -o {output} >{log} 2>&1
        """

rule predict:
    input:
        "result/{sample}/{sample}_masked.fasta"
    output:
        directory("result/{sample}/")
    log:
        "result/{sample}/{sample}_predict.log"
    singularity:
        "/data/hanzg/04.funannotate/funannotate.sif"
    shell:
        """
        funannotate predict -i {input} -o {output} -s {sample} --name {sample} --optimize_augustus --cpus 20 > {log} 2>&1
        """

What command did you issue? snakemake -s ./test.py --use-singularity --singularity-args " --bind /data/hanzg:/data/hanzg " --cores 20

Logfiles: None, because the run always gets stuck at "Building DAG of jobs...".

OS/Install Information

Checking dependencies for 1.8.12

You are running Python v 3.8.13. Now checking python packages...
biopython: 1.79
goatools: 1.2.3
matplotlib: 3.5.2
natsort: 8.1.0
numpy: 1.22.4
pandas: 1.4.3
psutil: 5.9.1
requests: 2.28.1
scikit-learn: 1.1.1
scipy: 1.5.3
seaborn: 0.11.2
All 11 python packages installed

You are running Perl v b'5.026002'. Now checking perl modules...
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
local::lib: 2.000024
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/opt/databases
$PASAHOME=/venv/opt/pasa-2.4.1
$TRINITYHOME=/venv/opt/trinity-2.8.5
$EVM_HOME=/venv/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/usr/share/augustus/config
$GENEMARK_PATH=/venv/opt/gmes_petap
All 6 environmental variables are set

Checking external dependencies...
Traceback (most recent call last):
  File "/venv/bin/emapper.py", line 694, in <module>
    args = parse_args(parser)
  File "/venv/bin/emapper.py", line 509, in parse_args
    set_data_path(os.environ["EGGNOG_DATA_DIR"])
  File "/venv/opt/eggnog-mapper/eggnogmapper/common.py", line 77, in set_data_path
    DATA_PATH = existing_dir(data_path)
  File "/venv/opt/eggnog-mapper/eggnogmapper/common.py", line 323, in existing_dir
    raise TypeError('not a valid directory "%s"' %dname)
TypeError: not a valid directory "/opt/eggnog-mapper-data"
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.2
bamtools: bamtools 2.5.2
bedtools: bedtools v2.30.0
blat: BLAT v36
diamond: 2.0.15
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.9.1-internal
kallisto: 0.46.1
mafft: v7.505 (2022/Apr/10)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.24-r1122
pigz: pigz 2.6
proteinortho: 6.0.16
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.15
signalp: 5.0b
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 39
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: emapper.py not installed
ERROR: gmes_petap.pl not installed

Xueliang24, Aug 21 '22 02:08

@reslp Could you also help me solve this?

Xueliang24, Aug 21 '22 02:08

Sounds like eggnog mapper isn’t installed in this image? Also genemark but that shouldn’t be fatal.

hyphaltip, Aug 21 '22 19:08

Which funannotate image are you using @Xueliang24? The one I created does not include eggnog mapper or GeneMark so @hyphaltip is correct. However I don't think that this would cause snakemake to hang at building the DAG. I think this is a snakemake rather than a funannotate issue.
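
A minimal Python sketch of how the Snakefile alone could stall DAG building, independent of funannotate. It rests on assumptions not confirmed in this thread: that the predict output "result/{sample}/" is matched like the regex "result/(?P<sample>.+)" once the trailing slash is dropped, and that this is what actually happens in the run above. The ".+" part is Snakemake's default wildcard pattern, which also matches "/". The helper predict_input_for is hypothetical and only mimics the lookup.

```python
import re

# Default wildcard pattern ".+" also matches "/" (trailing slash assumed dropped).
PREDICT_OUTPUT = re.compile(r"result/(?P<sample>.+)$")
# Same pattern with the wildcard restricted to a single path component.
CONSTRAINED = re.compile(r"result/(?P<sample>[^/]+)$")


def predict_input_for(target, pattern=PREDICT_OUTPUT):
    """Return the input rule predict would request for `target`, or None."""
    match = pattern.match(target)
    if match is None:
        return None
    sample = match.group("sample")
    return f"result/{sample}/{sample}_masked.fasta"


# Follow the chain of requested inputs for a few steps.
target = "result/A/A_masked.fasta"
chain = [target]
for _ in range(3):
    target = predict_input_for(target)
    chain.append(target)

for path in chain:
    print(path)

# With the constrained pattern, predict no longer claims files inside the folder.
print(predict_input_for("result/A/A_masked.fasta", CONSTRAINED))
```

If the matching works this way, every candidate predict job asks for a longer path that predict itself again appears able to produce, so the search would never terminate. Restricting the wildcard to "[^/]+" breaks the chain.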

reslp, Aug 22 '22 06:08

I also think it is a Snakemake problem, but I could not find a solution.

Xueliang24, Aug 22 '22 07:08

The missing eggnog-mapper and GeneMark don't affect running funannotate clean, sort, mask and predict.

Xueliang24, Aug 22 '22 07:08

@Xueliang24, maybe try adding a more specific output to your predict rule (the output folder it declares is also created by the other rules), and then expand on that specific output in the all rule. Right now you expand on a log file, which is listed only as a log of the predict rule, not as an output:

rule predict:
...
    output:
         directory("result/{sample}/")
    log:
         "result/{sample}/{sample}_predict.log"
...
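
One possible shape for that change, as a sketch rather than a tested fix. The marker file "{sample}_predict.done" and the "predict" subfolder are hypothetical names chosen for illustration. The wildcard constraint is an extra assumption on my part: it keeps {sample} from matching across "/". The funannotate flags are copied from the original rule, except that {sample} becomes {wildcards.sample}, since a bare {sample} is not defined inside a shell block.

```python
SAMPLES, = glob_wildcards("data/{sample}.fasta")

# Keep {sample} to a single path component (the default wildcard regex is ".+").
wildcard_constraints:
    sample="[^/]+"

rule all:
    input:
        # Request a real output of rule predict, not its log file.
        expand("result/{sample}/{sample}_predict.done", sample=SAMPLES)

# Rules clean, sort and mask stay as in the original Snakefile.

rule predict:
    input:
        "result/{sample}/{sample}_masked.fasta"
    output:
        # Hypothetical marker file; touch() creates it after the shell command succeeds.
        touch("result/{sample}/{sample}_predict.done")
    params:
        # Hypothetical subfolder, so predict does not claim all of result/{sample}/.
        outdir="result/{sample}/predict"
    log:
        "result/{sample}/{sample}_predict.log"
    singularity:
        "/data/hanzg/04.funannotate/funannotate.sif"
    shell:
        """
        funannotate predict -i {input} -o {params.outdir} -s {wildcards.sample} --name {wildcards.sample} --optimize_augustus --cpus 20 > {log} 2>&1
        """
```

A dry run (snakemake -n -s ./test.py) should then show whether the DAG builds before anything is executed.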

And I have a question: how did you build the Singularity image? Did you use the funannotate Docker image, or did you build it yourself some other way?

spock, Aug 24 '22 15:08

I pulled the latest Singularity image from the author. After adding SignalP and setting up the database, I built it again, so it is based on the latest Singularity image.

About the more specific output: could you describe it in more detail or give an example? The other rules output files, not folders; for them, that folder is just part of the path.

Xueliang24, Aug 27 '22 07:08