Problems in funannotate predict with AUGUSTUS_CONFIG_PATH
Are you using the latest release? funannotate v1.8.13
Describe the bug Training Augustus using BUSCO gene models failed with an error that seems to be related to AUGUSTUS_CONFIG_PATH.
What command did you issue?
funannotate predict --species reticulata_predict
--input ./genome.fa
--out reticulata_predict
--transcript_evidence ./Trinity-GG.fasta
--rna_bam ./RNA_alignmentAligned.sortedByCoord.out.bam
--protein_evidence ./proteinsfishes.fasta
--other_gff ./file.gff3
--busco_db actinopterygii
--organism other
--max_intronlen 10000
--busco_seed_species zebrafish
--optimize_augustus
--repeats2evm
--cpus 16
--AUGUSTUS_CONFIG_PATH=/path/d/bin/augustus_config_system
--GENEMARK_PATH=/path/bin/gmes_linux_64_mod
--EVM_HOME=/path/bin/EVidenceModeler-v2.1.0
Logfiles fun_predic.txt augustus.log
OS/Install Information
Checking dependencies for 1.8.13
You are running Python v 3.8.15. Now checking python packages...
biopython: 1.80
goatools: 1.2.3
matplotlib: 3.4.3
natsort: 8.2.0
numpy: 1.24.1
pandas: 1.5.3
psutil: 5.9.4
requests: 2.28.2
scikit-learn: 1.2.1
scipy: 1.10.0
seaborn: 0.12.2
All 11 python packages installed

You are running Perl v b'5.032001'. Now checking perl modules...
Carp: 1.50
Clone: 0.46
DBD::SQLite: 1.72
DBD::mysql: 4.046
DBI: 1.643
DB_File: 1.855
Data::Dumper: 2.183
File::Basename: 2.85
File::Which: 1.24
Getopt::Long: 2.54
Hash::Merge: 0.302
JSON: 4.10
LWP::UserAgent: 6.67
Logger::Simple: 2.0
POSIX: 1.94
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.14
Tie::File: 1.06
URI::Escape: 5.12
YAML: 1.30
threads: 2.25
threads::shared: 1.61
ERROR: local::lib not installed, install with cpanm local::lib

Checking Environmental Variables...
$FUNANNOTATE_DB=/camp/home/safiand/home/users/safiand/funannotate_db
$PASAHOME=/camp/home/safiand/home/users/safiand/.conda/envs/funannotate/opt/pasa-2.5.2
$TRINITY_HOME=/camp/home/safiand/home/users/safiand/.conda/envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/camp/home/safiand/home/users/safiand/.conda/envs/funannotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/camp/home/safiand/home/users/safiand/.conda/envs/funannotate/config/
$GENEMARK_PATH=/camp/home/safiand/home/users/safiand/bin/gmes_linux_64_mod

Checking external dependencies...
PASA: 2.5.2
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.5.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.0.15
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2021-08-25
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 17.0.3-internal
kallisto: 0.46.1
mafft: v7.515 (2023/Jan/15)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.24-r1122
proteinortho: 6.1.7
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.16.1
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.11 (Oct 2022)
tantan: tantan 40
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
I think this was fixed; try updating to the latest release with pip in your current environment:
python -m pip install "funannotate==1.8.15" --upgrade --force --no-deps
Hi, after updating funannotate as you said, I still have the same issue: "UnboundLocalError: local variable 'AUGUSTUS_BASE' referenced before assignment".
"new_species.pl --AUGUSTUS_CONFIG_PATH=/nemo/lab/cardoso-moreiam/home/users/safiand/genome_annotation/reticulata/funannotate/reticulata_predict/predict_misc/ab_initio_parameters/augustus/ --species=BUSCO_reticulata_predict_2779326087
Could not locate command line parameters file: /nemo/lab/cardoso-moreiam/home/users/safiand/genome_annotation/reticulata/funannotate/reticulata_predict/predict_misc/ab_initio_parameters/augustus/parameters/aug_cmdln_parameters.json.
etraining --species=BUSCO_reticulata_predict_2779326087 /nemo/lab/cardoso-moreiam/home/users/safiand/genome_annotation/reticulata/funannotate/reticulata_predict/predict_misc/busco/run_reticulata_predict/augustus_output/training_set_reticulata_predict.txt
augustus: ERROR /nemo/lab/cardoso-moreiam/home/users/safiand/genome_annotation/reticulata/funannotate/reticulata_predict/predict_misc/ab_initio_parameters/augustus/topCodonExcludedFromCDS=False/ is not a directory. Could not locate directory AUGUSTUS_CONFIG_PATH."
It is a shame, because I really wanted to compare the BRAKER results with the funannotate pipeline, as BRAKER is giving me gene models that are too short (10 kb in a eukaryote).
So what is in your $AUGUSTUS_CONFIG_PATH? It expects the standard Augustus folder structure, i.e.:
$ ls -l $AUGUSTUS_CONFIG_PATH
total 0
drwxr-xr-x 9 jon staff 288B Jul 4 2022 cgp/
drwxr-xr-x 14 jon staff 448B Jul 4 2022 extrinsic/
drwxr-xr-x 27 jon staff 864B Jul 4 2022 model/
drwxr-xr-x 4 jon staff 128B Jul 4 2022 parameters/
drwxr-xr-x 5 jon staff 160B Jul 4 2022 profile/
drwxr-xr-x 171 jon staff 5.3K Sep 5 2022 species/
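A quick way to compare your own folder against the listing above is a small shell check. This is only a sketch: the function name `check_augustus_config` is made up for illustration, and the six subfolder names are simply taken from the listing, not from any official Augustus requirement.

```shell
# Report which of the subfolders from the listing above are missing from an
# Augustus config directory. Prints nothing and returns 0 if all six exist.
# (check_augustus_config is a hypothetical helper, not part of funannotate.)
check_augustus_config() {
    local config_dir="${1%/}"   # drop one trailing slash, if any
    local missing=0
    local sub
    for sub in cgp extrinsic model parameters profile species; do
        if [ ! -d "$config_dir/$sub" ]; then
            echo "missing: $config_dir/$sub"
            missing=1
        fi
    done
    return "$missing"
}

# Example call; only runs when the variable is set in the current shell.
if [ -n "${AUGUSTUS_CONFIG_PATH:-}" ]; then
    check_augustus_config "$AUGUSTUS_CONFIG_PATH" \
        || echo "AUGUSTUS_CONFIG_PATH does not match the layout shown above"
fi
```

If the function prints any `missing:` lines, the variable is probably pointing at something other than a complete Augustus config folder.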
It also expects that you haven't installed this somewhere else, i.e. $AUGUSTUS_BASE refers to the directory one level up from $AUGUSTUS_CONFIG_PATH, where the scripts folder is located.
$ ls -l $AUGUSTUS_CONFIG_PATH/../
total 312
-rw-r--r-- 1 jon staff 1.7K Jul 4 2022 Dockerfile
-rw-r--r--@ 1 jon staff 2.4K Jul 4 2022 Makefile
-rw-r--r-- 1 jon staff 4.0K Jul 4 2022 README.md
-rw-r--r-- 1 jon staff 2.2K Jul 4 2022 Singularity.def
drwxr-xr-x 13 jon staff 416B Jul 4 2022 auxprogs/
drwxr-xr-x 15 jon staff 480B Oct 18 23:11 bin/
-rw-r--r--@ 1 jon staff 2.9K Jul 4 2022 common.mk
drwxr-xr-x 9 jon staff 288B Sep 5 2022 config/
drwxr-xr-x 31 jon staff 992B Oct 18 23:08 docs/
-rw-r--r-- 1 jon staff 107K Jul 4 2022 doxygen.conf
drwxr-xr-x 15 jon staff 480B Jul 4 2022 examples/
drwxr-xr-x 55 jon staff 1.7K Oct 18 23:08 include/
drwxr-xr-x 33 jon staff 1.0K Jul 4 2022 mansrc/
-rw-r--r-- 1 jon staff 27K Jul 4 2022 retraining.html
drwxr-xr-x 113 jon staff 3.5K Oct 18 23:08 scripts/
drwxr-xr-x 115 jon staff 3.6K Oct 18 23:10 src/
drwxr-xr-x 5 jon staff 160B Jul 4 2022 tests/
The only bit that is important here is the scripts folder, as funannotate needs to be able to find some of those scripts if they are not installed in your $PATH. So you either need to ensure that the Augustus scripts directory is in your $PATH, or set up your Augustus install as recommended by the Augustus developers.
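For the first option, something along these lines should work. It is a rough sketch that assumes the source-tree layout shown above, with `scripts/` sitting one directory up from the config folder; if your install keeps the scripts elsewhere, substitute that location. The function name `add_augustus_scripts_to_path` is made up for illustration, and `new_species.pl` is used as the probe only because it is the script named in the error output.

```shell
# Prepend the Augustus scripts folder to PATH, assuming it sits one directory
# up from the config folder (as in the source-tree listing above).
# (add_augustus_scripts_to_path is a hypothetical helper name.)
add_augustus_scripts_to_path() {
    local config_dir="${1%/}"
    local scripts_dir="$config_dir/../scripts"
    if [ ! -d "$scripts_dir" ]; then
        echo "no scripts folder at $scripts_dir" >&2
        return 1
    fi
    # Resolve to an absolute path so PATH does not depend on the current directory.
    scripts_dir="$(cd "$scripts_dir" && pwd)"
    case ":$PATH:" in
        *":$scripts_dir:"*) ;;                        # already on PATH, nothing to do
        *) PATH="$scripts_dir:$PATH"; export PATH ;;
    esac
}

# Example call; only runs when the variable is set in the current shell.
if [ -n "${AUGUSTUS_CONFIG_PATH:-}" ]; then
    add_augustus_scripts_to_path "$AUGUSTUS_CONFIG_PATH" \
        && command -v new_species.pl \
        || echo "new_species.pl not found on PATH"
fi
```

Run it in the same shell session (or job script) that launches funannotate predict, since the PATH change only applies to that session.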
Also, EvidenceModeler v2.1.0 is not supported, so you'll need to downgrade that to v1.1.1. I just noticed a few days ago that the command line options have changed in v2.1.0.
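Your dependency check already reports an evidencemodeler-1.1.1 folder under the conda environment's `opt/` directory, so one possible way to switch back is to point `EVM_HOME` at that copy and leave the `--EVM_HOME=` override out of the predict command. A minimal sketch, assuming the funannotate conda environment is active so that `CONDA_PREFIX` points at it:

```shell
# Assumes the funannotate conda environment is active, so CONDA_PREFIX is its root.
# The folder name comes from the $EVM_HOME line in the dependency check output.
export EVM_HOME="${CONDA_PREFIX:-}/opt/evidencemodeler-1.1.1"

# Warn if the folder is not where this sketch assumes it is.
[ -d "$EVM_HOME" ] || echo "EVM_HOME not found: $EVM_HOME" >&2
```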
Hi, Thanks a lot. After adding the Augustus scripts folder to the PATH, funannotate predict has been running fine and my issue seems to be resolved, as Augustus is being trained at the moment. I hope I can get longer gene models. Thanks again, Diego
Hi, I am loving this pipeline. It produces better gene models and a more complete annotation. Thanks!