
funannotate setup error

Open WJunHao opened this issue 3 years ago • 5 comments

Good day! When I was trying to download the databases for funannotate, I got an error. The command was `funannotate setup -d Fanno_db/ --install all`. The log file is as follows:

[Feb 25 08:03 PM]: OS: CentOS Linux 7, 24 cores, ~ 197 GB RAM. Python: 3.8.12
[Feb 25 08:03 PM]: Running 1.8.9
[Feb 25 08:03 PM]: Database location: Fanno_db/
[Feb 25 08:03 PM]: Retrieving download links from GitHub Repo
[Feb 25 08:03 PM]: Unable to download links from GitHub, using funannotate version specific links
[Feb 25 08:03 PM]: Parsing Augustus pre-trained species and porting to funannotate
[Feb 25 08:03 PM]: Downloading Merops database
[Feb 25 08:03 PM]: Downloading: ftp://ftp.ebi.ac.uk/pub/databases/merops/current_release/merops_scan.lib Bytes: 1957603
[Feb 25 08:03 PM]: Building diamond database
[Feb 25 08:03 PM]: MEROPS Database: version=12.0 date=2017-10-04 records=5,009
[Feb 25 08:03 PM]: Downloading UniProtKB/SwissProt database
[Feb 25 08:03 PM]: Downloading: ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz Bytes: 90527596
[Feb 25 08:04 PM]: Downloading: ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/reldate.txt Bytes: 151
[Feb 25 08:04 PM]: Building diamond database
[Feb 25 08:04 PM]: UniProtKB Database: version=2021_04 date=2021-11-17 records=565,928
[Feb 25 08:04 PM]: Downloading dbCAN database
[Feb 25 08:04 PM]: Downloading: http://bcb.unl.edu/dbCAN2/download/Databases/dbCAN-HMMdb-V8.txt Bytes: 94320446
[Feb 25 08:30 PM]: Downloading: http://bcb.unl.edu/dbCAN2/download/Databases/CAZyDB.07312019.fam-activities.txt Bytes: 63489
[Feb 25 08:30 PM]: Downloading: http://bcb.unl.edu/dbCAN2/download/Databases/dbCAN-old@UGA/readme.txt Bytes: 8098
[Feb 25 08:30 PM]: Creating dbCAN HMM database
[Feb 25 08:30 PM]: dbCAN Database: version=10.0 date=2021-10-03 records=641
[Feb 25 08:30 PM]: Downloading Pfam database
[Feb 25 08:30 PM]: Downloading: ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.hmm.gz Bytes: 293000230
[Feb 25 08:31 PM]: Downloading: ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.clans.tsv.gz Bytes: 358156
[Feb 25 08:31 PM]: Downloading: ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam.version.gz Bytes: 114
[Feb 25 08:31 PM]: Creating Pfam HMM database
[Feb 25 08:32 PM]: Pfam Database: version=35.0 date=2021-11 records=19,632
[Feb 25 08:32 PM]: Downloading Repeat database
[Feb 25 08:32 PM]: Downloading: https://osf.io/vp87c/download?version=1 Bytes: 6325661
[Feb 25 08:32 PM]: Building diamond database
[Feb 25 08:32 PM]: Repeat Database: version=1.0 date=2022-02-25 records=11,950
[Feb 25 08:32 PM]: Downloading GO Ontology database
[Feb 25 08:32 PM]: Downloading: http://purl.obolibrary.org/obo/go.obo Bytes: 33861868
[Feb 25 08:32 PM]: GO ontology version=2022-01-13 date=2022-01-13 records=47,266
[Feb 25 08:32 PM]: Downloading MiBIG Secondary Metabolism database
[Feb 25 08:32 PM]: Downloading: https://dl.secondarymetabolites.org/mibig/mibig_prot_seqs_1.4.fasta Bytes: 21244378
[Feb 25 08:42 PM]: Building diamond database
[Feb 25 08:42 PM]: MiBIG Database: version=1.4 date=2022-02-25 records=31,023
[Feb 25 08:42 PM]: Downloading InterProScan Mapping file
[Feb 25 08:42 PM]: Downloading: ftp://ftp.ebi.ac.uk/pub/databases/interpro/interpro.xml.gz Bytes: 32406991
[Feb 25 08:42 PM]: Downloading: ftp://ftp.ebi.ac.uk/pub/databases/interpro/entry.list Bytes: 2163295
[Feb 25 08:43 PM]: InterProScan XML: version=87.0 date=2021-09-30 records=40,037
[Feb 25 08:43 PM]: Downloading pre-computed BUSCO outgroups
[Feb 25 08:43 PM]: Downloading: https://osf.io/r9sne/download?version=1 Bytes: 2374032
[Feb 25 08:43 PM]: BUSCO outgroups: version=1.0 date=2022-02-25 records=8
[Feb 25 08:43 PM]: Downloaded curated gene names and product descriptions
Traceback (most recent call last):
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/urllib/request.py", line 1354, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/http/client.py", line 1302, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/http/client.py", line 1251, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/http/client.py", line 1011, in _send_output
    self.send(msg)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/http/client.py", line 951, in send
    self.connect()
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/http/client.py", line 1418, in connect
    super().connect()
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/http/client.py", line 922, in connect
    self.sock = self._create_connection(
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/socket.py", line 808, in create_connection
    raise err
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/socket.py", line 796, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/public/home/miniconda3/envs/Fanno/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/funannotate.py", line 705, in main
    mod.main(arguments)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 710, in main
    curatedDB(DatabaseInfo, args.force, args=args)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 466, in curatedDB
    download(DBURL.get('gene2product'), curatedFile)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 70, in download
    u = urlopen(url)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/urllib/request.py", line 1397, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/urllib/request.py", line 1357, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 111] Connection refused>

Next, I ran the same command again. The log file is as follows:

[Feb 25 09:13 PM]: OS: CentOS Linux 7, 24 cores, ~ 197 GB RAM. Python: 3.8.12
[Feb 25 09:13 PM]: Running 1.8.9
[Feb 25 09:13 PM]: Database location: Fanno_db/
[Feb 25 09:13 PM]: Retrieving download links from GitHub Repo
[Feb 25 09:13 PM]: Unable to download links from GitHub, using funannotate version specific links
[Feb 25 09:13 PM]: Parsing Augustus pre-trained species and porting to funannotate
Traceback (most recent call last):
  File "/public/home/miniconda3/envs/Fanno/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/funannotate.py", line 705, in main
    mod.main(arguments)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 694, in main
    meropsDB(DatabaseInfo, args.force, args=args)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 139, in meropsDB
    type, name, version, date, records, checksum = info.get('merops')
TypeError: cannot unpack non-iterable NoneType object

Currently, I have the following files in my Fanno_db folder: busco_outgroups.tar.gz, dbCAN.changelog.txt, dbCAN-fam-HMMs.txt, dbCAN.hmm, dbCAN.hmm.h3f, dbCAN.hmm.h3i, dbCAN.hmm.h3m, dbCAN.hmm.h3p, funannotate.repeat.proteins.fa, funannotate.repeat.proteins.fa.tar.gz, funannotate.repeats.reformat.fa, go.obo, interpro.tsv, interpro.xml, merops.dmnd, merops.formatted.fa, merops_scan.lib, mibig.dmnd, mibig.fa, outgroups, Pfam-A.clans.tsv, Pfam-A.hmm, Pfam-A.hmm.h3f, Pfam-A.hmm.h3i, Pfam-A.hmm.h3m, Pfam-A.hmm.h3p, Pfam.version, repeats.dmnd, trained_species, uniprot.dmnd, uniprot.release-date.txt, uniprot_sprot.fasta.

WJunHao, Feb 25 '22 15:02

It looks like you are behind a firewall, and either Python is not using your proxy server correctly or no proxy is configured. Try the --wget option.
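To check whether a proxy is configured at all in the shell that runs funannotate, something like the following may help. It is only a minimal sketch and not part of funannotate; `proxy_report` is a hypothetical helper name, and the variable names it inspects are the conventional ones, which I am assuming your cluster also uses.

```python
import os
import urllib.request


def proxy_report(environ=None):
    """Return the proxy-related variables found in an environment mapping.

    Hypothetical helper for illustration; it only inspects the mapping it is given.
    """
    environ = os.environ if environ is None else environ
    names = ("http_proxy", "https_proxy", "ftp_proxy", "no_proxy")
    found = {}
    for name in names:
        # Tools differ on case, so look at both spellings.
        for variant in (name, name.upper()):
            value = environ.get(variant)
            if value:
                found[variant] = value
    return found


if __name__ == "__main__":
    report = proxy_report()
    if report:
        for key, value in sorted(report.items()):
            print(f"{key}={value}")
    else:
        print("No proxy variables are set in this shell.")
    # urllib's own view of the proxies it would use for urlopen().
    print("urllib.request.getproxies():", urllib.request.getproxies())
```

If nothing is reported and the machine has no direct internet access, both the Python downloader and wget would be expected to fail in the same way.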

nextgenusfs, Feb 25 '22 16:02

Thank you for your reply. I tried the --wget option. The command was `funannotate setup --wget -d Fanno_db/ --install all`. The log file is as follows:

[Feb 26 06:04 PM]: OS: CentOS Linux 7, 24 cores, ~ 197 GB RAM. Python: 3.8.12
[Feb 26 06:04 PM]: Running 1.8.9
[Feb 26 06:04 PM]: Database location: Fanno_db/
[Feb 26 06:04 PM]: Retrieving download links from GitHub Repo
[Feb 26 06:04 PM]: Unable to download links from GitHub, using funannotate version specific links
[Feb 26 06:04 PM]: Parsing Augustus pre-trained species and porting to funannotate
Traceback (most recent call last):
  File "/public/home/miniconda3/envs/Fanno/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/funannotate.py", line 705, in main
    mod.main(arguments)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 694, in main
    meropsDB(DatabaseInfo, args.force, args=args)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 139, in meropsDB
    type, name, version, date, records, checksum = info.get('merops')
TypeError: cannot unpack non-iterable NoneType object

I saw that any of the databases can also be installed individually: merops, uniprot, dbCAN, pfam, repeats, go, mibig, interpro, busco_outgroups, gene2product. As I did not see a gene2product file in the Fanno_db folder, I tried the following command: `funannotate setup --wget -d Fanno_db/ --install gene2product`. The log file is as follows:

[Feb 26 05:58 PM]: OS: CentOS Linux 7, 24 cores, ~ 197 GB RAM. Python: 3.8.12
[Feb 26 05:58 PM]: Running 1.8.9
[Feb 26 05:58 PM]: Database location: Fanno_db/
[Feb 26 05:58 PM]: Retrieving download links from GitHub Repo
[Feb 26 05:58 PM]: Unable to download links from GitHub, using funannotate version specific links
[Feb 26 05:58 PM]: Parsing Augustus pre-trained species and porting to funannotate
[Feb 26 05:58 PM]: Downloaded curated gene names and product descriptions
--2022-02-26 17:58:08-- https://raw.githubusercontent.com/nextgenusfs/gene2product/master/ncbi_cleaned_gene_products.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 0.0.0.0, ::
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|0.0.0.0|:443... failed: Connection refused.
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|::|:443... failed: Connection refused.
Traceback (most recent call last):
  File "/public/home/tanglei/.wjh/miniconda3/envs/Fanno/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/public/home/tanglei/.wjh/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/funannotate.py", line 705, in main
    mod.main(arguments)
  File "/public/home/tanglei/.wjh/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 710, in main
    curatedDB(DatabaseInfo, args.force, args=args)
  File "/public/home/tanglei/.wjh/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 479, in curatedDB
    curdate = datetime.datetime.strptime(
  File "/public/home/tanglei/.wjh/miniconda3/envs/Fanno/lib/python3.8/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/public/home/tanglei/.wjh/miniconda3/envs/Fanno/lib/python3.8/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '' does not match format '%m-%d-%Y'

Based on the provided URL, I ran the following command: `wget https://raw.githubusercontent.com/nextgenusfs/gene2product/master/ncbi_cleaned_gene_products.txt`. The output is as follows:

--2022-02-26 18:04:58-- https://raw.githubusercontent.com/nextgenusfs/gene2product/master/ncbi_cleaned_gene_products.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 0.0.0.0, ::
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|0.0.0.0|:443... failed: Connection refused.
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|::|:443... failed: Connection refused.

So it could be that the server's firewall or something else is blocking access, because my Windows computer can open the URL. I saved the file there under its default name, ncbi_cleaned_gene_products.txt, and placed it in the Fanno_db folder. Can the other databases be downloaded in this way? Do I need to rename the files or take any other steps?

WJunHao, Feb 26 '22 10:02

Hi. I ran the test command, `funannotate test -t predict`. The log file is as follows:

Running funannotate predict unit testing
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 2 --species Awesome testicus
#########################################################
[Feb 27 02:48 PM]: OS: CentOS Linux 7, 24 cores, ~ 197 GB RAM. Python: 3.8.12
[Feb 27 02:48 PM]: Running funannotate v1.8.9
[Feb 27 02:48 PM]: ERROR: dikarya busco database is not found, install with funannotate setup -b dikarya
#########################################################
Traceback (most recent call last):
  File "/public/home/miniconda3/envs/Fanno/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/funannotate.py", line 705, in main
    mod.main(arguments)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/test.py", line 405, in main
    runPredictTest(args)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/test.py", line 160, in runPredictTest
    assert 1500 <= countGFFgenes(os.path.join(
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/test.py", line 45, in countGFFgenes
    with open(input, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test-predict_e359a1d8-bccd-4bc4-84ca-3701d451a8cb/annotate/predict_results/Awesome_testicus.gff3'

It may be a problem with the BUSCO database. I then ran `funannotate database --show-buscos`. The output is as follows:

BUSCO DB tree: (# of models)
eukaryota (303)
metazoa (978)
nematoda (982)
arthropoda (1066)
insecta (1658)
endopterygota (2442)
hymenoptera (4415)
diptera (2799)
vertebrata (2586)
actinopterygii (4584)
tetrapoda (3950)
aves (4915)
mammalia (4104)
euarchontoglires (6192)
laurasiatheria (6253)
fungi (290)
dikarya (1312)
ascomycota (1315)
pezizomycotina (3156)
eurotiomycetes (4046)
sordariomycetes (3725)
saccharomycetes (1759)
saccharomycetales (1711)
basidiomycota (1335)
microsporidia (518)
embryophyta (1440)
protists (215)
alveolata_stramenophiles (234)

When I tried to reinstall the BUSCO databases with `funannotate setup -b all -d Fanno_db/ -w`, the same error reappeared:

[Feb 27 03:00 PM]: OS: CentOS Linux 7, 24 cores, ~ 197 GB RAM. Python: 3.8.12
[Feb 27 03:00 PM]: Running 1.8.9
[Feb 27 03:00 PM]: Database location: Fanno_db/
[Feb 27 03:00 PM]: Retrieving download links from GitHub Repo
[Feb 27 03:00 PM]: Unable to download links from GitHub, using funannotate version specific links
[Feb 27 03:00 PM]: Parsing Augustus pre-trained species and porting to funannotate
Traceback (most recent call last):
  File "/public/home/miniconda3/envs/Fanno/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/funannotate.py", line 705, in main
    mod.main(arguments)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 694, in main
    meropsDB(DatabaseInfo, args.force, args=args)
  File "/public/home/miniconda3/envs/Fanno/lib/python3.8/site-packages/funannotate/setupDB.py", line 139, in meropsDB
    type, name, version, date, records, checksum = info.get('merops')
TypeError: cannot unpack non-iterable NoneType object

Are there alternative links from which these databases can be downloaded?

WJunHao, Feb 27 '22 07:02

The download links are here: https://github.com/nextgenusfs/funannotate/blob/master/funannotate/downloads.json. You should be able to download the files into the setup directory; if a file is already present, funannotate will use it instead of trying to download it.
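To get a flat list of links out of a saved copy of that JSON file, a short script along these lines might be useful. It is a sketch under assumptions: it does not rely on the exact layout of downloads.json, it simply walks whatever nesting is there and keeps every string that looks like an http, https, or ftp URL. `collect_urls` and `list_downloads` are hypothetical names, and the script says nothing about which local file names funannotate expects.

```python
import json
from urllib.parse import urlparse


def collect_urls(node, path=()):
    """Yield (key_path, url) for every URL-like string in nested JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from collect_urls(value, path + (str(key),))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from collect_urls(value, path + (str(index),))
    elif isinstance(node, str) and urlparse(node).scheme in ("http", "https", "ftp"):
        # Non-URL strings (for example checksums) are skipped.
        yield "/".join(path), node


def list_downloads(json_path):
    """Load a local copy of the JSON file and return its URLs as a list."""
    with open(json_path) as handle:
        return list(collect_urls(json.load(handle)))
```

Calling `list_downloads("downloads.json")` on a machine that can reach GitHub gives the links to fetch there; the downloaded files can then be copied over to the server.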

nextgenusfs, Mar 02 '22 19:03

I had the same problem and solved it by adding the correct IP address of raw.githubusercontent.com to the hosts file:

https://www.cnblogs.com/sinferwu/p/12726833.html
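The wget output earlier in this thread shows raw.githubusercontent.com resolving to 0.0.0.0 and ::, which are not addresses of a real remote host. A rough way to check what a hostname resolves to on the server is sketched below; `is_unusable`, `resolve`, and `diagnose` are hypothetical helper names, and treating loopback addresses as unusable is my own assumption about how blocked hosts are commonly mapped.

```python
import ipaddress
import socket


def is_unusable(address):
    """True if the address is unspecified (0.0.0.0, ::) or loopback."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(address.split("%")[0])
    return ip.is_unspecified or ip.is_loopback


def resolve(hostname, port=443):
    """Return the sorted addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def diagnose(hostname):
    """Summarize in one line how hostname resolves on this machine."""
    try:
        addresses = resolve(hostname)
    except socket.gaierror as err:
        return f"{hostname}: resolution failed ({err})"
    bad = [a for a in addresses if is_unusable(a)]
    if bad:
        return f"{hostname}: resolves to unusable address(es) {', '.join(bad)}"
    return f"{hostname}: resolves to {', '.join(addresses)}"
```

Running `print(diagnose("raw.githubusercontent.com"))` before and after editing the hosts file should show whether the change took effect.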

zjhcarbonic, Aug 12 '22 12:08