HMM file format error when using custom library in FASTA format
I am trying to run RepeatMasker with a custom library downloaded from a database. It is in FASTA format:
>ORSgTETNOOT01930 gi|14578149|nt85894-86033 unclassified transposon
GGCTGCGTTTAGATCCAAAGTTTGGATCCAAACTTCAGTCCTTTTCCATCACATCAACCT
GTCATACACATAAAACTTTTCAGTCACATCATCTTTAATTTCAACCAAAATCCAAACTTT
GCGCTGAACTAAACACAGAC
>ORSgCMCM00201320 gi|28460675|nt108249-108396 putative centromere sequence, CentO/CentC-like
ATATTAGCCCACACGGGTGCGATGTTTTTGACCAGAATGAAAATGTTCAAAAAACACCAA
AGCATGATTTTTGGACTTATTGGAGTGTATTGGGTGCGTTCGTGGCAAATACTCAATTCA
TGATTCGCGCGGCGAACTTTTGTCAATT
My code:
RepeatMasker -xsmall -pa 12 -lib TIGR_Oryza_Repeats_v_3_3.fna azucena.fna
I am getting an error:
RepeatMasker version 4.1.2-p1
Search Engine: HMMER [ 3.3.2 (Nov 2020) ]
RepeatMasker::createLib(): Error invoking /share/apps/hmmer/3.3.2/intel/bin/hmmpress on file /scratch/ak8725/genomes/RM_3160160.FriMay261649112023/TIGR_Oryza_Repeats_v_3_3.fna.
An additional hmmPress.log file is created:
Error: File format problem in trying to open HMM file /scratch/ak8725/genomes/RM_3145390.FriMay261638102023/TIGR_Oryza_Repeats_v_3_3.fna.
Format tag is '>ORSgTETNOOT01930': unrecognized.
Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported.
I don't understand what I did wrong. The documentation says that the -lib option expects a FASTA-formatted library, which is what I provided.
The problem is that your RepeatMasker installation was configured to use nhmmer (HMMER) as the default search engine, but you are searching with a consensus sequence library rather than a profile hidden Markov model library. With HMMER as the engine, RepeatMasker hands the -lib file to hmmpress, which expects an HMM file and therefore rejects the FASTA header as an unrecognized format tag. You can use the "-engine" option to switch to a search engine that works with consensus sequences (e.g. "-engine crossmatch" or "-engine ncbi", if you have installed phrap/crossmatch or rmblast, respectively).
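As a rough sketch of how to apply this, the snippet below guesses the library type from its first line and suggests a matching "-engine" value. The helper names lib_type and engine_for_lib are hypothetical (they are not part of RepeatMasker), and the choice of "ncbi" assumes rmblast is installed; substitute "crossmatch" if that is what you have.

```shell
#!/bin/sh
# Hypothetical helper (not part of RepeatMasker): guess the library type
# from the first line of the file. FASTA records start with ">", while
# HMMER profile files start with a format tag such as "HMMER3/f".
lib_type() {
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        ">"*)   echo "fasta" ;;
        HMMER*) echo "hmm" ;;
        *)      echo "unknown" ;;
    esac
}

# Hypothetical helper: suggest a -engine value for the library.
# Assumption: rmblast is installed, so "ncbi" is usable for FASTA libraries.
engine_for_lib() {
    case "$(lib_type "$1")" in
        fasta) echo "ncbi" ;;
        hmm)   echo "hmmer" ;;
        *)     echo "unsupported library format: $1" >&2; return 1 ;;
    esac
}

# Example use with the files from the question (not run here):
#   engine=$(engine_for_lib TIGR_Oryza_Repeats_v_3_3.fna) &&
#   RepeatMasker -engine "$engine" -xsmall -pa 12 \
#       -lib TIGR_Oryza_Repeats_v_3_3.fna azucena.fna
```

For the library in the question, the first line begins with ">", so the sketch would suggest "ncbi", which amounts to adding "-engine ncbi" to the original command.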