Can I concatenate the EDTA library and the Repbase library manually?
Hi Professor Ou:
I would like to ask whether I can concatenate the EDTA library and the Repbase library and then run RepeatMasker manually. I was glad to find that EDTA significantly reduces the percentage of unclassified repeats in the results. However, compared to the results from the RepeatModeler custom library + Repbase library, the percentage of the genome masked as repeats in EDTA's results is lower. Here are my results:
When I run EDTA with the following command:
perl ../EDTA.pl --genome genome.fa --overwrite 1 --sensitive 1 --anno 1 --threads 30
I got this result (~31%):
But when I concatenate the Repbase and RepeatModeler custom libraries and then run RepeatMasker:
cat my_genome-families.fa RepeatMaskerLib.fasta > combine.fasta
RepeatMasker -pa 28 -s -lib combine.fasta -dir RMasker -e rmblast my_genomic.fna
I got this result (~40%):
And when I run RepeatMasker with only the RepeatModeler custom library (the Repbase library is not included):
RepeatMasker -pa 28 -s -lib my_genome-families.fa -dir RMasker -e rmblast my_genomic.fna
I got this result (~35%):
Is it possible that the lack of the Repbase library is causing the lower masked percentage? Can I concatenate the EDTA library and the Repbase library to improve it? Or are there other reasons for the difference? Thanks in advance for your help.
Best regards, Hao
Hello Hao,
It's possible, but it also increases your chance of introducing false annotations. Please check the extra sequence annotations against your other genome features, such as genes. Using EDTA with --sensitive 1 automatically activates RepeatModeler for further annotation. If you have a curated library, like Repbase, you can provide it to EDTA through --curatedlib.
Thanks, Shujun
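For reference, the --curatedlib suggestion might look roughly like the sketch below. It reuses the flags from the EDTA command in the question and only adds --curatedlib. The file names are assumptions carried over from the commands above: genome.fa is the genome, and RepeatMaskerLib.fasta is taken to be the Repbase-derived FASTA. The snippet only prints the command so the paths can be checked first. Whether the full Repbase library is appropriate as a curated library for this species is a separate judgment call.

```shell
# Sketch only: file names are assumptions taken from the commands in the question.
genome="genome.fa"                    # genome FASTA passed to EDTA
curated_lib="RepeatMaskerLib.fasta"   # assumed to be the Repbase-derived library

# Same flags as the original EDTA run, plus --curatedlib.
edta_cmd="perl ../EDTA.pl --genome $genome --curatedlib $curated_lib --overwrite 1 --sensitive 1 --anno 1 --threads 30"

# Print the command for review; run it once the paths are confirmed.
echo "$edta_cmd"
# eval "$edta_cmd"
```

Comparing the masked percentage from this run with the ~31% and ~40% figures above would show how much of the gap the curated library accounts for.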