Feat: Pangenome graph module
This is a request for a new feature rule that uses PPanGGOLiN to create pangenome graphs for the genome projects.
I think we should use the Roary-defined gene families as input to PPanGGOLiN. This would make sure the gene family IDs are consistent with EggNOG and other annotations.
This can be achieved by providing your own gene families to PPanGGOLiN instead of letting it cluster the genes itself.
Step 1: Use GFF annotations, just like the Roary input
ppanggolin annotate --anno ORGANISM_ANNOTATION_LIST
Step 2: Provide gene families
ppanggolin cluster -p pangenome.h5 --clusters MY_CLUSTERS_FILE
MY_CLUSTERS_FILE should be created from the Roary output.
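A rough sketch of how the clusters file could be built from Roary, in case it helps. It assumes the input is Roary's clustered_proteins file, where each line looks like "family_name: gene1<TAB>gene2<TAB>...", and that PPanGGOLiN accepts a tab-separated file with the family ID in the first column and the gene ID in the second. The function names and file paths are hypothetical, and the gene IDs must match the ones PPanGGOLiN reads from the GFF files, so both assumptions should be checked against the installed versions.

```python
import csv


def parse_clustered_proteins(lines):
    """Yield (family_id, gene_id) pairs from Roary clustered_proteins lines.

    Assumed line layout: "family_name: gene1<TAB>gene2<TAB>...".
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        family, sep, members = line.partition(": ")
        if not sep:
            raise ValueError(f"Unexpected line without 'family: genes': {line!r}")
        for gene in members.split("\t"):
            gene = gene.strip()
            if gene:
                yield family.strip(), gene


def write_clusters_tsv(roary_file, out_file):
    """Write one 'family<TAB>gene' row per gene and return the row count."""
    count = 0
    with open(roary_file) as src, open(out_file, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for family, gene in parse_clustered_proteins(src):
            writer.writerow([family, gene])
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical paths; adjust to the project layout.
    n = write_clusters_tsv("clustered_proteins", "MY_CLUSTERS_FILE.tsv")
    print(f"Wrote {n} gene-to-family rows")
```

The resulting file would then be passed to the --clusters option in Step 2.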
An experimental feature is now available at: https://github.com/NBChub/bgcflow/tree/ppanggolin3
Usage:
- check out the experimental branch:
git checkout ppanggolin3
- set ppanggolin to TRUE in the project config file
- run the workflow using:
bgcflow run --snakefile workflow/Ppanggolin -n
Thanks. I will try this and let you know