Distanceless distance indexing is very slow

Open · xchang1 opened this issue 5 months ago • 1 comment

I indexed a cabbage pangenome graph and it took ~25 days. The final distanceless distance index was 2.6 GB. There are 16,734,435 snarls, the largest has 227,992 children, and the maximum depth is 14. This is pretty big and complex, but it is still way slower than I'd expect, so I'm worried that it's doing extra distance-finding work.

xchang1 · Aug 19 '25 08:08

That is indeed suspicious.

Do you have a sense yet of how much time is going into the IntegratedSnarlFinder (which I guess is meant to be most of what a distanceless distance index build involves) versus other stuff?

adamnovak · Aug 27 '25 15:08