How could SPANN load DEEP-1B (358 GB) into memory with only 128 GB RAM (as reported in the SPANN paper)?
Hi, I'm reading the source code and paper of SPANN.
In /AnnService/src/IndexBuilder/main.cpp, we can see that the data is loaded in DefaultVectorReader::GetVectorSet().
In that function, all vectors are read with a single ReadBinary() call.
But if the original file is larger than the available RAM, how can that work? The SPANN paper reports using only 128 GB of RAM for DEEP-1B, whose base-point file is 358 GB according to big-ann-benchmarks (1B points × 96 dims × 4 bytes per float ≈ 358 GiB).
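For comparison, here is a rough sketch of what a bounded-memory loader could look like. This is my own code, not SPANN's, and it assumes the big-ann-benchmarks .fbin layout (a uint32 point count, a uint32 dimension, then row-major float32 vectors); GetVectorSet() doesn't seem to do anything like this:

```cpp
// Hypothetical sketch (not SPANN's actual loader): stream an .fbin file in
// fixed-size batches so peak RAM stays bounded regardless of file size.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <vector>

int main(int argc, char** argv) {
    if (argc < 2) { std::fprintf(stderr, "usage: %s <base.fbin>\n", argv[0]); return 1; }

    std::ifstream in(argv[1], std::ios::binary);
    uint32_t count = 0, dim = 0;
    in.read(reinterpret_cast<char*>(&count), sizeof(count));
    in.read(reinterpret_cast<char*>(&dim), sizeof(dim));

    const size_t batchVectors = 1 << 20;           // ~1M vectors per batch
    std::vector<float> batch(batchVectors * dim);  // buffer reused across batches

    for (uint64_t done = 0; done < count; ) {
        size_t n = static_cast<size_t>(std::min<uint64_t>(batchVectors, count - done));
        in.read(reinterpret_cast<char*>(batch.data()),
                static_cast<std::streamsize>(n) * dim * sizeof(float));
        // ... process/cluster/flush these n vectors here ...
        done += n;
    }
    return 0;
}
```

With 1M vectors of 96 float32 dims per batch, peak buffer usage is about 384 MB no matter how large the file is.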
Hi, have you solved this problem? I've run into the same issue.
No, I didn't. I suspect the team used a larger memory footprint than described in the paper. That doesn't mean they were wrong: the memory figures in the paper refer only to the search procedure, so the index-building step may well need more memory. Just a guess...
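One other thought: a loader could in principle mmap the base file instead of reading it all in, letting the kernel page vectors in on demand, so the mapped size can exceed physical RAM. Again, just a sketch of what I mean (POSIX, same assumed .fbin layout), not something I've verified in the SPANN code:

```cpp
// Hypothetical alternative: memory-map the file read-only; pages are loaded
// lazily on first access, so virtual size can exceed physical RAM.
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char** argv) {
    if (argc < 2) { std::fprintf(stderr, "usage: %s <base.fbin>\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { std::perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    void* base = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) { std::perror("mmap"); return 1; }

    const auto* header = static_cast<const uint32_t*>(base);
    uint32_t count = header[0], dim = header[1];
    const float* vectors = reinterpret_cast<const float*>(header + 2);

    // Touching vectors[i * dim + j] faults in only the pages actually needed.
    std::printf("count=%u dim=%u first=%f\n", count, dim, vectors[0]);

    munmap(base, st.st_size);
    close(fd);
    return 0;
}
```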
Got it, thanks.
By the way, how much memory does it take to build the SPANN index using the DEEP1B dataset?
No idea. If you figure it out, I'd love to hear about it! Thanks in advance.