
How could SPANN load DEEP-1B (358GB) into memory with only 128GB RAM (in SPANN paper)?

Open matchyc opened this issue 3 years ago • 7 comments

Hi, I'm reading the source code and paper of SPANN. In /AnnService/src/IndexBuilder/main.cpp, the data is loaded by the function DefaultVectorReader::GetVectorSet(). Inside that function, all vectors are loaded with a single ReadBinary() call.

But if the original file is larger than the available RAM, how can that work? The SPANN paper uses only 128GB of RAM for DEEP-1B, whose base-points file is 358GB according to big-ann-benchmark.
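
For scale, here is a rough sketch of the arithmetic behind that question. It assumes the base file stores one billion 96-dimensional float32 vectors; the dimension and element type are assumptions, not stated in this thread, and the helper name is hypothetical. It also assumes the "358GB" figure is counted in binary gigabytes (GiB).

```python
# Back-of-the-envelope size check. ASSUMPTIONS (not from this thread):
# 1e9 vectors, 96 dimensions, 4-byte float32 values, sizes counted in GiB.
GIB = 1024 ** 3


def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int) -> int:
    """Size of the raw vector payload, ignoring any small file header."""
    return num_vectors * dim * bytes_per_value


size = raw_vector_bytes(1_000_000_000, 96, 4)
ram = 128 * GIB  # RAM budget quoted from the SPANN paper

print(f"raw vectors: {size / GIB:.1f} GiB")
print(f"ratio to 128 GiB of RAM: {size / ram:.2f}x")
```

Under these assumptions the raw vectors alone are well over twice the quoted RAM, so a single in-memory read of the whole file could not fit in 128GB.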

matchyc avatar Mar 07 '22 08:03 matchyc

Hi, have you solved this problem? I also encountered the same.

LLLjun avatar Aug 12 '22 08:08 LLLjun

> Hi, have you solved this problem? I also encountered the same.

No, I didn't. I suspect the team used more memory than the paper describes. That doesn't mean the paper is wrong: its memory figures are all about the search procedure, so the build step may simply need more memory. That's just my guess.

matchyc avatar Aug 12 '22 08:08 matchyc

Got it, thanks.

LLLjun avatar Aug 12 '22 08:08 LLLjun

By the way, how much memory does it take to build the SPANN index using the DEEP1B dataset?

LLLjun avatar Aug 12 '22 09:08 LLLjun

> By the way, how much memory does it take to build the SPANN index using the DEEP1B dataset?

No idea. If you figure it out, I'd like to hear it! Thank you in advance.

matchyc avatar Aug 12 '22 14:08 matchyc