Summary

I trained an IVFPQ index on a GPU (RTX 2080 Ti) with FLOAT16 enabled, using a sample of my data with shape (250K, 128). When I then added my full dataset with shape (66.4M, 128) (around 34 GB as float32, stored as a memory-mapped array) to the index, I got:

Faiss assertion 'err__ == cudaSuccess' failed in void faiss::gpu::Tensor<T, Dim, InnerContig, IndexT, PtrTraits>::copyFrom(const faiss::gpu::Tensor<T, Dim, InnerContig, IndexT, PtrTraits>&, cudaStream_t) [with T = float; int Dim = 2; bool InnerContig = true; IndexT = int; PtrTraits = faiss::gpu::traits::DefaultPtrTraits; cudaStream_t = CUstream_st*] at /project/faiss/faiss/gpu/utils/Tensor-inl.cuh:206; details: CUDA error 77 an illegal memory access was encountered
Aborted (core dumped)
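
One thing worth checking here, as a rough sketch rather than a confirmed diagnosis: the assertion message shows the tensor is indexed with IndexT = int, and the full array has more elements than a signed 32-bit integer can count. The snippet below only does that arithmetic with the shapes quoted above. Whether this is what actually triggers the illegal memory access is an assumption.

```python
# Back-of-envelope check (not a confirmed diagnosis): compare the number of
# float32 elements in the full array with the largest signed 32-bit integer.
n_vectors = 66_400_000
d = 128
INT32_MAX = 2**31 - 1

n_elements = n_vectors * d   # total floats in the (66.4M, 128) array
n_bytes = n_elements * 4     # float32 takes 4 bytes per element

print(f"elements: {n_elements:,}")
print(f"bytes:    {n_bytes:,}")
print(f"exceeds int32 range: {n_elements > INT32_MAX}")
```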

Then I built the index on the CPU without FLOAT16 and saved it with write_index. When I tried to load it with read_index and move it to the GPU with FLOAT16, I got this error:

Faiss assertion 'gpuListSizeInBytes <= (size_t)std::numeric_limits<int>::max()' failed in void faiss::gpu::IVFBase::addEncodedVectorsToList_(int, const void*, const idx_t*, size_t) at /project/faiss/faiss/gpu/impl/IVFBase.cu:352
Aborted (core dumped)
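
The limit in this assertion is std::numeric_limits<int>::max() bytes for a single inverted list. As a rough sketch, assuming each encoded vector occupies 64 bytes on the GPU (64 sub-quantizers at 8 bits each; this is an assumption, and the actual GPU layout may differ), the numbers from the reproduction code give:

```python
# Rough estimate, assuming 64 bytes per encoded vector. The real GPU layout
# may use padding or a different code size, so treat this as illustrative.
n_vectors = 66_400_000
n_lists = 256        # n_centroids in the reproduction code
code_bytes = 64      # code_sz sub-quantizers * nbits / 8
INT_MAX = 2**31 - 1

avg_list_len = n_vectors / n_lists          # vectors per list if perfectly balanced
avg_list_bytes = avg_list_len * code_bytes  # bytes per list if perfectly balanced
max_list_len = INT_MAX // code_bytes        # longest list that stays under the limit

print(f"average list: {avg_list_len:,.0f} vectors, {avg_list_bytes:,.0f} bytes")
print(f"limit reached at about {max_list_len:,} vectors in one list")
```

Under these assumptions a balanced list is far below the limit, so the assertion would only be reached with a very unbalanced list or with more bytes per vector than assumed here.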

What is the problem?

Platform

OS: Ubuntu 20.04.3 LTS

Faiss version: 1.7.2

Installed from: pip in a conda environment

Faiss compilation options:

Running on:

[ ] CPU
[X] GPU

Interface:

[ ] C++
[X] Python

Reproduction instructions

import faiss
import numpy as np

use_gpu = True
if use_gpu:
    GPU_RESOURCES = faiss.StandardGpuResources()
    GPU_OPTIONS = faiss.GpuClonerOptions()
    GPU_OPTIONS.useFloat16 = True

d = 128
code_sz = 64       # number of PQ sub-quantizers
n_centroids = 256  # number of inverted lists
nbits = 8          # bits per sub-quantizer code

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFPQ(quantizer, d, n_centroids, code_sz, nbits)
if use_gpu:
    index = faiss.index_cpu_to_gpu(GPU_RESOURCES, 0, index, GPU_OPTIONS)

db_train = np.memmap('./db_train.mm',
                     dtype='float32',
                     mode='w+',
                     shape=(250000, d))
db_train = np.random.random((250000, d)).astype('float32')
index.train(db_train.copy())  # db_train.copy() so that I am using a numpy array, not a memmap
db_train.flush()
db_train._mmap.close()
del db_train

index.nprobe = 40
db = np.memmap('./db_total.mm',
               dtype='float32',
               mode='w+',
               shape=(66400000, d))
db[:] = np.random.random((66400000, d)).astype('float32')
index.add(db)
# write_index takes the index as its first argument and only accepts CPU indexes
faiss.write_index(faiss.index_gpu_to_cpu(index) if use_gpu else index, 'pretrained.index')
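
A common way to avoid handing the whole memory-mapped array to index.add in one call is to add it in slices. The sketch below is a minimal illustration: batch_ranges and the batch size of 1,000,000 are hypothetical names and values, not part of the Faiss API, and whether batching avoids the first error is not tested here.

```python
def batch_ranges(n_total, batch_size):
    """Yield (start, stop) pairs that cover range(n_total) in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_total, batch_size):
        yield start, min(start + batch_size, n_total)


# Intended use with the index and memmap from the code above (not run here):
#
#   for start, stop in batch_ranges(db.shape[0], 1_000_000):
#       # copy the slice into RAM as a contiguous float32 array before adding
#       index.add(np.ascontiguousarray(db[start:stop], dtype='float32'))

# Small self-check on toy sizes
print(list(batch_ranges(10, 4)))  # [(0, 4), (4, 8), (8, 10)]
```

With 66,400,000 vectors and the hypothetical batch size above, this produces 67 calls to index.add, each receiving an ordinary in-memory array instead of the memmap.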

The second part, when training with CPU:

index = faiss.read_index('pretrained.index')
GPU_RESOURCES = faiss.StandardGpuResources()
GPU_OPTIONS = faiss.GpuClonerOptions()
GPU_OPTIONS.useFloat16 = True
index = faiss.index_cpu_to_gpu(GPU_RESOURCES, 0, index, GPU_OPTIONS)

Here I get the second error.

Sep 02 '23 07:09 mhmd-mst

I suspect you are not running the code above in sequence because

db_train = np.random.random((250000, d)).astype('float32')
...
db_train.flush()
db_train._mmap.close()

cannot work
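
One reading of this point, shown as a minimal sketch with a stand-in class (FakeMemmap is hypothetical and only mimics the two operations that matter; it is not numpy.memmap): plain assignment rebinds the name to a new object, so the memmap methods are no longer available, while slice assignment writes into the existing object.

```python
class FakeMemmap:
    """Hypothetical stand-in for a memory-mapped array (not numpy.memmap)."""

    def __init__(self, n):
        self.data = [0.0] * n

    def __setitem__(self, key, value):
        self.data[key] = value  # in-place write, like db_train[:] = ...

    def flush(self):
        return "flushed"


db_train = FakeMemmap(4)
db_train = [1.0, 2.0, 3.0, 4.0]     # rebinding: db_train is now a plain list
rebinding_keeps_flush = hasattr(db_train, "flush")

db_train = FakeMemmap(4)
db_train[:] = [1.0, 2.0, 3.0, 4.0]  # slice assignment: still a FakeMemmap
slice_keeps_flush = hasattr(db_train, "flush")

print(rebinding_keeps_flush, slice_keeps_flush)  # False True
```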

Sep 06 '23 05:09 mdouze

What do you mean by "cannot work"? Read that line as db_train[:] = np.random.random(...) if that is what you mean.

Sep 06 '23 06:09 mhmd-mst

Problem adding data to trained IVFPQ Index
