Description
Hi,

I am trying to build a GPU index using `GpuIndexCagra`, and was under the impression that the `store_dataset` parameter prevents the dataset from being attached to the index. I am converting the index to HNSW afterwards, so I don't want to load the dataset into GPU memory. Recently, there was a PR that seemed to address this: #4173.

However, for a 6.144 GB example dataset, I noticed that GPU memory spiked to as high as 10.5 GB while I monitored GPU usage with `nvidia-smi` in the background. The code I'm using to test is here: https://github.com/navneet1v/VectorSearchForge/tree/main/cuvs_benchmarks. Specifically, this function is used to build the index on the GPU: https://github.com/navneet1v/VectorSearchForge/blob/main/cuvs_benchmarks/main.py#L324

Interestingly, when I used numpy's `mmap` feature to load the dataset, GPU memory did not exceed 5.039 GB, regardless of the value I set for the `store_dataset` parameter. It looks like CAGRA supports keeping the dataset on disk, which is probably why GPU memory doesn't spike in that case. However, we want to see if it is possible to keep the dataset entirely in CPU memory, without loading it into GPU memory and without using disk. Is the `store_dataset` parameter supposed to do this? If not, is there another way to do this with the faiss Python API? Please let me know, thank you!
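For reference, here is a minimal, simplified sketch of the build-and-convert path I am describing (not the exact code from the main.py linked above; the dimension, dataset, graph degrees, HNSW `M`, and metric are placeholder values, and it assumes the cuVS-enabled faiss build exposes `GpuIndexCagraConfig.store_dataset` and `GpuIndexCagra.copyTo(IndexHNSWCagra)`):

```python
import numpy as np
import faiss

# Placeholder data and parameters (the real run uses the 6.144 GB dataset).
d = 128
xb = np.random.rand(100_000, d).astype('float32')

res = faiss.StandardGpuResources()

config = faiss.GpuIndexCagraConfig()
config.graph_degree = 32               # example value
config.intermediate_graph_degree = 64  # example value
config.store_dataset = False           # expectation: don't attach the dataset to the index

# Build the CAGRA graph on the GPU; train() performs the build.
gpu_index = faiss.GpuIndexCagra(res, d, faiss.METRIC_L2, config)
gpu_index.train(xb)

# Convert to a CPU HNSW index; the GPU index is discarded afterwards.
cpu_index = faiss.IndexHNSWCagra(d, 16, faiss.METRIC_L2)
gpu_index.copyTo(cpu_index)
```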
Additional Background
Faiss version: We are using Faiss as a git submodule, with version 1.10.0, and the submodule is pointed to commit df6a8f6
OS version:
NAME="Amazon Linux"
VERSION="2023"
ID="amzn"
ID_LIKE="fedora"
VERSION_ID="2023"
PLATFORM_ID="platform:al2023"
PRETTY_NAME="Amazon Linux 2023.6.20250218"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2023"
HOME_URL="https://aws.amazon.com/linux/amazon-linux-2023/"
DOCUMENTATION_URL="https://docs.aws.amazon.com/linux/"
SUPPORT_URL="https://aws.amazon.com/premiumsupport/"
BUG_REPORT_URL="https://github.com/amazonlinux/amazon-linux-2023"
VENDOR_NAME="AWS"
VENDOR_URL="https://aws.amazon.com/"
SUPPORT_END="2029-06-30"
Type of GPU:
00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1)
EC2 Instance Type: g5.2xlarge
Reproduction Instructions
- On a server with GPUs, clone https://github.com/navneet1v/VectorSearchForge
- The server must have `git` and `docker` installed
- The server must have NVIDIA developer tools installed, such as `nvidia-smi` and `nvidia-container-toolkit`
- `cd` into the `cuvs_benchmarks` folder, and create a temp directory to store the faiss graph files:
mkdir ./benchmarks_files
chmod 777 ./benchmarks_files
- Build the docker image:
docker build -t <your_image_name> .
- Run the image:
docker run -v ./benchmarks_files:/tmp/files --gpus all <your_image_name>
In a separate terminal, run `nvidia-smi` to monitor the GPU memory:
nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used,temperature.gpu --format=csv -l 1
For loading the numpy dataset with mmap, I added the following lines below https://github.com/navneet1v/VectorSearchForge/blob/main/cuvs_benchmarks/main.py#L253:
# Line 1-253 code above ...
np.save("array.npy", xb)                   # write the dataset to disk once
del xb                                     # drop the in-memory copy
xb = np.load("array.npy", mmap_mode='r+')  # reopen it as a memory-mapped array
# rest of code below ...
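(A side note on the snippet above: since the dataset is only read during the build, `mmap_mode='r'` would work here as well; `'r+'` simply maps the file read-write.)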