store_dataset CAGRA parameter #4274
To answer your original question, at least some of your dataset will have to be loaded onto the GPU for the IVF-PQ build and search during the CAGRA graph build.
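For illustration only, here is a hedged sketch of how the IVF-PQ stage of the graph build could be tuned through faiss's CAGRA config. The wrapper names used (`IVFPQBuildCagraConfig`, `n_lists`, `kmeans_trainset_fraction`, `ivf_pq_params`, `store_dataset`) are assumptions based on the faiss GPU CAGRA headers, not something confirmed in this thread:

```python
import faiss

# Assumed Python wrapper names for the CAGRA IVF-PQ build options.
# A smaller k-means training fraction and fewer lists should shrink the
# temporary GPU allocations made while clustering during the graph build.
ivf_pq_cfg = faiss.IVFPQBuildCagraConfig()
ivf_pq_cfg.n_lists = 1024
ivf_pq_cfg.kmeans_trainset_fraction = 0.1  # subsample 10% of the data for k-means

config = faiss.GpuIndexCagraConfig()
config.ivf_pq_params = ivf_pq_cfg  # IVF-PQ is the default graph build algorithm
config.store_dataset = False
```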
@tarang-jain Thanks for responding on this. I think we understand that for clustering, some parts of the dataset need to be loaded into GPU memory, but this is where it is not consistent with
This is very bizarre.
@bshethmeta this issue should not be closed, as I think this is a bug in the code.
The memory usage reported above is higher than it should be. Below I show the expected memory usage when we use cuvs natively (using cuvs-bench). I will need to repeat the same test with the faiss-integrated version of cuvs to see where the additional allocations happen. I suspect some temporary allocations grow larger than expected.

Here is the GPU memory usage for CAGRA index building when the dataset is in host memory. Initially we subsample the dataset, and that is used for k-means clustering. As discussed above, the allocation size can be controlled by the

Note that the IVF index has
@tfeher @tarang-jain Thanks for the responses. I am most confused about why we see such a dramatic difference in peak GPU memory usage when loading the dataset into CPU memory vs. storing it on disk using
@tfeher
@rchitale7 you are right that, from a GPU memory usage point of view, it should not matter whether the dataset is accessed through a memory-mapped file or resides in host memory.

If we run a CAGRA search on the index created with store_dataset=False, then that will load the dataset onto the GPU. The memory usage that you report is higher than expected; please see my answer here: rapidsai/cuvs#566 (comment). Could you run your test with RMM logging enabled (as described in the linked answer) to see when these allocations happen?
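A minimal sketch of turning on RMM allocation logging from Python, assuming the faiss/cuVS build routes its device allocations through RMM's current device resource (which may depend on how the library is configured):

```python
import rmm

# Reinitialize RMM with logging enabled; allocations and frees that go through
# RMM's current device resource are recorded in a CSV file for later inspection.
rmm.reinitialize(logging=True, log_file_name="rmm_log.csv")

# ... run the CAGRA index build here ...

# Depending on the RMM version, the logging adaptor may buffer entries;
# flushing ensures the CSV is complete before reading it.
rmm.mr.get_current_device_resource().flush()
```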
@rchitale7 can you please run the benchmarks as suggested by @tfeher?
@tfeher did you try running the code (referenced here: rapidsai/cuvs#566 (comment)) without a memory-mapped file? Because it's the non-memory-mapped code that was spiking the memory.
I added the
Hi,

I am trying to build a GPU index using `GpuIndexCagra`, and was under the impression that the `store_dataset` parameter prevents the dataset from being attached to the index. I am converting the index to HNSW afterwards, so I don't want to load the dataset into GPU memory. Recently, there was a PR that seemed to address this: #4173.

However, for a 6.144 GB example dataset, I noticed that the GPU memory spiked to as high as 10.5 GB when I monitored the GPU usage with `nvidia-smi` in the background. The code I'm using to test is here: https://github.com/navneet1v/VectorSearchForge/tree/main/cuvs_benchmarks. Specifically, this function: https://github.com/navneet1v/VectorSearchForge/blob/main/cuvs_benchmarks/main.py#L324 is used to build the index on a GPU.
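For context, here is a minimal sketch of the kind of build-and-convert flow described above. It is not the exact code from the linked benchmark; the random data, dimensions, and config attributes (including `store_dataset`, per #4173) are assumptions:

```python
import numpy as np
import faiss

# Small stand-in dataset (the real benchmark uses a ~6 GB dataset).
d = 128
xb = np.random.rand(100_000, d).astype(np.float32)

res = faiss.StandardGpuResources()
config = faiss.GpuIndexCagraConfig()
config.graph_degree = 64
config.intermediate_graph_degree = 128
# The parameter under discussion: intended to avoid attaching the dataset
# to the GPU index (attribute name assumed, based on PR #4173).
config.store_dataset = False

index = faiss.GpuIndexCagra(res, d, faiss.METRIC_L2, config)
index.train(xb)  # builds the CAGRA graph on the GPU

# Convert to a CPU HNSW index so search does not need the GPU-resident dataset.
cpu_index = faiss.IndexHNSWCagra(d, 32, faiss.METRIC_L2)
index.copyTo(cpu_index)
```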
Interestingly, when I used the `numpy` `mmap` feature to load the dataset, I did not see the GPU memory exceed 5.039 GB, regardless of the value I set for the `store_dataset` parameter. It looks like CAGRA supports keeping the dataset on disk, so that is probably the reason why the GPU memory doesn't spike. However, we want to see if it's possible to keep the dataset entirely in CPU memory, without loading it into GPU memory and without using disk. Is the `store_dataset` parameter supposed to do this? If not, is there any other way to do this with the faiss python API? Please let me know, thank you!

Additional Background
Faiss version: We are using Faiss as a git submodule, with version 1.10.0, and the submodule is pointed to commit df6a8f6
OS version:
Type of GPU:
EC2 Instance Type: g5.2xlarge
Reproduction Instructions

- `git` and `docker` installed
- `nvidia` developer tools installed, such as `nvidia-smi` and `nvidia-container-toolkit`
- `cd` into the `cuvs_benchmarks` folder, and create a temp directory to store the faiss graph files:
- In a separate terminal, run `nvidia-smi` to monitor the GPU memory:
- For loading the `numpy` dataset with mmap, I added the following lines below https://github.com/navneet1v/VectorSearchForge/blob/main/cuvs_benchmarks/main.py#L253 (a rough sketch follows this list):
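The exact lines added in the repo are not reproduced here; below is a rough, hypothetical equivalent of memory-mapped loading with numpy, assuming the dataset is stored either as a `.npy` file or as a raw float32 binary of known shape (file names are placeholders):

```python
import numpy as np

# Memory-map a .npy file so vectors are paged in from disk on demand
# instead of being loaded fully into CPU (or GPU) memory.
dataset = np.load("dataset.npy", mmap_mode="r")

# For a raw float32 binary file of known shape, np.memmap works similarly:
# dataset = np.memmap("dataset.bin", dtype=np.float32, mode="r",
#                     shape=(num_vectors, dim))
```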