Description
One of the things that the NVIDIA Container Toolkit does is update the ldcache in the container so as to allow applications to discover the host driver libraries that have been injected. We also create (some) `.so` symlinks to match the files tracked by the driver installation. These point to the SONAME symlinks. For example: `libcuda.so -> libcuda.so.1 -> libcuda.so.RM_VERSION`. We create the `libcuda.so` symlinks before we run `ldconfig`, but `libcuda.so` is not present in the ldcache since we rely on that same `ldconfig` run to create the `libcuda.so.1` symlink. This means that the ldcache in the container at startup does not match expectations (i.e. the host state).
For example, on a host with the driver installed we have:

```
$ ldconfig -p | grep libcuda
libcuda.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcuda.so.1
libcuda.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libcuda.so
```
In a container:

```
$ docker run --rm -ti -e NVIDIA_VISIBLE_DEVICES=runtime.nvidia.com/gpu=all ubuntu
root@93984b5c459c:/# ldconfig -p | grep libcuda
libcuda.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so.1
```
If we run `ldconfig` in the container we see the following:

```
root@93984b5c459c:/# ldconfig
root@93984b5c459c:/# ldconfig -p | grep libcuda
libcuda.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so.1
libcuda.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so
```

which matches the host state.
This also holds for the "legacy" code path since the symlink chain is only completed by running `ldconfig` once.
This seems innocent enough, but it has the side effect that applications that call `dlopen("libcuda.so", RTLD_LAZY);` may not find the library if it is not in a standard library path (which could be the case for CDI).
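A quick way to check from inside the container whether the unversioned name resolves is shown below; it uses Python's `ctypes` (which calls `dlopen(3)` under the hood) purely as a stand-in, and assumes `python3` is available in the image:

```
# With no ldcache entry for "libcuda.so", this load only succeeds if the
# library happens to sit in one of the loader's built-in search paths.
python3 -c 'import ctypes; ctypes.CDLL("libcuda.so"); print("loaded")'
```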
A simple workaround is to inject the `update-ldcache` hook twice, but we may want to consider a two-phase approach where we first run `ldconfig` with the `-N` flag to only update the links and then run `ldconfig` again to update the cache.
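For reference, a minimal sketch of what the two-phase sequence would run inside the container (flag behaviour as documented in `ldconfig(8)`):

```
# Phase 1: only update the symlinks; -N skips rebuilding the cache.
# This completes the chain libcuda.so -> libcuda.so.1 -> libcuda.so.RM_VERSION.
ldconfig -N

# Phase 2: rebuild the cache now that every link in the chain resolves,
# so the unversioned libcuda.so entry is included as well.
ldconfig
```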