libcuda.so is not found but libcuda.so.1 is · Issue #8587 · microsoft/WSL

Version

Microsoft Windows [Version 10.0.22000.778]

WSL Version

Kernel Version

5.10.102.1

Distro Version

Ubuntu 22.04

Other Software

cuda 10.7, miniconda, pytorch

Repro Steps

Install CUDA as described on nvidia website using option 1:
https://docs.nvidia.com/cuda/wsl-user-guide/index.html

Then install Python 3.9 and run the following code, which shows the issue:

$ python
Python 3.9.12 (main, Apr  5 2022, 06:56:58)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import cdll
>>> cdll.LoadLibrary('libcuda.so.1')
<CDLL 'libcuda.so.1', handle 562f22ae4160 at 0x7f74fb662c10>
>>> cdll.LoadLibrary('libcuda.so')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/piotr/miniconda3/lib/python3.9/ctypes/__init__.py", line 460, in LoadLibrary
    return self._dlltype(name)
  File "/home/piotr/miniconda3/lib/python3.9/ctypes/__init__.py", line 382, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcuda.so: cannot open shared object file: No such file or directory

I'm loading the library manually so that you don't have to install pytorch and timm and then run training of 'resnet26d' to reproduce the problem. That training run triggers the loading of libcudnn_cnn_infer.so.8, which is linked against libcuda.so instead of libcuda.so.1.
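The manual check above can be wrapped in a small helper that tries several library names and reports which ones the dynamic loader resolves. This is only a sketch: `probe_libraries` is a hypothetical helper name made up for this report, not part of ctypes or CUDA.

```python
from ctypes import CDLL


def probe_libraries(names):
    """Return {name: None if it loaded, else the loader's error message}."""
    results = {}
    for name in names:
        try:
            CDLL(name)  # the same class cdll.LoadLibrary() instantiates
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, error in probe_libraries(["libcuda.so.1", "libcuda.so"]).items():
        print(name, "->", "loaded" if error is None else error)
```

On the setup described here, the expectation is that libcuda.so.1 loads and libcuda.so reports the "cannot open shared object file" error shown in the traceback.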

Expected Behavior

It should be possible to load libcuda.so without specifying /usr/lib/wsl/lib in LD_LIBRARY_PATH.

>>> cdll.LoadLibrary('libcuda.so')
should return a library handle

Actual Behavior

Unfortunately, ldconfig fails to add libcuda.so to its cache, so the dynamic loader does not find it unless /usr/lib/wsl/lib is in LD_LIBRARY_PATH.
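For code that loads the library itself, one way to sidestep both the cache and LD_LIBRARY_PATH is to pass an absolute path, since dlopen() treats a name containing a slash as a pathname. A minimal sketch, assuming the driver directory is /usr/lib/wsl/lib as in this report; `find_library_file` and `WSL_LIB_DIR` are names invented for illustration.

```python
import os
from ctypes import CDLL

WSL_LIB_DIR = "/usr/lib/wsl/lib"  # directory taken from this report


def find_library_file(name, directories):
    """Return the first existing path directory/name, or None."""
    for directory in directories:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_library_file("libcuda.so", [WSL_LIB_DIR])
    if path is None:
        print("libcuda.so not found in", WSL_LIB_DIR)
    else:
        # A name containing '/' is opened as a path, without a cache lookup.
        print(CDLL(path))
```

This only covers code that calls the loader directly; it is not a fix for the missing cache entry.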

Diagnostic Logs

libcuda.so is in /usr/lib/wsl/lib along with libcuda.so.1 and libcuda.so.1.1. All three files are identical.
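To double-check that the files are byte-for-byte copies rather than symbolic links, something like the following can be used. It is a sketch only; `describe_file` is a hypothetical helper, and the directory and file names are the ones from this report.

```python
import hashlib
import os


def describe_file(path):
    """Return (is_symlink, sha256 hex digest of the file contents)."""
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    return os.path.islink(path), digest


if __name__ == "__main__":
    directory = "/usr/lib/wsl/lib"
    for name in ("libcuda.so", "libcuda.so.1", "libcuda.so.1.1"):
        path = os.path.join(directory, name)
        if os.path.exists(path):
            print(name, *describe_file(path))
        else:
            print(name, "missing")
```

Identical digests with `is_symlink` reported as False for every name would confirm that the directory holds regular-file copies.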

It looks like ldconfig does not record the path to libcuda.so. You can observe this by running:

$  ldconfig  /usr/lib/wsl/lib -v  -n
libnvwgf2umx.so -> libnvwgf2umx.so
        libnvidia-opticalflow.so.1 -> libnvidia-opticalflow.so.1
        libnvidia-ml.so.1 -> libnvidia-ml.so.1
        libnvidia-encode.so.1 -> libnvidia-encode.so.1
        libnvdxdlkernels.so -> libnvdxdlkernels.so
        libnvcuvid.so.1 -> libnvcuvid.so.1
        libdxcore.so -> libdxcore.so
        libd3d12core.so -> libd3d12core.so
        libd3d12.so -> libd3d12.so
/sbin/ldconfig.real: /usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link

        libcuda.so.1 -> libcuda.so.1.1
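To see which names actually ended up in the system cache, `ldconfig -p` prints the cached entries. The sketch below parses that listing; it assumes the usual glibc line format `name (flags) => path`, and `cached_names` is a hypothetical helper name.

```python
import subprocess


def cached_names(ldconfig_output):
    """Map library name -> path from `ldconfig -p` style text."""
    entries = {}
    for line in ldconfig_output.splitlines():
        if "=>" not in line:
            continue  # skip the header line
        left, path = line.split("=>", 1)
        name = left.split("(", 1)[0].strip()
        entries.setdefault(name, path.strip())  # keep the first match
    return entries


if __name__ == "__main__":
    try:
        output = subprocess.run(
            ["ldconfig", "-p"], capture_output=True, text=True, check=True
        ).stdout
    except (OSError, subprocess.CalledProcessError) as exc:
        output = ""
        print("could not run ldconfig:", exc)
    cache = cached_names(output)
    for name in ("libcuda.so", "libcuda.so.1"):
        print(name, "->", cache.get(name, "not in cache"))
```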

libcuda.so is not picked up, even though strace shows that ldconfig reads the file:
strace ldconfig /usr/lib/wsl/lib -n

newfstatat(AT_FDCWD, "/usr/lib/wsl/lib", {st_mode=S_IFDIR|0555, st_size=4096, ...}, 0) = 0
openat(AT_FDCWD, "/usr/lib/wsl/lib/glibc-hwcaps", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/cache/ldconfig/aux-cache", O_RDONLY) = -1 EACCES (Permission denied)
openat(AT_FDCWD, "/usr/lib/wsl/lib", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_EMPTY_PATH) = 0
getdents64(3, 0x555556da4ee0 /* 18 entries */, 32768) = 672
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so", O_RDONLY) = 4
newfstatat(4, "", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 141464, PROT_READ, MAP_SHARED, 4, 0) = 0x7fe867b6e000
munmap(0x7fe867b6e000, 141464)          = 0
close(4)                                = 0
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so.1", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so.1", O_RDONLY) = 4
newfstatat(4, "", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 141464, PROT_READ, MAP_SHARED, 4, 0) = 0x7fe867b6e000
munmap(0x7fe867b6e000, 141464)          = 0
close(4)                                = 0
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so.1.1", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so.1.1", O_RDONLY) = 4
newfstatat(4, "", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 141464, PROT_READ, MAP_SHARED, 4, 0) = 0x7fe867b6e000
munmap(0x7fe867b6e000, 141464)          = 0
close(4)                                = 0
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib/libd3d12.so"