libcuda.so is not found but libcuda.so.1 is. · Issue #8587 · microsoft/WSL
Version
Microsoft Windows [Version 10.0.22000.778]
WSL Version
- WSL 2
- WSL 1
Kernel Version
5.10.102.1
Distro Version
Ubuntu 22.04
Other Software
cuda 10.7, miniconda, pytorch
Repro Steps
Install CUDA as described on the NVIDIA website, using option 1:
https://docs.nvidia.com/cuda/wsl-user-guide/index.html
Then install Python 3.9 and run the following code, which reproduces the issue:
$ python
Python 3.9.12 (main, Apr 5 2022, 06:56:58)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import cdll
>>> cdll.LoadLibrary('libcuda.so.1')
<CDLL 'libcuda.so.1', handle 562f22ae4160 at 0x7f74fb662c10>
>>> cdll.LoadLibrary('libcuda.so')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/piotr/miniconda3/lib/python3.9/ctypes/__init__.py", line 460, in LoadLibrary
return self._dlltype(name)
File "/home/piotr/miniconda3/lib/python3.9/ctypes/__init__.py", line 382, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcuda.so: cannot open shared object file: No such file or directory
I load the library manually here so that you don't have to install pytorch and timm and then train 'resnet26d' to reproduce this. That training run is what originally triggered the problem: it loads libcudnn_cnn_infer.so.8,
which is linked against libcuda.so instead of libcuda.so.1.
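The manual check above can be wrapped in a small helper that tries several library names and reports which ones the dynamic loader resolves. This is only a sketch: `probe_libraries` is a hypothetical name, and it uses nothing beyond `ctypes.cdll.LoadLibrary`, the same call as in the session above.

```python
from ctypes import cdll


def probe_libraries(names):
    """Return {name: bool} telling whether the dynamic loader can open each name."""
    results = {}
    for name in names:
        try:
            cdll.LoadLibrary(name)
            results[name] = True
        except OSError:
            # Raised when the loader cannot find or open the shared object.
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in probe_libraries(["libcuda.so.1", "libcuda.so"]).items():
        print(f"{name}: {'loaded' if ok else 'not found'}")
```

On an affected system I would expect this to report libcuda.so.1 as loaded and libcuda.so as not found, matching the traceback above.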
Expected Behavior
It should be possible to load libcuda.so without specifying /usr/lib/wsl/lib in LD_LIBRARY_PATH.
>>> cdll.LoadLibrary('libcuda.so')
should return a library handle
Actual Behavior
ldconfig fails to cache libcuda.so, so the loader does not find it unless /usr/lib/wsl/lib is in LD_LIBRARY_PATH.
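As a stopgap, the LD_LIBRARY_PATH workaround can be applied per process instead of globally. The sketch below builds an environment for a child process; `env_with_wsl_lib` is a hypothetical helper name. It targets a child process because the loader reads LD_LIBRARY_PATH when a process starts, so changing `os.environ` inside an already running interpreter does not affect later `dlopen` calls.

```python
import os

WSL_LIB_DIR = "/usr/lib/wsl/lib"  # the directory named in this report


def env_with_wsl_lib(env=None, lib_dir=WSL_LIB_DIR):
    """Return a copy of env with lib_dir prepended to LD_LIBRARY_PATH if absent."""
    new_env = dict(os.environ if env is None else env)
    # Split the existing value and drop empty entries.
    parts = [p for p in new_env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if lib_dir not in parts:
        parts.insert(0, lib_dir)
    new_env["LD_LIBRARY_PATH"] = ":".join(parts)
    return new_env
```

It could then be passed to a child process, for example `subprocess.run([sys.executable, "train.py"], env=env_with_wsl_lib())`, where `train.py` is a placeholder for whatever script triggers the cuDNN load.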
Diagnostic Logs
libcuda.so is in /usr/lib/wsl/lib along with libcuda.so.1 and libcuda.so.1.1. All three files are identical.
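For anyone who wants to verify this on their own machine, here is a rough sketch that reports whether each path is a regular file or a symlink and hashes the regular files so they can be compared. `describe_files` is a hypothetical helper, not part of any tool mentioned here.

```python
import hashlib
import os


def describe_files(paths):
    """Return (path, kind, sha256) per path; kind is 'symlink', 'file' or 'missing'."""
    rows = []
    for path in paths:
        if os.path.islink(path):
            rows.append((path, "symlink", None))
        elif os.path.isfile(path):
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            rows.append((path, "file", digest))
        else:
            rows.append((path, "missing", None))
    return rows


if __name__ == "__main__":
    base = "/usr/lib/wsl/lib"
    names = ["libcuda.so", "libcuda.so.1", "libcuda.so.1.1"]
    for row in describe_files([os.path.join(base, n) for n in names]):
        print(row)
```

Identical digests for entries of kind 'file' would mean the files are byte-for-byte copies and not links to one another.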
It looks like ldconfig does not record the path to libcuda.so. You can observe this by running:
$ ldconfig /usr/lib/wsl/lib -v -n
libnvwgf2umx.so -> libnvwgf2umx.so
libnvidia-opticalflow.so.1 -> libnvidia-opticalflow.so.1
libnvidia-ml.so.1 -> libnvidia-ml.so.1
libnvidia-encode.so.1 -> libnvidia-encode.so.1
libnvdxdlkernels.so -> libnvdxdlkernels.so
libnvcuvid.so.1 -> libnvcuvid.so.1
libdxcore.so -> libdxcore.so
libd3d12core.so -> libd3d12core.so
libd3d12.so -> libd3d12.so
/sbin/ldconfig.real: /usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link
libcuda.so.1 -> libcuda.so.1.1
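Because `-n` only processes the directories given on the command line and does not rebuild the cache, it may also help to look at what the loader cache currently contains. The sketch below parses the output of `ldconfig -p`; the helper names are hypothetical, and it assumes `ldconfig` is on PATH and prints entries in the usual `name (flags) => path` form.

```python
import subprocess


def parse_ldconfig_cache(text):
    """Map each cached library name to its path, from `ldconfig -p` style output."""
    entries = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue  # skip the header line and anything that is not an entry
        left, path = line.split("=>", 1)
        name = left.split("(", 1)[0].strip()
        # Keep the first entry for a name; the loader uses entries in cache order.
        entries.setdefault(name, path.strip())
    return entries


def cached_names(prefix="libcuda"):
    """Return cache entries whose library name starts with prefix."""
    out = subprocess.run(
        ["ldconfig", "-p"], capture_output=True, text=True, check=True
    ).stdout
    return {n: p for n, p in parse_ldconfig_cache(out).items() if n.startswith(prefix)}


if __name__ == "__main__":
    for name, path in cached_names().items():
        print(f"{name} => {path}")
```

If libcuda.so.1 shows up here and libcuda.so does not, that would agree with the behavior described in this report.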
libcuda.so is not picked up, even though strace shows that ldconfig reads that file:
$ strace ldconfig /usr/lib/wsl/lib -n
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib", {st_mode=S_IFDIR|0555, st_size=4096, ...}, 0) = 0
openat(AT_FDCWD, "/usr/lib/wsl/lib/glibc-hwcaps", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/var/cache/ldconfig/aux-cache", O_RDONLY) = -1 EACCES (Permission denied)
openat(AT_FDCWD, "/usr/lib/wsl/lib", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
newfstatat(3, "", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_EMPTY_PATH) = 0
getdents64(3, 0x555556da4ee0 /* 18 entries */, 32768) = 672
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so", O_RDONLY) = 4
newfstatat(4, "", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 141464, PROT_READ, MAP_SHARED, 4, 0) = 0x7fe867b6e000
munmap(0x7fe867b6e000, 141464) = 0
close(4) = 0
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so.1", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so.1", O_RDONLY) = 4
newfstatat(4, "", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 141464, PROT_READ, MAP_SHARED, 4, 0) = 0x7fe867b6e000
munmap(0x7fe867b6e000, 141464) = 0
close(4) = 0
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so.1.1", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/usr/lib/wsl/lib/libcuda.so.1.1", O_RDONLY) = 4
newfstatat(4, "", {st_mode=S_IFREG|0555, st_size=141464, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 141464, PROT_READ, MAP_SHARED, 4, 0) = 0x7fe867b6e000
munmap(0x7fe867b6e000, 141464) = 0
close(4) = 0
newfstatat(AT_FDCWD, "/usr/lib/wsl/lib/libd3d12.so"