1. PVE driver
Install on the PVE host:
1. Comment out the enterprise repository and add the no-subscription repository
vim /etc/apt/sources.list.d/pve-enterprise.list
# deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription
Then run apt update && apt install build-essential pve-headers-$(uname -r)
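The repository switch can also be done without opening an editor. The following is a minimal sketch, assuming the enterprise entry lives in the file named above; switch_to_no_subscription is a hypothetical helper name, not a Proxmox tool.

```shell
#!/bin/sh
# Hypothetical helper: comment out active pve-enterprise entries in an APT
# source file and append the no-subscription repository if it is missing.
switch_to_no_subscription() {
    list_file=$1
    repo_line='deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription'
    tmp=$(mktemp)
    # Prefix active enterprise entries with "# "; commented lines are left alone.
    sed '/pve-enterprise/ s/^deb /# deb /' "$list_file" > "$tmp" && cat "$tmp" > "$list_file"
    rm -f "$tmp"
    # Append the no-subscription entry only once, so reruns are harmless.
    grep -qxF "$repo_line" "$list_file" || printf '%s\n' "$repo_line" >> "$list_file"
}

# On the PVE host (as root) this would be called as:
#   switch_to_no_subscription /etc/apt/sources.list.d/pve-enterprise.list
#   apt update && apt install build-essential pve-headers-$(uname -r)
```

Running the helper twice leaves the file unchanged after the first run.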
2. Disable nouveau
# Blacklist nouveau by adding the line: blacklist nouveau
vim /etc/modprobe.d/blacklist.conf
# Apply the change
update-initramfs -u
# Reboot
reboot
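The blacklist step can be scripted as well. This is a small sketch, assuming the same blacklist.conf path as above; ensure_line is a hypothetical helper that appends a line only when it is not already present.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE unless an identical line exists.
ensure_line() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# On the PVE host (as root):
#   ensure_line 'blacklist nouveau' /etc/modprobe.d/blacklist.conf
#   update-initramfs -u
#   reboot
# After the reboot, this should print nothing if nouveau is no longer loaded:
#   lsmod | grep nouveau
```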
3. Install the CUDA Toolkit directly
Follow the official NVIDIA installation guide step by step.
Verify the driver:
nvidia-smi
Verify CUDA:
nvcc --version
If the nvcc command is not found, add /usr/local/cuda/bin to the PATH environment variable:
export PATH=/usr/local/cuda/bin:$PATH
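An export typed at the prompt only lasts for the current shell session, so it is usually placed in a startup file such as ~/.bashrc. The sketch below assumes the default install prefix /usr/local/cuda; prepend_path is a hypothetical helper that avoids adding the same entry twice when the file is sourced repeatedly.

```shell
#!/bin/sh
# Hypothetical helper: print a PATH value with DIR in front,
# unless DIR is already one of its entries.
prepend_path() {
    dir=$1
    current=$2
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;
        *)          printf '%s\n' "$dir:$current" ;;
    esac
}

# Possible use in ~/.bashrc:
#   PATH=$(prepend_path /usr/local/cuda/bin "$PATH"); export PATH
```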
Check the device nodes
ls -la /dev/dri/
Output:
total 0
drwxr-xr-x 3 root root 120 Apr 27 21:02 .
drwxr-xr-x 23 root root 5760 Apr 28 01:00 ..
drwxr-xr-x 2 root root 100 Apr 27 21:02 by-path
crw-rw---- 1 root video 226, 0 Apr 27 19:56 card0
crw-rw---- 1 root video 226, 1 Apr 27 21:02 card1
crw-rw---- 1 root render 226, 128 Apr 27 21:02 renderD128
ls -la /dev/dri/by-path/
Output:
total 0
drwxr-xr-x 2 root root 100 Apr 27 21:02 .
drwxr-xr-x 3 root root 120 Apr 27 21:02 ..
lrwxrwxrwx 1 root root 8 Apr 27 21:02 pci-0000:81:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 Apr 27 21:02 pci-0000:81:00.0-render -> ../renderD128
lrwxrwxrwx 1 root root 8 Apr 27 19:56 pci-0000:c2:00.0-card -> ../card0
ls -la /dev/nvidia*
Output:
crw-rw-rw- 1 root root 511, 0 Apr 28 16:07 /dev/nvidia-uvm
crw-rw-rw- 1 root root 511, 1 Apr 28 16:07 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195, 254 Apr 28 16:05 /dev/nvidia-modeset
crw-rw-rw- 1 root root 195, 0 Apr 28 16:05 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Apr 28 16:05 /dev/nvidiactl
/dev/nvidia-caps:
total 0
cr--r--r-- 1 root root 237, 2 Apr 28 16:07 nvidia-cap2
cr-------- 1 root root 237, 1 Apr 28 16:07 nvidia-cap1
2. LXC driver
Edit the LXC container's config file (xxx is the container ID)
vim /etc/pve/lxc/xxx.conf
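Add the following lines.
Here 195 is the device major number shown in the ls -la /dev/nvidia* output above; the other numbers are taken from the ls output in the same way.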
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 511:* rwm
lxc.cgroup2.devices.allow: c 237:* rwm
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.cgroup2.devices.allow: c 226:128 rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=dir
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
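Because some of these major numbers may differ on another host, the devices.allow rules can be generated from the ls output instead of being copied by hand. This is a rough sketch, assuming the ls -la column layout shown earlier; majors_to_allow_rules is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ls -la` output on stdin and print one
# devices.allow rule per distinct character-device major number.
majors_to_allow_rules() {
    awk '/^c/ {
        major = $5
        sub(/,$/, "", major)          # "195," becomes "195"
        if (!(major in seen)) {
            seen[major] = 1
            printf "lxc.cgroup2.devices.allow: c %s:* rwm\n", major
        }
    }'
}

# On the PVE host:
#   ls -la /dev/nvidia* /dev/dri | majors_to_allow_rules
```

Only lines that start with c (character devices) are considered, so directory headers and "total" lines in the ls output are skipped.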
Restart the LXC container
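After the restart, it can help to confirm that the bind-mounted nodes are visible inside the container before installing anything. The sketch below assumes the node names from the mount entries above; missing_nodes is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print every expected device node that does not
# exist under ROOT (pass "/" when running inside the container).
missing_nodes() {
    root=${1%/}
    for node in dev/nvidia0 dev/nvidiactl dev/nvidia-modeset \
                dev/nvidia-uvm dev/nvidia-uvm-tools dev/dri/renderD128; do
        [ -e "$root/$node" ] || printf '%s\n' "/$node"
    done
}

# Inside the container:
#   missing_nodes /      # no output means every listed node is present
```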
Install the driver inside the LXC container:
Install the CUDA Toolkit directly
Verify the driver:
nvidia-smi
Verify CUDA:
git clone https://github.com/NVIDIA/cuda-samples.git
cd cuda-samples/Samples/1_Utilities/deviceQuery/
make
./deviceQuery
cuda-samples/Samples/1_Utilities/deviceQuery/deviceQuery
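deviceQuery ends its report with a "Result = PASS" line when it can reach a CUDA device, so the check can be scripted. A minimal sketch follows; device_query_passed is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the deviceQuery report on stdin
# contains the final "Result = PASS" line.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Inside the container, from the deviceQuery directory:
#   ./deviceQuery | device_query_passed && echo "CUDA OK" || echo "CUDA check failed"
```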