# Proxmox VE tips

Just some tips I gathered over time. All in one easily reachable place so I can share it wherever I want.    
  
Please note that unless you see a shebang (`#!/...`) these code blocks are usually meant to be copy & pasted directly into the shell. Some of the steps will not work if you run part of them in a script and copy & paste other ones, as they rely on variables set earlier.  
The `{` and `}` surrounding some scripts are meant to avoid poisoning your bash history with individual commands, etc. You can ignore them if you manually copy paste the individual commands.   
I chose to write things "in the open" that way so there's still some control and things don't become a black box. 

## Table of contents

- [Proxmox VE tips](#proxmox-ve-tips)
  - [Table of contents](#table-of-contents)
  - [Discard](#discard)
    - [CT](#ct)
    - [VM](#vm)
  - [Preventing a full storage](#preventing-a-full-storage)
    - [Note about paths](#note-about-paths)
  - [Useful installer shortcuts/tips](#useful-installer-shortcutstips)
  - [Temporary kernel arguments](#temporary-kernel-arguments)
    - [Examples of how that selection can look like for PVE with grub/systemd-boot and PBS installer](#examples-of-how-that-selection-can-look-like-for-pve-with-grubsystemd-boot-and-pbs-installer)
  - [Passthrough recovery](#passthrough-recovery)
  - [Passthrough tips](#passthrough-tips)
      - [IOMMU groups PVE CLI](#iommu-groups-pve-cli)
      - [IOMMU groups PVE GUI](#iommu-groups-pve-gui)
  - [Rescanning disks/volumes](#rescanning-disksvolumes)
  - [Making KSM start sooner](#making-ksm-start-sooner)
  - [Enabling a VM's serial console](#enabling-a-vms-serial-console)
    - [Step 1](#step-1)
    - [Step 2](#step-2)
    - [Step 3](#step-3)
  - [Importing disk images](#importing-disk-images)
  - [Networking](#networking)
    - [Prevent NIC name changes](#prevent-nic-name-changes)
    - [Network testing](#network-testing)
      - [Temporary DHCP](#temporary-dhcp)
      - [Find NIC port](#find-nic-port)
    - [Updating ip](#updating-ip)
    - [Find old network configs](#find-old-network-configs)
  - [GPU passthrough](#gpu-passthrough)
    - [Check Device and drivers](#check-device-and-drivers)
    - [CT](#ct-1)
      - [Nvidia specific](#nvidia-specific)
        - [Prerequisite](#prerequisite)
        - [Setup](#setup)
        - [Verify](#verify)
      - [Generic](#generic)
        - [Check groups](#check-groups)
        - [Add devices](#add-devices)
  - [Install intel drivers/modules](#install-intel-driversmodules)
    - [CT](#ct-2)
  - [Install nvidia drivers/modules via apt](#install-nvidia-driversmodules-via-apt)
    - [Prerequisites](#prerequisites)
    - [Node / VM](#node--vm)
    - [CT](#ct-3)
    - [Verify installation](#verify-installation)
    - [Post install](#post-install)
      - [Enable Persistence Daemon](#enable-persistence-daemon)
  - [Install nvidia drivers/modules via .run file](#install-nvidia-driversmodules-via-run-file)
    - [Links and release notes](#links-and-release-notes)
    - [Download and install the .run file](#download-and-install-the-run-file)
      - [CT](#ct-4)
      - [VM](#vm-1)
      - [Node](#node)
      - [Create and Enable Persistence Daemon](#create-and-enable-persistence-daemon)
  - [Install and configure NVIDIA Container Toolkit](#install-and-configure-nvidia-container-toolkit)
  - [ZFS tips](#zfs-tips)
    - [Check space usage and ratios](#check-space-usage-and-ratios)
    - [Find old ZFS snapshots](#find-old-zfs-snapshots)
    - [Shrink a CT's disk](#shrink-a-cts-disk)
    - [Update ZFS ARC size](#update-zfs-arc-size)
      - [Validate](#validate)
      - [Adapt config](#adapt-config)
      - [Final steps](#final-steps)
  - [Misc tips and scripts](#misc-tips-and-scripts)
    - [Find unused disks/volumes](#find-unused-disksvolumes)
    - [Restore guest configs](#restore-guest-configs)
    - [Monitor disk SMART information](#monitor-disk-smart-information)
    - [Credentials](#credentials)
    - [Monitor swap usage](#monitor-swap-usage)
    - [Check which PCI(e) device a drm device belongs to](#check-which-pcie-device-a-drm-device-belongs-to)
    - [Persistent renderd12\*/card\* or other device names](#persistent-renderd12card-or-other-device-names)
      - [Simple way for GPUs](#simple-way-for-gpus)
      - [Complete example also for other devices](#complete-example-also-for-other-devices)
    - [Check which PCI(e) device a disk belongs to](#check-which-pcie-device-a-disk-belongs-to)
    - [IO debugging](#io-debugging)
      - [General](#general)
        - [IO Delay](#io-delay)
        - [iotop-c](#iotop-c)
        - [iostat](#iostat)
        - [fatrace](#fatrace)
      - [ZFS related](#zfs-related)
        - [Checking ZFS latency stats](#checking-zfs-latency-stats)
        - [Checking ZFS queue stats](#checking-zfs-queue-stats)
        - [Checking ZFS request sizes](#checking-zfs-request-sizes)
    - [Set up no-subscription apt repositories](#set-up-no-subscription-apt-repositories)
      - [GUI](#gui)
      - [CLI](#cli)
        - [PVE 8 / Debian 12](#pve-8--debian-12)
        - [PVE 9 / Debian 13](#pve-9--debian-13)
    - [Fix locales](#fix-locales)
    - [Enable package notifications](#enable-package-notifications)
    - [Filter journal messages](#filter-journal-messages)
    - [FAQ](#faq)
      - [Why not use `local` for guest disks?](#why-not-use-local-for-guest-disks)

## Discard

Using trim/discard with thinly allocated disks (which is the default) gives space back to the storage. This saves space, makes backups faster and is needed for thin provisioning to work as expected. This is not related to the PVE storage being backed by an SSD. Use it whenever the storage is thin provisioned. [For ZFS this still counts even if `Thin Provision` (see note below) is not enabled](https://www.reddit.com/r/Proxmox/comments/1kr98iv/server_2022_disk_discard_option_on_zfs/mtdi7cg/).    
Check `lvs`'s `Data%` column and `zfs list`'s `USED`/`REFER`. You might find it to go down when triggering a trim as explained below.

Also see official docs:    
- https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_trim_discard
- https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_hard_disk_discard
- https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_thin_provisioning
- https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_storage_types

[And this arch Wiki article](https://wiki.archlinux.org/title/Solid_state_drive#TRIM). This is more about the hardware side though.    

If you use ZFS you might also want to enable `Thin Provisioning` in `Datacenter > Storage` for your ZFS storage.  
![image](https://gist.github.com/user-attachments/assets/04330006-1a4b-4aa3-aff4-33f1e9be3471).   
This will only affect newly created disks. [Here's how to apply the setting for already existing disks](https://forum.proxmox.com/threads/storage-usage.167052/#post-776013).   

### CT

Note that this is not needed/supported for virtual disks (Mount Points) on storages of type `ZFS`.

Containers usually cannot call [`fstrim`](https://man.archlinux.org/man/fstrim.8.en) themselves. You can trigger a one time immediate trim via `pct fstrim IDOFCTHERE` on the node.  
I [use a cronjob calling `pct fstrim`](https://forum.proxmox.com/threads/fstrim-doesnt-work-in-containers-any-os-workarounds.54421/#post-278310) (add via `crontab -e`).    
```bash
30 0 * * 0 pct list | awk '/^[0-9]/ {print $1}' | while read ct; do pct fstrim ${ct}; done
```
You can also run the command after `30 0 * * 0 ` manually on the node, of course.    
Alternatively you can select `discard` (`8.3.x`+) as a mount option so this happens immediately.

**You do not need to enable this for `pct fstrim` to work**.  
Use the mount option when you want it to be immediate/continuous and the `pct fstrim` cronjob to trigger it on a schedule like it usually works for VMs. I prefer the latter.

![image](https://gist.github.com/user-attachments/assets/1de6263c-28d2-4ab3-92c6-324d3c5f310d)
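If you prefer the CLI over the GUI, setting the mount option could look roughly like this. The CT ID and volume are placeholders; restate your existing `rootfs`/`mp` definition and just add `mountoptions=discard`.

```bash
# Show the current rootfs / mount point definition first
pct config 101 | grep -E "^(rootfs|mp[0-9]+):"

# Example only; keep your existing volume and size values
pct set 101 --rootfs local-lvm:vm-101-disk-0,size=8G,mountoptions=discard
```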

### VM

You can trigger a one time immediate trim (as root) via `fstrim -av` from inside a VM.  
You can also trigger it from the node side via `qm guest exec` if the VM has the guest agent enabled and configured    
```bash
qm list | grep "running" | awk '/[0-9]/ {print $1}' | while read vm; do echo "Trimming ${vm}"; qm guest exec ${vm} -- fstrim -av; done
```
Most OSes come with an `fstrim.timer` which, by default, does a weekly `fstrim` call.  
You can check with `systemctl status fstrim.timer`. If it's disabled run `systemctl enable fstrim.timer`.  
To make it run more frequently, run `systemctl edit fstrim.timer` and write this.    
```bash
[Timer]
# The empty assignment clears the default (weekly) schedule before setting the new one
OnCalendar=
OnCalendar=daily
```

> Some guest operating systems may also require the SSD Emulation flag to be set.
If you would like a drive to be presented to the guest as a solid-state drive rather than a rotational hard disk, you can set the SSD emulation option on that drive. There is no requirement that the underlying storage actually be backed by SSDs; this feature can be used with physical media of any type.

For trim/discard to properly work the disk(s) should have the `Discard` flag set.
![image](https://gist.github.com/user-attachments/assets/6a7fd22f-b848-49ec-b535-bf0e7713b8a4)
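The same flag can also be set from the CLI. A hedged example, assuming disk `scsi0` on a hypothetical VM `100`; restate your existing volume definition and just add the options.

```bash
# Show the current disk definition
qm config 100 | grep "^scsi0:"

# Example only; keep your existing volume value, add discard (and optionally SSD emulation)
qm set 100 --scsi0 local-lvm:vm-100-disk-0,discard=on,ssd=1
```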

If you use the Guest Agent (which you really should) I'd also recommend enabling this under `Options > QEMU Guest Agent`.    
![image](https://gist.github.com/user-attachments/assets/1357a9ad-e22e-46f4-8bf7-a6a449ad13a3)
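From the CLI the agent options can be set roughly like this. The VM ID is a placeholder and `fstrim_cloned_disks` is, to the best of my knowledge, the option behind that checkbox.

```bash
qm set 100 --agent enabled=1,fstrim_cloned_disks=1
```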

## Preventing a full storage

When using thin allocation it can be problematic when a storage reaches 100%. For ZFS you may also want to stay below a certain threshold.  
If your storage is already full [see this forum post specific to ZFS](https://forum.proxmox.com/threads/protect-a-pbs-datastore-on-zfs-so-that-it-does-not-become-completely-full.166768/#post-774198).   
 
I use [a modified version of this snippet](https://forum.proxmox.com/threads/solved-you-have-not-turned-on-protection-against-thin-pools-running-out-of-space.91055/#post-547417) to send me a mail if any of my storages reach 75% usage.

```bash
# Storage running out of space. Percent sign escaped due to crontab
*/15 * * * * /usr/sbin/pvesm status 2>&1 | /usr/bin/grep -Ev "disabled|error" | tr -d '\%' | awk '$7 >=75 {print $1,$2,$7}' | column -t
```

Or to check a specific type of storage. LVM-Thin in this case

```bash
*/15 * * * * /usr/sbin/pvesm status 2>&1 | /usr/bin/grep "lvmthin" | grep -Ev "disabled|error" | tr -d '\%' | awk '$7 >=75 {print $1,$2,$7}' | column -t
```

A similar method can be used to check the file system directly for, in this example, at least ~100G of free space.

```bash
*/15 * * * * df /mnt/backupdirectory | tail -n1 | awk '$4 <=100000000 {print $1,$4,$5}' | column -t
```

### Note about paths

It's generally advised to use the full path to executables in cronjobs (like `/usr/sbin/pvesm`) as `PATH` is different.  
I use this at the top of mine so I don't have to care about that and the job is cleaner.

```bash
SHELL=/bin/bash
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
```

## Useful installer shortcuts/tips

Inside the PVE/PBS installer you can use the following shortcuts.  
The terminal is particularly useful in case you need a live environment or want to do some pre-install customizations.

| Shortcut      | Info           |
| ------------- | -------------- |
| `CTRL+ALT+F1` | Installer      |
| `CTRL+ALT+F2` | Logs           |
| `CTRL+ALT+F3` | Terminal/Shell |
| `CTRL+ALT+F4` | Installer GUI  |

If you press `E` (see below) you can add args that will be persisted into the installed system.

## Temporary kernel arguments

When pressing `E` during boot/install when the OS/kernel selection shows up you can temporarily edit the kernel arguments. This is useful to debug things or disable passthrough if you run into an issue.  
Also see here: <https://pve.proxmox.com/pve-docs/pve-admin-guide.html#nomodeset_kernel_param>

| Argument                          | Info                                                          |
| --------------------------------- | ------------------------------------------------------------- |
| `nomodeset`                       | Helps with hangs during boot/install. Nvidia often needs that |
| `debug`                           | Debugging messages                                            |
| `fsck.mode=force`                 | Triggers a file system check                                  |
| `systemd.mask=pve-guests.service` | Prevents guests from starting up                              |

### Examples of how that selection can look like for PVE with grub/systemd-boot and PBS installer

![image](https://gist.github.com/user-attachments/assets/814c82fb-94d4-4973-8e1a-c3dae689c137)  
![image](https://gist.github.com/user-attachments/assets/03dfb8eb-2a67-4a18-91ee-1edac76b2a84)

![image](https://gist.github.com/user-attachments/assets/8046d78b-36c5-4347-ba13-5c62a06b2cb0)
![image](https://gist.github.com/user-attachments/assets/4e803185-64a1-4f6e-a84c-76d4a7d0941f)
![image](https://gist.github.com/user-attachments/assets/7e3b3c9b-c92f-476c-9a93-781f79c23345)
![image](https://gist.github.com/user-attachments/assets/e3da6df8-62d6-4062-b453-e882aa536393)
![image](https://gist.github.com/user-attachments/assets/3565fdf6-84d1-4315-9ee4-bca123167219)

## Passthrough recovery

When passing through devices it can sometimes happen that your device shares an IOMMU group with something else that's important.  
It's also possible that groups shift if you exchange a device. All of this can cause a system to become unbootable.  
If [editing the boot arguments](#temporary-kernel-arguments) doesn't help, the simplest fix is to go into the UEFI/BIOS and disable every virtualization related thing. VT-x/VT-d/SVM/ACS/IOMMU or whatever it's called for you.

## Passthrough tips

For checking IOMMU groups I like this script: <https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF#Ensuring_that_the_groups_are_valid>.  
For your convenience

```bash
{
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in $g/devices/*; do
        echo -e "\t$(lspci -nns ${d##*/})"
    done;
done;
}
```

To check the `hostpci` settings of existing VMs (to find which ones use passthrough) you can do this    
```bash
grep -sR "hostpci" /etc/pve
```

#### IOMMU groups PVE CLI

Simple one liner
```bash
lspci -vv | grep -P "\d:\d.*|IOMMU"
```

If you want to use PVE tooling to check the IOMMU groups you can use this    
```bash
pvesh get /nodes/$(hostname)/hardware/pci --pci-class-blacklist ""
```

#### IOMMU groups PVE GUI

To check the IOMMU groups in the GUI you can use the `Hardware` tab of the VM when adding a PCI(e) device.
![image](https://gist.github.com/user-attachments/assets/23d54674-59bb-4eea-be98-cd6e45874740)

Or you can check in `Datacenter > Resource Mappings` which I think is easier to read because of its tree structure.    
**It also warns about IOMMU groups**.    
![image](https://gist.github.com/user-attachments/assets/aa5b8d15-ec2d-4ad0-ba90-691f0a71f988)

## Rescanning disks/volumes

`pct rescan` and `qm rescan` can be useful to find missing volumes and add them to their respective VM/CT.  
You can find them as unused disks in `Hardware`/`Resources`.
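For convenience, the two commands look like this; run them on the node.

```bash
# Rescan all guests; found volumes show up as unused disks
qm rescan
pct rescan
```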

## Making KSM start sooner

KSM and ballooning both start when the host reaches 80% memory usage by default.    
Ballooning was hardcoded before version `8.4` but it is now configurable via `node > System > Options > RAM usage target for ballooning`.    
To make KSM start sooner and give it a chance to "free" some memory before ballooning starts you can modify `/etc/ksmtuned.conf`.  
For example to let it start at `70%` you can configure it like this    

```bash
KSM_THRES_COEF=30
```

You can also make it more "aggressive" with something like this

```bash
KSM_NPAGES_MAX=5000
```
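After editing `/etc/ksmtuned.conf` the daemon needs to pick up the change; restarting it should do (service name assumed to be `ksmtuned`).

```bash
systemctl restart ksmtuned.service
```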
Also see official docs:    
- <https://pve.proxmox.com/pve-docs/pve-admin-guide.html#ballooning-target>
- <https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_memory>
- <https://pve.proxmox.com/pve-docs/pve-admin-guide.html#kernel_samepage_merging>
- <https://pve.proxmox.com/wiki/Dynamic_Memory_Management>
- <https://pve.proxmox.com/wiki/Kernel_Samepage_Merging_(KSM)>

## Enabling a VM's serial console

This lets you use xterm.js (used for CTs by default), which allows copy & pasting. Tested for debian/ubuntu.  
All commands are to be run inside the VM and this might also work for other OSs. Please let me know if it does.

### Step 1

Go to the `Hardware` tab of your VM and add a `Serial Port`.  
![image](https://gist.github.com/user-attachments/assets/0d632c47-789f-4200-a47c-8670a8258b25)
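If you'd rather do this from the node's CLI, adding the serial port should look like this (VM ID is a placeholder).

```bash
qm set 100 --serial0 socket
```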

### Step 2
Some distributions are already set up for this or can be configured via their own UI and this step can be skipped for them.    
For example Home Assistant's HAOS is already set up for this and [TrueNAS can be configured for it via UI](https://www.truenas.com/docs/scale/scaletutorials/systemsettings/advanced/manageconsolescale/).    

Either one of these commands can help find the right tty.    

```bash
dmesg -T | grep "tty"
journalctl -b0 -kg "tty"
```

For example it's `ttyS0` for me
```
Aug 18 02:17:16 nodename kernel: 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
```

To enable the TTY edit `/etc/default/grub` via
```bash
nano /etc/default/grub
```

Find the line starting with `GRUB_CMDLINE_LINUX_DEFAULT` and add `console=ttyS0 console=tty0` at the end (replace `ttyS0` with yours from above).    

It can look like this for example
```bash
GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0 console=tty0"
```

Save and exit via `CTRL+X`. Afterwards run
```bash
update-grub
```

See here for more:
 - https://0pointer.de/blog/projects/serial-console.html
 - https://docs.kernel.org/admin-guide/serial-console.html
 - https://www.kernel.org/doc/html/v4.14/admin-guide/kernel-parameters.html

### Step 3
Reboot the VM via the PVE button or power it off and on again to apply the Hardware and bootloader config changes.  
This is so the VM is cold booted. A normal `reboot` command from within the VM will not do the same.  
You can see whether a `Hardware` change was applied by its color. If it's orange it's still pending.    
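From the node's CLI a cold boot can be done like this (VM ID is a placeholder).

```bash
qm shutdown 100 && qm start 100
```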

Once that's done your VM should have a functioning `xterm.js` button under the `Console` one. Click the arrow beside it.  
![image](https://gist.github.com/user-attachments/assets/ebc4a15d-0980-4a5d-9401-367088873331)

## Importing disk images
You don't have to use the CLI via `qm disk import`, you can also use the GUI to import disk images or whole machines.    
This assumes you use the `local` storage. Replace with whatever `Directory` storage you want to use.

1. Go to `Datacenter > Storage` and modify `local` to have the `Import` content type.  
   ![image](https://gist.github.com/user-attachments/assets/1c9150af-7ab5-4025-8ee2-eb7e8c21a89b)
2. Go to `local > Import` and use the buttons at the top to upload/download/import your OVA/QCOW2/RAW/VMDK/IMG.  
   ![image](https://gist.github.com/user-attachments/assets/abdda5c9-fcd3-4dbd-87e9-427b9a4f99e7)
3. Select your file and click the `Import` button at the top.

If you already have a VM you can import the disk via `Hardware > Add > Import Hard Disk` like this    
<img width="286" height="94" alt="image" src="https://gist.github.com/user-attachments/assets/8e3e5f51-cb00-402a-952c-93826c2bec67" />

When creating a new VM (from an OVA in this example) you can delete the existing disk and select `Import` to use it
<img width="719" height="537" alt="image" src="https://gist.github.com/user-attachments/assets/e202abd1-b52f-4540-8daf-8a30074c60d4" />
<img width="729" height="544" alt="image" src="https://gist.github.com/user-attachments/assets/f26fd87b-8d57-42fd-a25b-20312bed5337" />

When importing a machine I recommend changing the following settings, at least for linux guests.

- `OS Type > Linux`
- `Advanced > Disks > SCSI Controller > VirtIO SCSI single`.
- `Advanced > Network Interfaces > Model > VirtIO (paravirtualized)`
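For completeness, the CLI import mentioned at the top of this section looks roughly like this; VM ID, file path and storage are placeholders.

```bash
# Imports the image as an unused disk of VM 100 onto the given storage
qm disk import 100 /path/to/image.qcow2 local-lvm
```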

## Networking

### Prevent NIC name changes

[A NIC's (Network Interface Controller/Card) name is hardware dependent](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#systemd_network_interface_names) and can change when you add or remove PCI(e) devices. Sometimes major kernel upgrades can also cause this.  
Since the `/etc/network/interfaces` file which handles networking uses these names to configure your network, changes to the name will break it.  
To prevent those changes you can [use a systemd `.link` file to permanently override the name](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#network_override_device_names).    
[PVE 9 comes with the `pve-network-interface-pinning` tool](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_using_the_pve_network_interface_pinning_tool).
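A minimal sketch of such a `.link` override; the MAC address and target name are placeholders, pick the MAC from `ip a`.

```bash
# MAC address and interface name are placeholders
cat <<EOF > /etc/systemd/network/10-eno1.link
[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=eno1
EOF

# Refresh the initramfs so the name is also applied in early boot
update-initramfs -u -k all
```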

### Network testing

You can show your NICs and assigned IPs with `ip a`/`ip l`.  
If you have a PCI(e) NIC you can use this to show the device(s) and their modules/drivers.

```bash
lspci -vnnk | awk '/Ethernet/{print $0}' RS= | grep -Pi --color "^|(?<=Kernel driver in use: |Kernel modules: )[^ ]+"
```

If you have a USB NIC you can use

```bash
lsusb -vt | grep -Pi --color "^|(?<=Driver=)[^,]+"
```

Just skip the `| grep ...` with the poor regexes if you don't need to color the output.

This shows the driver used for each NIC. This is useful because it shows the actual name like `eno1`.    
```bash
# ls -l /sys/class/net/*/device/driver
lrwxrwxrwx 1 root root 0 May 15 12:58 /sys/class/net/enp6s0/device/driver -> ../../../../../../bus/pci/drivers/igb
lrwxrwxrwx 1 root root 0 May 15 12:58 /sys/class/net/enp7s0/device/driver -> ../../../../../../bus/pci/drivers/igb
lrwxrwxrwx 1 root root 0 May 15 12:58 /sys/class/net/enx00e04c680085/device/driver -> ../../../../../../../bus/usb/drivers/r8152
```

This shows the device path it belongs to.    
Note the values before and after the `->`. In this example `06:00.0` and `07:00.0`. `enx00e889680195` is a USB device.      
You can then cross-reference them with the first column of the `lspci | grep -i "Ethernet"` or `lsusb -vt` output    
```bash
# ls -l /sys/class/net/*/device
lrwxrwxrwx 1 root root 0 Jun 24 12:32 /sys/class/net/enp6s0/device -> ../../../0000:06:00.0
lrwxrwxrwx 1 root root 0 Jun 24 12:32 /sys/class/net/enp7s0/device -> ../../../0000:07:00.0
lrwxrwxrwx 1 root root 0 Jun 24 12:32 /sys/class/net/enx00e889680195/device -> ../../../4-1:1.0
```
   
#### Temporary DHCP
To temporarily use DHCP you can use this. The first command gets a lease and the second restores the original configuration again.    

```bash
# For PVE 8 / Debian 12
ifdown vmbr0; dhclient -v
dhclient -r; ifup vmbr0

# For PVE 9 / Debian 13
ifdown vmbr0; dhcpcd -d
dhcpcd -k; ifup vmbr0
```

Optionally pass the NIC name as argument to `dhclient`/`dhcpcd` to test a specific one.  
This is useful to check general router connectivity or what the subnet/gateway is.  
It also allows you to check if your DHCP reservation is properly set up.


#### Find NIC port
To see which port a network cable is plugged into you can unplug it, run `dmesg -Tw` to follow the kernel logs and then plug it in again.  
Use `CTRL+C` to stop following the kernel log.

The classic way to make the LED blink

```bash
# NIC from "DHCPREQUEST for x.x.x.x on NIC_NAME_HERE to x.x.x.yx port 67"
ethtool --identify NIC_NAME_HERE
```

Not really helpful if you have no network though as `ethtool` is not pre-installed.

### Updating ip

There are multiple ways (GUI or CLI) and multiple files to edit.  
You need to edit these files

- `/etc/network/interfaces` (`node > System > Network` in the GUI)
- `/etc/hosts` (`node > System > Hosts` in the GUI)
- `/etc/resolv.conf` (`node > System > DNS` in the GUI)
- `/etc/issue` (What you see when logging in. Just informational but still a good idea to update it)
- `/etc/pve/corosync.conf` (When in a cluster. `config_version` needs to be incremented when you change things).

I recommend doing `grep -sR "old.ip.here" /etc` to check if you missed something.  
Calling `pvebanner`, restarting the `pvebanner` service or rebooting should update `/etc/issue` as well. Do this last.  
To "reload" `/etc/network/interfaces` and apply the new IP you can do something like `ifreload -av` or simply reboot.


### Find old network configs
`ifupdown2` keeps old `interfaces` files in `/var/log/ifupdown2/`. You can find them like this    
```bash
find /var/log/ifupdown2/ -name "interfaces" -ls
```


## GPU passthrough
This will likely never be a complete tutorial, just some often shared commands and tips and scripts.  
Consult the following sources for instructions and [use mapped devices](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#resource_mapping) rather than raw ones.

- <https://pve.proxmox.com/wiki/PCI(e)_Passthrough>
- <https://pve.proxmox.com/wiki/PCI_Passthrough>
- <https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF>

Make sure to check the IOMMU groups before passing a device to a VM. [See above](#passthrough-tips).


### Check Device and drivers

Make sure you can see the device and that it uses the expected driver, e.g. `nvidia`, `amdgpu`, `i915`, etc.

```bash
lspci -vnnk | awk '/VGA/{print $0}' RS= | grep -Pi --color "^|(?<=Kernel driver in use: |Kernel modules: )[^ ]+"
```

If nvidia devices are not available when the system boots you can work around it by adding this to your crontab via `crontab -e`
```bash
@reboot /usr/bin/nvidia-smi > /dev/null
```


### CT
Create and start your CT before continuing.    

#### Nvidia specific
For NVIDIA you can use the [Nvidia Container Toolkit](#install-and-configure-nvidia-container-toolkit) way.    
The benefits of that are that you do not have to install drivers inside the CT, don't have to add `dev`ices or check groups, etc.    
This also helps with changing render device names (multi GPU) and so on. You will also not have the problem of driver version conflicts on upgrades.  
It's very simple and convenient and my recommended way to do this for NVIDIA GPUs.  

##### Prerequisite
Install the NVIDIA drivers [via `apt` (recommended)](https://gist.github.com/Impact123/3dbd7e0ddaf47c5539708a9cbcaab9e3#install-nvidia-driversmodules-via-apt) or [via `.run` file](https://gist.github.com/Impact123/3dbd7e0ddaf47c5539708a9cbcaab9e3#install-nvidia-driversmodules-via-run-file) and the [Nvidia Container Toolkit](#install-and-configure-nvidia-container-toolkit) on the node.    

##### Setup
Set a variable with a list of your CT IDs you want to configure. `pct list` shows them. In this example it's CT `400` and `55`.
```bash
CTIDS=(400 55)
```

Then simply copy and paste this into the node's CLI. This will prepend the needed lines into the CT's config file and reboot it.
```bash
{
for ct in $(pct list | awk '/^[0-9]/ {print $1}'); do
    if [[ ! "${CTIDS[@]}" =~ "$ct" ]]; then
      continue
    fi

    echo "# $ct"
  
    if grep -q "/usr/share/lxc/hooks/nvidia" "/etc/pve/lxc/${ct}.conf"; then
         echo "Already configured"
    else
        {
            echo "lxc.hook.pre-start: sh -c '[ ! -f /dev/nvidia0 ] && /usr/bin/nvidia-modprobe -c0 -u'"
            echo "lxc.environment: NVIDIA_VISIBLE_DEVICES=all"
            echo "lxc.environment: NVIDIA_DRIVER_CAPABILITIES=all"
            echo "lxc.hook.mount: /usr/share/lxc/hooks/nvidia"
            cat /etc/pve/lxc/${ct}.conf
        } > /etc/pve/lxc/${ct}.conf.new && mv /etc/pve/lxc/${ct}.conf.new /etc/pve/lxc/${ct}.conf

        echo "Configured"

        echo "pct reboot $ct"
        pct reboot "$ct"
    fi
done
}
```

##### Verify
If everything was done correctly, running `nvidia-smi` inside the CT should work.


#### Generic

##### Check groups

Check the `video` and `render` group ids inside the CT (from the node side). This is important later.    
The default ones below should work for debian.    

First we define which CTIDs we want to work with
```bash
# CT IDs to check the groups for
CTIDS=(5555 2222 55)
```

Then we check the video and render groups of the CTs with those IDs
```bash
for id in ${CTIDS[@]}; do
    echo "# $id"
    pct exec $id getent group video render | awk -F: '{print $1,$3}'
    echo ""
done
```

##### Add devices
This procedure simply calls `pct set IDOFCTHERE --devX /givenpath` for all the given paths and reboots the CT.    
It handles the optional gids (for the `video` and `render` groups) when given.    
Modify it to add more devices and change the gids. Invalid paths and CTs will be skipped so there's no need to remove anything you don't have.   

First we define which CTIDs we want to work with and which devices to pass to them
```bash
# CT IDs to add the devices to
CTIDS=(5555 2222 55)
```

Also see [Check which PCI(e) device a drm device belongs to](#check-which-pcie-device-a-drm-device-belongs-to).    
```bash
# Devices to add to the CT(s)
DEVICES=(
  "/dev/dri/renderD128,gid=104"
  "/dev/dri/renderD129,gid=104"
  "/dev/dri/renderD130,gid=104"
  "/dev/dri/renderD131,gid=104"
  "/dev/dri/card0,gid=44"
  "/dev/dri/card1,gid=44"
  "/dev/dri/card2,gid=44"
  "/dev/dri/card3,gid=44"
  "/dev/kfd,gid=104"
  "/dev/nvidia0"
  "/dev/nvidia1"
  "/dev/nvidia2"
  "/dev/nvidia3"
  "/dev/nvidiactl"
  "/dev/nvidia-uvm"
  "/invalid"
  "/dev/nvidia-uvm-tools"
)
```

Verify and show the group and user IDs for the devices on the node. The IDs/GIDs should match with the CT side above. If not modify them.    
Note: You can run this inside the CT too.    
```bash
{
function showDeviceInfo() {
  echo "user userName group groupName device"
  for device in "${DEVICES[@]}"; do
        trimmedDevice=${device%%,*}

        if [ -e "$trimmedDevice" ]; then
          echo "$(stat -c '%u %U %g %G %n' "$trimmedDevice") $device"
        fi
  done
}

showDeviceInfo | column -t
}
```

Run the rest of the script
```bash
{
for ct in $(pct list | awk '/^[0-9]/ {print $1}'); do
  if [[ ! "${CTIDS[@]}" =~ "$ct" ]]; then
    continue
  fi

  echo "# $ct"

  index=0
  for device in "${DEVICES[@]}"; do
      trimmedDevice=${device%%,*}

      if [ -e "$trimmedDevice" ]; then
        echo "pct set $ct --dev${index} $device"
        pct set "$ct" --dev${index} "$device"
        ((index++))
      fi
  done

  echo "pct reboot $ct"
  pct reboot "$ct"
done
}
```

## Install intel drivers/modules

### CT
Some of these packages may be needed for the intel drivers/modules/tools to work properly inside a CT, for example with jellyfin/frigate.    
This can be a little bit finicky so I stole part of the list from the helper script project.    
```bash
apt install -y va-driver-all ocl-icd-libopencl1 intel-opencl-icd vainfo intel-gpu-tools nvtop
```
Validate with `vainfo`, `intel_gpu_top` and `nvtop`.    


## Install nvidia drivers/modules via apt
~~**Broken with PVE 9.1 due to new kernel**~~. [Use the .run file method](#install-nvidia-driversmodules-via-run-file).

[If you have to use PVE 8 or Debian 12 see older version of this guide](https://gist.github.com/Impact123/3dbd7e0ddaf47c5539708a9cbcaab9e3/79c02ab9654ae368a60d9ff23fec147dc59d82c8#install-nvidia-driversmodules-via-apt).    
It's a simpler method as it uses packages straight from the debian repos. They might be a bit older but this should be fine and it makes installation simpler.    
This guide is a bit more opinionated. For example it "forces" you to use the DEB822 format and provides no alternative. Please read the comments for additional hints and options.

Most guides use nvidia's `.run` files but then you have to update the drivers manually. Instead you can use the drivers/libs from the debian apt repository and update them like any other package.       
Note that this has the disadvantage that you, at least by default unless you pin versions, have less control over updates and thus might need to reboot more often. For example when the version of the running driver doesn't match the libraries and tools any more.   

These instructions are based on [the official debian instructions](https://wiki.debian.org/NvidiaGraphicsDrivers)  
I modified them for easy copy pasting. These commands should work for nodes, VMs and CTs as long as they are based on debian. [If you use ubuntu please use their docs](https://documentation.ubuntu.com/server/how-to/graphics/install-nvidia-drivers/).     
This assumes you use the `root` user. **These commands are to be run on the node/VM/CT. Copy & paste.**

### Prerequisites
We need the `non-free` component. You should be able to run this to add the component to your [`/etc/apt/sources.list.d/debian.sources`](https://wiki.debian.org/SourcesList) file and update the lists
```bash
# Rewrites apt *.list files to *.sources in DEB822 format
apt modernize-sources

# Optional to delete the backup files of the modernize tool above
find /etc/apt/sources.list.d/ -type f -name "*.bak" -delete

# Rewrites the "Components:" line to add non-free and non-free-firmware
sed -i 's/^Components: .*/Components: main contrib non-free non-free-firmware/' /etc/apt/sources.list.d/debian.sources

# Updates the lists
apt update
```

**If your node/VM uses Secure Boot** (check with `mokutil --sb-state`) follow this section.    
**Make sure to monitor the next boot process via noVNC**. You will be asked for the password when importing the key.    
```bash
apt install dkms && dkms generate_mok

# Installs the matching kernel headers (PVE headers on a node, generic headers otherwise)
dpkg -s proxmox-ve > /dev/null 2>&1 && apt install proxmox-default-headers || apt install linux-headers-generic

# Set a simple password (a-z keys)
mokutil --import /var/lib/dkms/mok.pub

# If you followed this section after you already installed the driver run this and reboot
# dpkg-reconfigure nvidia-kernel-dkms
```

### Node / VM
```bash
apt install nvidia-detect

# Will likely recommend "nvidia-driver"
nvidia-detect

# "nvidia-smi" and "nvtop" are optional but recommended
apt install nvidia-driver nvidia-smi nvtop
```

### CT
Here we just need the libraries so `nvidia-driver` is replaced with `nvidia-driver-libs`.
```bash
# "nvidia-smi" and "nvtop" are optional but recommended
apt install nvidia-driver-libs nvidia-smi nvtop
```


### Verify installation
Now see if `nvidia-smi` works. A reboot might be necessary for the node or a VM.


### Post install
#### Enable Persistence Daemon
This can help save power and decrease access delays. [See docs](https://download.nvidia.com/XFree86/Linux-x86_64/396.51/README/nvidia-persistenced.html).   
**These commands are to be run on the node or VM. Copy & paste.**    
Enable and start it with
```bash
systemctl enable --now nvidia-persistenced.service
```
You can see the status in `nvidia-smi`.   
![image](https://gist.github.com/user-attachments/assets/e92eb823-470b-43f4-8e02-d962e749b27c)


## Install nvidia drivers/modules via .run file
This is my current recommendation for **PVE 9 / Debian 13**. 

This alternative to the apt installation method gives you more control over the version but you have to update yourself.   
These commands should work for nodes, VMs and CTs as long as they are based on debian/ubuntu.  
This assumes you use the `root` user. **These commands are to be run on the node/VM/CT. Copy & paste.**     

### Links and release notes
For datacenter GPUs (some links are broken but you can google for the version)
- <https://developer.nvidia.com/datacenter-driver-archive>
- <https://docs.nvidia.com/datacenter/tesla/index.html>

For linux/unix
- <https://www.nvidia.com/en-us/drivers/unix/linux-amd64-display-archive/>
- <https://www.nvidia.com/en-us/drivers/unix/>

### Download and install the .run file
You can also type `./NVIDIA` and press the `TAB` key to auto-complete the file name instead of using the `$(ls ...)` command substitution below.    

#### CT
```bash
wget LINKFROMABOVEHERE
chmod +x NVIDIA*.run
# Adjust if necessary. Add -q to skip questions
./$(ls -t NVIDIA*.run | head -n 1) --no-kernel-modules
```

#### VM
```bash
apt install -y linux-headers-generic gcc make dkms 
wget LINKFROMABOVEHERE
chmod +x NVIDIA*.run
# Adjust if necessary. Add -q to skip questions
./$(ls -t NVIDIA*.run | head -n 1) --dkms
```

#### Node
```bash
apt install -y proxmox-default-headers gcc make dkms
wget LINKFROMABOVEHERE
chmod +x NVIDIA*.run
# Adjust if necessary. Add -q to skip questions
./$(ls -t NVIDIA*.run | head -n 1) --dkms --disable-nouveau --kernel-module-type proprietary --no-install-libglvnd
```

#### Create and Enable Persistence Daemon
Also [see above](#enable-persistence-daemon)
```bash
cat <<EOF > /etc/systemd/system/nvidia-persistenced.service
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target

[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced --user nvpd
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable --now nvidia-persistenced.service
```


## Install and configure NVIDIA Container Toolkit
**These commands are to be run inside a CT or on the node. Copy & paste.**    
Install this on the node if you want to give an NVIDIA GPU to a CT and install it in the CT if you want to give a passed-through GPU to a docker container.    
Adapted from [the official guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).

```bash
{
apt update && apt install -y gpg curl --no-install-recommends
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor > /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

rm -f /etc/apt/sources.list.d/nvidia-container-toolkit.list
cat <<EOF > /etc/apt/sources.list.d/nvidia-container-toolkit.sources
Types: deb
URIs: http://nvidia.github.io/libnvidia-container/stable/deb/amd64/
Suites: /
Components:
Signed-By: /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
EOF

apt update && apt install -y nvidia-container-toolkit

systemctl status docker.service >/dev/null 2>&1 && nvidia-ctk runtime configure --runtime=docker

# This is needed for LXC or you might get an error like
# nvidia-container-cli: mount error: failed to add device rules: unable to find any existing
# device filters attached to the cgroup: # bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation
# not permitted: unknown
if [[ $(systemd-detect-virt) == "lxc" ]]; then
  nvidia-ctk config -i --set nvidia-container-cli.no-cgroups=true
fi

systemctl status docker.service >/dev/null 2>&1 && systemctl restart docker.service
}
```

If you installed this to run docker containers you can verify if it worked like this

```bash
docker run --rm --gpus all ubuntu nvidia-smi
```

## ZFS tips

### Check space usage and ratios

This sorts by compression ratio

```bash
zfs list -ospace,logicalused,compression,compressratio -rS compressratio
```

This sorts by used size
```bash
zfs list -ospace,logicalused,compression,compressratio -rS used
```

### Find old ZFS snapshots
If the above shows `USEDSNAP` being very high and you already deleted snapshots or have none, it might be from an old/broken migration task.    
It might make sense to add a ` | less` at the end if you have lots of snapshots.
```bash
zfs list -ospace,logicalused,compression,compressratio,creation -rs creation -t snap
```
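Once you've identified a stale snapshot it can be removed with `zfs destroy`. The dataset/snapshot name below is a placeholder; double-check what you destroy, snapshots cannot be restored.

```bash
zfs destroy nvmezfs/subvol-120-disk-0@oldsnapshot
```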

### Shrink a CT's disk
Since CTs use datasets this is very trivial and should be reasonably safe but make sure to take backups.    
First grab some information about the CT (ID 120 in this example) you want to modify

```bash
# zfs list -ospace,logicalused,refquota | grep -E "NAME|120"
NAME                       AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD  LUSED  REFQUOTA
nvmezfs/subvol-120-disk-0  7.02G  23.0G      160K   23.0G             0B         0B  28.3G       30G
```

Take note of `USED` and then simply set the `refquota` to what you want. Don't set the quota too low or lower than `USED`.      
```bash
zfs set refquota=29G nvmezfs/subvol-120-disk-0
```

Lastly run a `pct rescan`
```bash
# pct rescan
rescan volumes...
CT 120: updated volume size of 'nvmezfs:subvol-120-disk-0' in config.
```

This works for growing it too, but the GUI already provides that option.

### Update ZFS ARC size

Adapted from [the official documentation](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_zfs_limit_memory_usage)  
PVE uses 10% of the host's memory by default but it only configures the system like that if the OS was installed on ZFS.  
If you configure a ZFS storage after installation, the ZFS default of 50% will be used, which you probably don't want.    
This will change soon: <https://bugzilla.proxmox.com/show_bug.cgi?id=6285>.    
[PVE 9 / ZFS `2.3.x` removes the 50% limit on linux](https://github.com/openzfs/zfs/pull/15437).    

#### Validate

Check the current ARC size with

```bash
arc_summary -s arc

# Also helpful
arcstat

# To check hit ratios
arc_summary -s archits
```

Check the config file (which might not exist) with

```bash
cat /etc/modprobe.d/zfs.conf
```

#### Adapt config

The code below will try not to replace your file but only update it.    

To calculate a percentage of your total memory in G you can use this

```bash
PERCENTAGE=10
grep MemTotal /proc/meminfo | awk -v percentage=$PERCENTAGE '{print int(($2 / 1024^2) / 100 * percentage)}'
```

Set the size in G to adapt to with this

```bash
ARC_SIZE_G=32
```

Then let the code below do the rest

```bash
{
MEMTOTAL_BYTES="$(($(awk '/MemTotal/ {print $2}' /proc/meminfo) * 1024))"

ARC_SIZE_BYTES_MIN="$(( MEMTOTAL_BYTES / 32 ))"
ARC_SIZE_BYTES_MAX=$(( ARC_SIZE_G * 1024*1024*1024 ))

if [ "$ARC_SIZE_BYTES_MAX" -lt "$ARC_SIZE_BYTES_MIN" ]; then
    echo "Error: Given ARC Size of ${ARC_SIZE_BYTES_MAX} is lower than the current default minimum of ${ARC_SIZE_BYTES_MIN}. Please increase it."
    :
elif [ "$ARC_SIZE_BYTES_MAX" -gt "$MEMTOTAL_BYTES" ]; then
    echo "Error: Given ARC Size of ${ARC_SIZE_BYTES_MAX} is greater than the total memory of ${MEMTOTAL_BYTES}. Please decrease it."
    :
fi

echo "$ARC_SIZE_BYTES_MAX" > /sys/module/zfs/parameters/zfs_arc_max

if grep -q "options zfs zfs_arc_max" "/etc/modprobe.d/zfs.conf" 2> /dev/null; then
    sed -ri "s/.*options zfs zfs_arc_max.*/options zfs zfs_arc_max=$ARC_SIZE_BYTES_MAX # ${ARC_SIZE_G}G/gm" /etc/modprobe.d/zfs.conf
else
    echo -e "options zfs zfs_arc_max=$ARC_SIZE_BYTES_MAX # ${ARC_SIZE_G}G" >> /etc/modprobe.d/zfs.conf
fi
}
```

#### Final steps

Check the config and ARC again to see if everything looks alright, then finally update the initramfs. This is needed so the settings are persisted.

```bash
# -k all might not be needed and omitting it speeds up the process
update-initramfs -u -k all
```

A reboot is not necessary.

## Misc tips and scripts
Just some miscellaneous small tips and scripts which don't have a good place yet or are better linked to from above to keep things structured and organized.    

### Find unused disks/volumes
It goes without saying that you should be careful here. I trust you have backups.    

First rescan
```bash
qm rescan
pct rescan
```

Now find unused disks in the configs
```bash
# grep -sR "^unused[0-9]+: " /etc/pve/
/etc/pve/nodes/pve/qemu-server/500.conf:unused0: nvmezfs:vm-500-disk-1
```

Investigate their source
```bash
# pvesm path nvmezfs:vm-500-disk-1
/dev/zvol/nvmezfs/vm-500-disk-1
```

Show all of their paths
```bash
grep -sR "^unused[0-9]+: " /etc/pve/ | awk -F': ' '{print $2}' | xargs -I{} pvesm path {}
```

Then delete if needed
```bash
# qm set 500 --delete unused0
```

Here's a little script to do all of this for you. It only prints the commands, it does not run them.    
```bash
{
find /etc/pve/ -name '[0-9]*.conf' | while read -r config; do
    [[ "$config" == *"/lxc/"* ]] && CMD="pct" || CMD="qm"

    guest=$(basename "$config" .conf)
    unused_lines=$(grep -E '^unused[0-9]+: ' "$config") || continue

    echo "$unused_lines" | while read -r line; do
        echo "# $line"
        disk=$(echo "$line" | awk -F':' '{print $1}')
        echo -e  "$CMD set $guest --delete $disk\n"
    done
done
}
```

### Restore guest configs

A script that can extract the `.conf` file out of [`pmxcfs`](<https://pve.proxmox.com/wiki/Proxmox_Cluster_File_System_(pmxcfs)>)'s `config.db`.  
Only lightly tested and written without a lot of checks so be careful. Make a backup of the file and install `sqlite3` with `apt install sqlite3`.

```bash
#!/usr/bin/env bash
# Attempts to restore .conf files from a PMXCFS config.db file.
set -euo pipefail

# Usually at /var/lib/pve-cluster/config.db
# You can do "cd /var/lib/pve-cluster/" and leave CONFIG_FILE as is
CONFIG_FILE="config.db"

# Using these paths can be convenient but dangerous!
# /etc/pve/nodes/$(hostname)/qemu-server/
VM_RESTORE_PATH="vms"

# /etc/pve/nodes/$(hostname)/lxc/
CT_RESTORE_PATH="cts"

[ -d "$VM_RESTORE_PATH" ] || mkdir "$VM_RESTORE_PATH"
[ -d "$CT_RESTORE_PATH" ] || mkdir "$CT_RESTORE_PATH"

GUESTIDS=$(sqlite3 $CONFIG_FILE "select name from tree where name like '%.conf' and name != 'corosync.conf';")

for guest in $GUESTIDS; do
    sqlite3 $CONFIG_FILE "select data from tree where name like '$guest';" >"$guest"

    if grep -q "rootfs" "$guest"; then
        mv "$guest" "$CT_RESTORE_PATH"
        echo "Restored CT config $guest to $VM_RESTORE_PATH/$guest"
    else
        mv "$guest" "$VM_RESTORE_PATH"
        echo "Restored VM config $guest to $CT_RESTORE_PATH/$guest"
    fi
done
```

### Monitor disk SMART information

You can monitor all your disks' SMART info like this. This creates a nice "table" and highlights changes.

Temperature

```bash
watch -x -c -d -n1 bash -c 'for i in /dev/{nvme[0-9]n1,sd[a-z]}; do echo -e "\n[$i]"; smartctl -a $i | grep -Ei "Device Model|Model Number|Serial|temperature"; done'
```

Errors

```bash
watch -x -c -d -n1 bash -c 'for i in /dev/{nvme[0-9]n1,sd[a-z]}; do echo -e "\n[$i]"; smartctl -a $i | grep -Ei "Device Model|Model Number|Serial|error"; done'
```

Temperature and writes

```bash
watch -x -c -d -n1 bash -c 'for i in /dev/{nvme[0-9]n1,sd[a-z]}; do echo -e "\n[$i]"; smartctl -a $i | grep -Ei "Device Model|Model Number|Serial|temperature|writ"; done'
```

and so on.

### Credentials
PVE keeps credentials like CIFS passwords in `/etc/pve/priv/storage`.
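To see which storages have credentials stored there you can simply list the directory.

```bash
ls -l /etc/pve/priv/storage/
```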


### Monitor swap usage
```bash
apt install smem --no-install-suggests --no-install-recommends
# -a, --autosize        size columns to fit terminal size
# -t, --totals          show totals
# -k, --abbreviate      show unit suffixes
# -r, --reverse         reverse sort
# -s SORT, --sort=SORT  field to sort on
watch -n1 'smem -atkr -s swap'
```

### Check which PCI(e) device a drm device belongs to
If you have multiple GPUs you will likely have multiple `/dev/dri/card*` and `/dev/dri/renderD*` devices.    
Note the values before and after the `->`. In this example `01:00.0`, `05:00.0` and `09:00.0`    
```bash
# ls -l /sys/class/drm/*/device
lrwxrwxrwx 1 root root 0 May 17 07:54 /sys/class/drm/card0/device -> ../../../0000:05:00.0
lrwxrwxrwx 1 root root 0 May 17 07:54 /sys/class/drm/card1/device -> ../../../0000:09:00.0
lrwxrwxrwx 1 root root 0 May 17 07:54 /sys/class/drm/card2/device -> ../../../0000:01:00.0
lrwxrwxrwx 1 root root 0 May 17 07:54 /sys/class/drm/renderD128/device -> ../../../0000:09:00.0
lrwxrwxrwx 1 root root 0 May 17 07:54 /sys/class/drm/renderD129/device -> ../../../0000:01:00.0
```

You can then cross-reference them with the first column of `lspci | grep -i "VGA"`
```bash
# lspci | grep -i "VGA"
01:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)
05:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c8)
```

### Persistent renderd12*/card* or other device names
**This doesn't work correctly at this time**. I'll leave it here because it might be useful for other things.

Device paths such as `/dev/dri/renderD128` and `/dev/dri/card0` can change their name across boots similar to `/dev/sdX` for disks.    
We can use udev rules to create a symlink that will refer to the right device. [Also see the Arch Wiki article about UDEV](https://wiki.archlinux.org/title/Udev).    
[Check which PCIe device a DRM device belongs to first](#check-which-pcie-device-a-drm-device-belongs-to).    
Also see [Check device and drivers](#check-device-and-drivers) to get the vendor and device ids.    

#### Simple way for GPUs
Create a file in `/etc/udev/rules.d/` via `nano /etc/udev/rules.d/99-gpu-render.rules` and put this in it
```bash
# Render
SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/renderD[0-9]*", KERNEL=="renderD[0-9]*" \
    SYMLINK+="dri/render-$attr{vendor}_$attr{device}-$attr{subsystem_vendor}_$attr{subsystem_device}-$driver_$env{ID_PATH_TAG}"

SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/renderD[0-9]*", KERNEL=="renderD[0-9]*" \
    SYMLINK+="dri/render-$attr{vendor}_$attr{device}-$attr{subsystem_vendor}_$attr{subsystem_device}-$driver"

SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/renderD[0-9]*", KERNEL=="renderD[0-9]*" \
    SYMLINK+="dri/render-$driver_$env{ID_PATH_TAG}"

# Video/Card
SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/card[0-9]*", KERNEL=="card[0-9]*" \
    SYMLINK+="dri/card-$attr{vendor}_$attr{device}-$attr{subsystem_vendor}_$attr{subsystem_device}-$driver_$env{ID_PATH_TAG}", GROUP="video"

SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/card[0-9]*", KERNEL=="card[0-9]*" \
    SYMLINK+="dri/card-$attr{vendor}_$attr{device}-$attr{subsystem_vendor}_$attr{subsystem_device}-$driver", GROUP="video"

SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/card[0-9]*", KERNEL=="card[0-9]*" \
    SYMLINK+="dri/card-$driver_$env{ID_PATH_TAG}", GROUP="video"
```
Then reload and trigger udev
```bash
udevadm control --reload-rules  && udevadm trigger --subsystem-match=drm
```
Now check `ls -l /dev/dri/`. This rules file should have dynamically created links like this in `/dev/dri/`    
```bash
render-nvidia_pci-0000_01_00_0
render-0x10de_0x2204-0x1043_0x87b3-nvidia
render-0x10de_0x2204-0x1043_0x87b3-nvidia_pci-0000_01_00_0

card-nvidia_pci-0000_01_00_0
card-0x10de_0x2204-0x1043_0x87b3-nvidia_pci
card-0x10de_0x2204-0x1043_0x87b3-nvidia_pci-0000_01_00_0
```
This allows you to easily and reliably refer to a specific GPU's device to pass to a CT.    
Note that these links change if the PCI ID does too. The ID is needed to uniquely refer to a device so it's part of the link name.

For NVIDIA I'd generally recommend the NVIDIA toolkit for which this is mostly useless but if you know of a simple way to achieve this for NVIDIA's `card0` devices let me know.

#### Complete example also for other devices
This works for other things such as USB devices too. In this example I will work with these two GPUs. Take note of the first column.    
```bash
# lspci -nnk | grep -i "VGA"
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102 [GeForce RTX 3090] [10de:2204] (rev a1)
09:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] (rev c8)
```

I have these devices
```bash
# ls -l /sys/class/drm/*/device
lrwxrwxrwx 1 root root 0 Sep 25 22:16 /sys/class/drm/card0/device -> ../../../0000:05:00.0
lrwxrwxrwx 1 root root 0 Sep 25 22:16 /sys/class/drm/card0-VGA-1/device -> ../../card0
lrwxrwxrwx 1 root root 0 Sep 25 22:16 /sys/class/drm/card1/device -> ../../../0000:01:00.0
lrwxrwxrwx 1 root root 0 Sep 25 22:16 /sys/class/drm/card2/device -> ../../../0000:09:00.0
lrwxrwxrwx 1 root root 0 Sep 25 22:16 /sys/class/drm/renderD128/device -> ../../../0000:01:00.0
lrwxrwxrwx 1 root root 0 Sep 25 22:16 /sys/class/drm/renderD129/device -> ../../../0000:09:00.0
```

As you can see `renderD129` points to my iGPU (`09:00.0`) and it's what I use in this example.

Check for unique attributes to target the device
```bash
# udevadm info --attribute-walk --name=/dev/dri/renderD129 | grep -E "SUBSYSTEM|KERNEL|{device}|{vendor}"
    KERNEL=="renderD129"
    SUBSYSTEM=="drm"
    KERNELS=="0000:09:00.0"
    SUBSYSTEMS=="pci"
    ATTRS{device}=="0x1638"
    ATTRS{vendor}=="0x1002"
    KERNELS=="0000:00:08.1"
    SUBSYSTEMS=="pci"
    ATTRS{device}=="0x1635"
    ATTRS{vendor}=="0x1022"
    KERNELS=="pci0000:00"
    SUBSYSTEMS==""
```
Here `KERNELS=="0000:09:00.0"`, `ATTRS{device}=="0x1635"` and `ATTRS{vendor}=="0x1022"` match with my iGPU so I'll use that.


Create a file in `/etc/udev/rules.d/` via `nano /etc/udev/rules.d/99-gpu-render.rules`.    
Mine looks like this for both GPUs' devices
```bash
# iGPU
SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/renderD[0-9]*", KERNEL=="renderD[0-9]*", ATTRS{vendor}=="0x1002", ATTRS{device}=="0x1638", \
    SYMLINK+="dri/render-igpu", GROUP="render"

SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/card[0-9]*", KERNEL=="card[0-9]*", ATTRS{vendor}=="0x1002", ATTRS{device}=="0x1638", \
    SYMLINK+="dri/card-igpu", GROUP="video"

# dGPU
SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/renderD[0-9]*", KERNEL=="renderD[0-9]*", ATTRS{vendor}=="0x10de", ATTRS{device}=="0x2204", \
    SYMLINK+="dri/render-dgpu", GROUP="render"

SUBSYSTEM=="drm", ENV{DEVNAME}=="/dev/dri/card[0-9]*", KERNEL=="card[0-9]*", ATTRS{vendor}=="0x10de", ATTRS{device}=="0x2204", \
    SYMLINK+="dri/card-dgpu", GROUP="video"
```
Feel free to use more fitting names. 

Finally reload and trigger udev
```bash
udevadm control --reload-rules  && udevadm trigger --subsystem-match=drm
```

See if the symlinks appear via `ls -l /dev/dri/` and point to the right devices.    
This works the same way for other such devices.


### Check which PCI(e) device a disk belongs to
This is useful if you want to know which controller a disk is connected to.    
Note the values before and after the `->`. In this example `02:00.1` and `08:00.0`.            
```bash
# ls -l /dev/disk/by-path/
lrwxrwxrwx 1 root root  9 Jul  1 18:05 pci-0000:02:00.1-ata-2 -> ../../sda
lrwxrwxrwx 1 root root 13 Jul  1 18:05 pci-0000:08:00.0-nvme-1 -> ../../nvme0n1
```

You can then cross-reference them with the first column of `lspci`
```bash
# lspci
02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 500 Series Chipset SATA Controller
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
```
Note that you can't necessarily rely on the name to always refer to the same device. 


### IO debugging
This section is about how to check which process causes IO Wait/IO Delay, which disk is affected by it, how fast it reads/writes and so on.    
Also see these articles:
- https://www.site24x7.com/learn/linux/troubleshoot-high-io-wait.html
- https://linuxblog.io/what-is-iowait-and-linux-performance/
- https://serverfault.com/questions/367431/what-creates-cpu-i-o-wait-but-no-disk-operations



#### General
Install the dependencies first.
```bash
apt install -y sysstat iotop-c fatrace
```

##### IO Delay
IO delay or IO Wait is shown in the PVE node's `Summary` and `top` can also be used to check the IO wait via its `wa` field in the CPU row.    

![image](https://gist.github.com/user-attachments/assets/fc4757e1-7fa8-423f-a641-5affcb341ccd)    
![image](https://gist.github.com/user-attachments/assets/65c9620b-002b-4c7f-9cc3-39f7f8ca2540)
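For a quick non-interactive look at the same `wa` value you can use `top` in batch mode with a single iteration.

```bash
# The "wa" value in the %Cpu(s) line is the IO wait percentage
top -bn1 | head -n5
```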

  
##### iotop-c 
[`iotop-c`](https://manpages.debian.org/trixie/iotop-c/iotop-c.8.en.html) can show per process statistics. For it to properly work (see why below) you should [add the `delayacct` kernel arg](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot_edit_kernel_cmdline) and reboot.    
> Alternatively, use `sysctl kernel.task_delayacct` to switch the state at runtime.    
> Note however that **only tasks started after enabling it will have delayacct information**.  
  
https://docs.kernel.org/accounting/delay-accounting.html#usage

Run this and check the column (select it via arrow keys) you're interested in.      
```bash
# -c, --fullcmdline      show full command line
# -P, --processes        only show processes, not all threads
# -a, --accumulated      show accumulated I/O instead of bandwidth
iotop-c -cP
```    
Also try `iotop-c -cPa` or press `a` to toggle cumulative/summary/total mode and let it run for a while.    
![image](https://gist.github.com/user-attachments/assets/d2995763-1bdb-4cfb-98ca-9b87ae279b8d)

##### iostat   
[`iostat`](https://manpages.debian.org/trixie/sysstat/iostat.1.en.html) can show statistics per disk. Run this and check the `%util` column.    
```bash
# -x         Display extended statistics.
# -y         Omit first report with statistics since system boot.
# -z         Omit output for devices for which there was no activity during the sample period
# -t         Print the time for each report displayed.
# -s         Display a short (narrow) version of the report up to 80 characters.
# --compact  Don't break the Device Utilization Report into sub-reports.
# --human    Print sizes in human readable format (e.g. 1.0k, 1.2M, etc.).
iostat -xyzts --compact --human 1
```    

![image](https://gist.github.com/user-attachments/assets/9759939f-bad1-467e-b61a-dc9cd1d8aa68)

##### fatrace
[`fatrace`](https://manpages.debian.org/stretch/fatrace/fatrace.1.en.html) can be used to check file events such as read, write, create and so on. It can help you identify which processes are modifying files and when. Here's an example to listen for file writes
```bash
# -t, --timestamp               Add timestamp to events. Give twice for seconds since the epoch.
# -f TYPES, --filter=TYPES      Show only the given event types; C, R, O, or W, e. g. --filter=OC
fatrace -tf W
```    

![image](https://gist.github.com/user-attachments/assets/725c6b2a-adef-4e97-9215-4b6b380be7f2)
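
A variant I find handy, using further flags from the `fatrace` man page (`-c` limits it to the file system of the current directory, `-s` stops after N seconds, `-o` writes to a file). `/var/lib/vz` is just an example directory:
```bash
# Trace writes on the file system backing /var/lib/vz for 60 seconds and log them
cd /var/lib/vz
fatrace -c -tf W -s 60 -o /tmp/fatrace.log
```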



#### ZFS related
If you use RAID you might also want to use [`zpool iostat`](https://openzfs.github.io/openzfs-docs/man/master/8/zpool-iostat.8.html)'s `-v` flag
> Verbose statistics. Reports usage statistics for individual vdevs within the pool, in addition to the pool-wide statistics.
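
Following the same pattern as the commands below, that could look like this:
```bash
# -v      Also report statistics for the individual vdevs within the pool
watch -cd -n1 "zpool iostat -vy 1 1"
```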

##### Checking ZFS latency stats
```bash
# -y      Normally the first line of output reports the statistics since boot: suppress it.
# -l      Include average latency statistics:
watch -cd -n1 "zpool iostat -yl 1 1"
```
![image](https://gist.github.com/user-attachments/assets/7e67c7de-5576-4ae2-a543-252894b6ba1e)

##### Checking ZFS queue stats
```bash
# -q      Include  active  queue  statistics.
watch -cd -n1 "zpool iostat -yq 1 1"
```
![image](https://gist.github.com/user-attachments/assets/53213bbe-e135-418b-b52e-d9d7b68126c5)



##### Checking ZFS request sizes
```bash
# -r      Print request size histograms for the leaf vdev's I/O
watch -cd -n1 "zpool iostat -yr 1 1"
```
![image](https://gist.github.com/user-attachments/assets/97c0de6c-a541-4941-944a-bd9f549f3bc3)


### Set up no-subscription apt repositories
With PVE 9 / Debian 13 repository files can now also use the `.sources` suffix (deb822 format), so don't get confused by that.

Also see official docs:    
 - <https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_no_subscription_repo>    
 - <https://pve.proxmox.com/pve-docs/pve-admin-guide.html#repos_secure_apt>

#### GUI
Go to `node > Updates > Repositories` and add the `no-subscription` repo.    
![image](https://gist.github.com/user-attachments/assets/f6386651-f457-4f48-a22f-4bcd245ba899)
![image](https://gist.github.com/user-attachments/assets/51a83cb5-4c57-4bef-869f-5cee21382cc0)  

Disable the enterprise repos    
  
![image](https://gist.github.com/user-attachments/assets/1a29b310-c14e-4d6f-bc18-2ca6d05bbdd3)    
![image](https://gist.github.com/user-attachments/assets/1e355401-ce2e-41ed-ae7f-2fa90e1ce4a0)    

At the end it should look like this.
![image](https://gist.github.com/user-attachments/assets/79883dcb-7d75-4efa-b3a6-3dda631581a9)    

Go to `node > Updates > Refresh` and see if everything works as expected.    


#### CLI

##### PVE 8 / Debian 12
Here's an example `/etc/apt/sources.list` file
```bash
deb http://ftp.debian.org/debian bookworm main contrib
deb http://ftp.debian.org/debian bookworm-updates main contrib

# security updates
deb http://security.debian.org/debian-security bookworm-security main contrib

# Proxmox VE pve-no-subscription repository provided by proxmox.com,
# NOT recommended for production use
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription
```
To keep the default one and add just the proxmox repo in its own file you can do this
```bash
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list
```

You should disable the default enterprise repos at this point by commenting out the lines
```bash
sed -i '/^#/!s/^/#/' /etc/apt/sources.list.d/pve-enterprise.list
```

Now run `apt update` and check for errors.


##### PVE 9 / Debian 13
[You can find an example `/etc/apt/sources.list.d/proxmox.sources` file here](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_no_subscription_repo).   

It looks like this
```bash
Types: deb
URIs: http://download.proxmox.com/debian/pve
Suites: trixie
Components: pve-no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
```

To create this file you can use this command
```bash
cat > /etc/apt/sources.list.d/proxmox.sources << EOF
Types: deb
URIs: http://download.proxmox.com/debian/pve
Suites: trixie
Components: pve-no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
EOF
```

You might have to download the key it references via
```bash
wget https://enterprise.proxmox.com/debian/proxmox-archive-keyring-trixie.gpg -O /usr/share/keyrings/proxmox-archive-keyring.gpg
```



You should disable the default enterprise repos at this point by commenting out the lines (or appending `Enabled: no`)
```bash
sed -i '/^#/!s/^/#/' /etc/apt/sources.list.d/pve-enterprise.sources
```
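
If you prefer the `Enabled:` route, appending the field also works. Just make sure it ends up inside the stanza (not separated from it by a blank line):
```bash
# Mark the enterprise repo as disabled instead of commenting it out
echo "Enabled: no" >> /etc/apt/sources.list.d/pve-enterprise.sources
```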

Now run `apt update` and check for errors.


### Fix locales
Do you have strange characters in your CLI tools rather than unicode symbols? The default `C` locale might be the cause.    
This is mostly useful for CTs. For VMs you generally set this up during install.

To interactively change it you can use
```bash
dpkg-reconfigure locales
```

To non-interactively change it you can use something like this
```bash
echo "en_US.UTF-8 UTF-8" > /etc/locale.gen
echo 'LANG=en_US.UTF-8' > /etc/locale.conf
ln -sf /etc/locale.conf /etc/default/locale
source /etc/locale.conf
locale-gen
```

Verify with these
```bash
locale
localectl
```

### Enable package notifications
PVE is able to send you notifications about updates which look something like this
```bash
The following updates are available:

Package Name    Installed Version     Available Version     
libxslt1.1      1.1.35-1.2+deb13u1    1.1.35-1.2+deb13u2    
xsltproc        1.1.35-1.2+deb13u1    1.1.35-1.2+deb13u2    
```

To enable them run this

```bash
pvesh set /cluster/options --notify package-updates=always
```
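
To double-check the setting afterwards you can dump the cluster options and look for the `notify` entry
```bash
pvesh get /cluster/options
```
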
I also like to install `apticron` and `apt-listchanges`, which give a lot more detail, including changelogs
```bash
apt install apticron apt-listchanges
```

### Filter journal messages
This is only mildly related to PVE, but I'll show it with a relevant feature: the QEMU Guest Agent.    
It very often logs messages like these
```bash
info: guest-ping called
info: guest-fsthaw called
info: guest-fsfreeze called
```
If you want to prevent that you can use a service override like this
```bash
systemctl edit qemu-guest-agent
```
As a filter you can use this
```bash
[Service]
LogFilterPatterns=~guest-ping
```

You can also filter for more things like this
```bash
[Service]
LogFilterPatterns=~guest-ping
LogFilterPatterns=~guest-fs(freeze|thaw)
```
I chose not to do that though, as these messages appear rarely and might be useful for debugging issues.

You might have to reload the daemon like this (`systemctl edit` should do that already)
```bash
systemctl daemon-reload
```
and restart the service like this
```bash
systemctl restart qemu-guest-agent
```
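
Afterwards you can follow the unit's journal for a while to confirm the filtered messages no longer show up
```bash
# -f follow new entries, -u limit output to the given unit
journalctl -fu qemu-guest-agent
```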


### FAQ
#### Why not use `local` for guest disks?
File-based disks (stored on `Directory` type storages) such as `.qcow2`, `.raw`, and so on can have some issues.    
By default, PVE does not enable the [`Content Type`s](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#storage_directory) needed to store such files on the `local` storage.    

- [They can be slow and inefficient](https://bugzilla.proxmox.com/show_bug.cgi?id=6140).
- [CTs only support `.raw` files](https://bugzilla.proxmox.com/show_bug.cgi?id=5814), [which provide no snapshot ability](https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_storage_types).
- Thin provisioning doesn't necessarily work.
- You get worse or no support for less common storage configurations.
- They use the same storage as the OS/system.
- There is no replication support.