[5.15-velinux] Backported KVM phase 2 patches for velinux-5.15 kernel
Hardware Feature Enablement / Support
These patches add support for new CPU features, CPUID leaves, and hardware capabilities on Intel and AMD platforms:
- x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
- KVM: x86/cpuid: Add AMD CPUID ExtPerfMonAndDbg leaf 0x80000022
- KVM: x86/svm/pmu: Add AMD PerfMonV2 support
- x86/cpu: Support AMD Automatic IBRS
- x86/cpu, kvm: Add the Null Selector Clears Base feature
- KVM: x86: add support for CPUID leaf 0x80000021
- KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
- perf/x86/core: Completely disable guest PEBS via guest's global_ctrl
- KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
- KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
- KVM: SVM: include CR3 in initial VMSA state for SEV-ES guests
Performance Improvements
Optimizations to reduce overhead and improve runtime efficiency:
- KVM: x86/pmu: Rewrite reprogram_counters() to improve performance
- KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
- KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
- KVM: x86/pmu: Use binary search to check filtered events
- KVM: x86/pmu: Avoid using PEBS perf_events for normal counters
Refactoring / Code Reorganization
These changes focus on improving maintainability, readability, and structure of the KVM/x86 codebase:
- KVM: x86/cpuid: Refactor host/guest CPU model consistency check
- KVM: x86: Introduce __kvm_get_hypervisor_cpuid() helper
- KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers
- KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation code
- KVM: nVMX: Refactor PMU refresh to avoid referencing kvm_x86_ops.pmu_ops
- KVM: x86: Move guts of kvm_arch_init() to standalone helper
- KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86
- Move various helpers (e.g., pmc_perf_hw_id())
- KVM: x86: Use more verbose names for mem encrypt kvm_x86_ops hooks
Feature Enhancements (vPMU / CPUID / MSRs / etc.)
These patches provide new capabilities, configurable options, and fine-grained control to user space or the guest:
- KVM: x86: Provide per VM capability for disabling PMU virtualization
- KVM: x86/svm: Add module param to control PMU virtualization
- KVM: x86/pmu: Restrict advanced features based on module enable_pmu
- KVM: x86/pmu: Advertise PERFCTR_CORE iff the min nr of counters is met
- KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
- KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT
- KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible
- KVM: x86: Move lookup of indexed CPUID leafs to helper
User/Guest Safety / Fault Isolation
These patches prevent invalid or insecure guest/host interactions:
- KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled
- KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits
- KVM: x86/pmu: Prevent the PMU from counting disallowed events
- KVM: x86/pmu: Limit the maximum number of supported AMD GP counters
- KVM: x86/pmu: Limit the maximum number of supported Intel GP counters
- KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run
Bug Fixes
These patches address correctness issues, warnings, and reliability concerns in KVM and vPMU:
- KVM: x86: Fix errant brace in KVM capability handling
- KVM: x86/pmu: Fix type length error when reading pmu->fixed_ctr_ctrl
- KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL
- KVM: x86: Fix pointer mistmatch warning when patching RET0 static calls
- KVM: x86: Fix clang -Wimplicit-fallthrough in do_host_cpuid()
- KVM: x86: avoid out of bounds indices for fixed performance counters
- KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms
- kvm: x86/pmu: Fix the compare function used by the pmu event filter
Documentation or Cleanup / Misc
Documentation fixes, renames, and cleanups that make the codebase easier to work with:
- docs: kvm: x86: Fix broken field list
- KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask
- KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()
- KVM: x86: Rename kvm_x86_ops pointers to align w/ preferred vendor names
- KVM: x86: Remove defunct pre_block/post_block kvm_x86_ops hooks
- KVM: x86: Move CPUID.(EAX=0x12,ECX=1) mangling to __kvm_update_cpuid_runtime()
- KVM: x86: use static_call_cond for optional callbacks
Run Test Cases
$ git clone https://gitlab.com/kvm-unit-tests/kvm-unit-tests.git
$ cd kvm-unit-tests/
$ ./configure
$ make
$ ./run_tests.sh
root@volcano9b5e-os:/home/amd/Linux_Backport/kvm-unit-tests# ./run_tests.sh
PASS apic-split (56 tests)
PASS ioapic-split (19 tests)
PASS x2apic (56 tests)
FAIL xapic (timeout; duration=60)
PASS ioapic (26 tests)
SKIP cmpxchg8b (i386 only)
PASS smptest (1 tests)
PASS smptest3 (1 tests)
PASS vmexit_cpuid
PASS vmexit_vmcall
PASS vmexit_mov_from_cr8
PASS vmexit_mov_to_cr8
PASS vmexit_inl_pmtimer
PASS vmexit_ipi
PASS vmexit_ipi_halt
PASS vmexit_ple_round_robin
PASS vmexit_tscdeadline
PASS vmexit_tscdeadline_immed
PASS vmexit_cr0_wp
PASS vmexit_cr4_pge
PASS access (2 tests)
SKIP access_fep (test marked as manual run only)
SKIP access-reduced-maxphyaddr (/sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr not equal to Y)
PASS smap (18 tests)
PASS pku (7 tests)
SKIP pks (0 tests)
PASS asyncpf (2 tests, 1 skipped)
PASS emulator (140 tests, 2 skipped)
PASS eventinj (13 tests)
PASS hypercall (2 tests)
PASS idt_test (4 tests)
PASS memory (7 tests, 1 skipped)
PASS msr (1836 tests)
SKIP pmu (/proc/sys/kernel/nmi_watchdog not equal to 0)
SKIP pmu_lbr (/proc/sys/kernel/nmi_watchdog not equal to 0)
SKIP pmu_pebs (/proc/sys/kernel/nmi_watchdog not equal to 0)
SKIP vmware_backdoors (/sys/module/kvm/parameters/enable_vmware_backdoor not equal to Y)
PASS realmode
PASS s3
PASS setjmp (10 tests)
PASS sieve
PASS syscall (2 tests)
PASS tsc (6 tests)
PASS tsc_adjust (6 tests)
PASS xsave (17 tests)
PASS rmap_chain
FAIL svm
SKIP svm_pause_filter (1 tests, 1 skipped)
PASS svm_npt (103 tests)
SKIP taskswitch (i386 only)
SKIP taskswitch2 (i386 only)
PASS kvmclock_test
PASS pcid-enabled (2 tests)
PASS pcid-disabled (2 tests)
PASS pcid-asymmetric (2 tests)
PASS rdpru (1 tests)
PASS umip (21 tests)
SKIP la57 (i386 only)
SKIP vmx (0 tests)
SKIP ept (0 tests)
SKIP vmx_eoi_bitmap_ioapic_scan (0 tests)
SKIP vmx_hlt_with_rvi_test (0 tests)
SKIP vmx_apicv_test (0 tests)
SKIP vmx_posted_intr_test (0 tests)
SKIP vmx_apic_passthrough_thread (0 tests)
SKIP vmx_init_signal_test (0 tests)
SKIP vmx_sipi_signal_test (0 tests)
SKIP vmx_apic_passthrough_tpr_threshold_test (0 tests)
SKIP vmx_vmcs_shadow_test (0 tests)
SKIP vmx_pf_exception_test (0 tests)
SKIP vmx_pf_exception_test_fep (test marked as manual run only)
SKIP vmx_pf_vpid_test (test marked as manual run only)
SKIP vmx_pf_invvpid_test (test marked as manual run only)
SKIP vmx_pf_no_vpid_test (test marked as manual run only)
SKIP vmx_pf_exception_test_reduced_maxphyaddr (/sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr not equal to Y)
PASS debug (23 tests)
PASS hyperv_synic (1 tests)
PASS hyperv_connections (7 tests)
PASS hyperv_stimer (12 tests)
PASS hyperv_stimer_direct (8 tests)
PASS hyperv_clock (3 tests)
PASS intel_iommu (11 tests)
SKIP tsx-ctrl (1 tests, 1 skipped)
SKIP intel_cet (0 tests)
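Several of the SKIP results above, including pmu, pmu_lbr and pmu_pebs, come from a host precondition rather than from the patches. A minimal sketch for checking those preconditions before a run is given here; the helper name `param_state` is hypothetical, and the paths and expected values are copied from the SKIP messages.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel/module parameter file holds
# the value a test expects. Prints "ok", "mismatch (<actual>)" or "missing".
param_state() {
    file=$1
    expected=$2
    if [ ! -r "$file" ]; then
        echo "missing"
        return 0
    fi
    actual=$(cat "$file")
    if [ "$actual" = "$expected" ]; then
        echo "ok"
    else
        echo "mismatch ($actual)"
    fi
}

# Preconditions quoted in the SKIP lines of the run above.
for check in \
    "/proc/sys/kernel/nmi_watchdog 0" \
    "/sys/module/kvm/parameters/enable_vmware_backdoor Y" \
    "/sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr Y"
do
    # Split "path value" into $1 and $2.
    set -- $check
    printf '%s: %s\n' "$1" "$(param_state "$1" "$2")"
done
```

On an AMD host the kvm_intel entry is expected to report "missing". To exercise the PMU tests, write 0 to /proc/sys/kernel/nmi_watchdog as root and re-run ./run_tests.sh.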
- Verified via debug logs
root@volcano9dee-host:/home/amd# dmesg | grep Malathi
[ 29.605656] amd_pmu_v2_enable_all is called : Malathi
[ 29.607491] amd_pmu_v2_disable_all is called : Malathi
[ 29.618834] amd_pmu_v2_disable_all is called : Malathi
[ 29.619491] amd_pmu_v2_enable_all is called : Malathi
[ 29.619530] amd_pmu_v2_disable_all is called : Malathi
[ 29.620491] amd_pmu_v2_enable_all is called : Malathi
[ 29.620493] amd_pmu_v2_disable_all is called : Malathi
[ 29.621491] amd_pmu_v2_enable_all is called : Malathi
[ 29.621491] amd_pmu_v2_disable_all is called : Malathi
[ 29.621491] amd_pmu_v2_enable_all is called : Malathi
[ 29.624441] amd_pmu_v2_disable_all is called : Malathi
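The enable/disable trace above can be summarized per function instead of being read line by line. The following is a rough sketch, assuming the temporary debug message keeps the "<function> is called" shape shown in the log; the helper name `count_pmu_calls` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg-style lines on stdin and print
# "<function> <count>" for every "<function> is called" message.
count_pmu_calls() {
    awk '/ is called/ {
             # The function name is the field right before the word "is".
             for (i = 2; i <= NF; i++)
                 if ($i == "is") { n[$(i - 1)]++; break }
         }
         END { for (f in n) print f, n[f] }' | sort
}
```

Usage on the host would be along the lines of `dmesg | grep Malathi | count_pmu_calls`.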
- Validated PMU with perf
root@volcano9dee-host:/home/amd/upstream_work/original_kernel_velinux/kernel/tools/perf# ./perf stat -e cycles,instructions a.out sleep 3
Hello,World
Performance counter stats for 'a.out sleep 3':
5,74,208 cycles
6,08,549 instructions # 1.06 insn per cycle
0.008310204 seconds time elapsed
0.000000000 seconds user
0.000797000 seconds sys
root@volcano9dee-host:/home/amd/upstream_work/original_kernel_velinux/kernel/tools/perf# ./perf record
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.273 MB perf.data (1560 samples) ]
./perf report | grep pmu
0.11% swapper [kernel.kallsyms] [k] x86_pmu_disable
0.02% perf [kernel.kallsyms] [k] amd_pmu_v2_enable_all
- Booted VM with QEMU
vi vmc_script.sh    # paste the script below
#!/bin/bash
qemu-system-x86_64 \
-m 2048 \
-enable-kvm \
-cpu host \
-smp 2 \
-hda /vms/images/velinux_chaithu.qcow2 \
-cdrom /vms/iso/velinux-2.1-amd64-DVD-1.iso \
-boot d \
-netdev user,id=net0,hostfwd=tcp::2222-:22 \
-device virtio-net-pci,netdev=net0 \
-vnc :0
Make the script executable and run it to create the VM:
chmod +x vmc_script.sh
./vmc_script.sh
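Once a guest OS is installed and running, the hostfwd rule in the script forwards host port 2222 to guest port 22, so the guest can be reached with ssh on port 2222 of localhost. A minimal sketch for checking, inside the guest, whether the new CPU features are visible is given here. The helper name `has_cpu_flag` is hypothetical, and the flag names perfmon_v2 and autoibrs are assumptions about how /proc/cpuinfo labels PerfMonV2 and Automatic IBRS on this kernel.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named flag appears as a whole word on
# a "flags" line of a cpuinfo-style file (defaults to /proc/cpuinfo).
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    grep -E '^flags[[:space:]]*:' "$file" 2>/dev/null | grep -qw -- "$flag"
}

# Assumed flag names; adjust if this kernel labels the features differently.
for flag in perfmon_v2 autoibrs; do
    if has_cpu_flag "$flag"; then
        echo "$flag: present"
    else
        echo "$flag: absent"
    fi
done
```

Running the same check on the host gives a baseline: with -cpu host, a flag that is present on the host but absent in the guest points at the CPUID handling rather than at the hardware.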