tp-libvirt icon indicating copy to clipboard operation
tp-libvirt copied to clipboard

PCI PT Hotplug/Hotunplug fixes for arch ppc64

Open TasmiyaNalatwad opened this issue 1 year ago β€’ 16 comments

Added 2 new test cases, where hotplog unplug is performed for multiple times and a random reboot is done between hotplug and unplug and the device availability is checked.

Below are the new cases added (7/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_hotunplug.PCI: STARTED (7/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_hotunplug.PCI: PASS (268.99 s) (8/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_unplug_with_rand_reboot.PCI: STARTED (8/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_unplug_with_rand_reboot.PCI: PASS (269.73 s)

Summary by CodeRabbit

  • New Features

    • Multi-cycle PCI hotplug/unplug with optional randomized reboots during stress runs.
    • Dynamic PCI controller allocation to avoid index conflicts.
    • Distro- and architecture-aware setup and session-based ping with timeout.
  • Bug Fixes

    • Improved detach/reattach reliability across architectures and libvirt versions.
    • More robust device queries and configuration persistence.
  • Tests

    • Added two variants covering multi-cycle and random-reboot scenarios.

TasmiyaNalatwad avatar Dec 17 '24 12:12 TasmiyaNalatwad

JOB ID : cd7960bd139abe0e6cd9b397722a5a75833f644b JOB LOG : /root/tests/results/job-2024-12-26T00.52-cd7960b/job.log (1/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.flood_ping.PCI: STARTED (1/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.flood_ping.PCI: PASS (108.67 s) (2/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.suspend.PCI: STARTED (2/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.suspend.PCI: PASS (110.53 s) (3/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.reboot.PCI: STARTED (3/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.reboot.PCI: PASS (133.24 s) (4/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dumpxml.PCI: STARTED (4/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dumpxml.PCI: PASS (108.58 s) (5/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.stress_test.PCI: STARTED (5/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.stress_test.PCI: PASS (108.46 s) (6/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dump.PCI: STARTED (6/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dump.PCI: FAIL: Failed to virsh dump of domain avocado-vt-vm1 (52.22 s) (7/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_hotunplug.PCI: STARTED (7/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_hotunplug.PCI: PASS (268.99 s) (8/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_unplug_with_rand_reboot.PCI: STARTED (8/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_unplug_with_rand_reboot.PCI: PASS (269.73 s) RESULTS : PASS 7 | ERROR 0 | FAIL 1 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0 JOB HTML : /root/tests/results/job-2024-12-26T00.52-cd7960b/results.html JOB TIME : 1279.13 s

TasmiyaNalatwad avatar Dec 26 '24 08:12 TasmiyaNalatwad

There is 1 Testcase failing here that is virdhdump test on Pci Passthrough. The test case needs more code work and enhancement to make it work on ppc64 arch. I will work on virshdump test separately and send a separate patch for that. This patch is for all the other hotplug tests working and passed.

(6/8) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dump.PCI: FAIL: Failed to virsh dump of domain avocado-vt-vm1 (52.22 s)

Thank you, Tasmiya

TasmiyaNalatwad avatar Apr 09 '25 03:04 TasmiyaNalatwad

@chloerh Would you please review the changes and if everything looks good can we proceed with the patch merge?

Thanks a lot

TasmiyaNalatwad avatar Aug 19 '25 09:08 TasmiyaNalatwad

@chloerh @chunfuwen @dzhengfy @luckyh Would you please review the changes and if everything looks good can we proceed with the patch merge?

Thanks a lot

TasmiyaNalatwad avatar Sep 03 '25 04:09 TasmiyaNalatwad

@chloerh @chunfuwen @dzhengfy @luckyh Would you please review the changes and if everything looks good can we proceed with the patch merge?

I have got 2 approvals. Please let me know anything needs to be taken care in this

Thanks a lot

TasmiyaNalatwad avatar Sep 09 '25 08:09 TasmiyaNalatwad

Walkthrough

Adds two PCI passthrough hotplug test variants to the configuration and extends the hotplug test script with dynamic PCI controller indexing, distro-aware package selection, arch- and libvirt-version-aware attach/detach behavior, persistent guest session ping, multi-cycle hotplug/unplug sequences, optional randomized reboots during stress loops, and updated cleanup that persists VM XML and reattaches devices when required.

Changes

Cohort / File(s) Summary
Config: new variants
libvirt/tests/cfg/passthrough/pci/libvirt_pci_passthrough_hotplug.cfg
Adds two variants: multiple_hotplug_hotunplug (multiple_hotplug_hotunplug = "yes", number_of_hotplug_unplug = 25) and multiple_hotplug_unplug_with_rand_reboot (random_reboot = "yes", number_of_hotplug_unplug = 30). No deletions.
Test logic: hotplug flow, arch/version handling, utilities
libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py
Adds platform and random imports; computes dynamic PCI controller index and applies controller type/model; adds distro-aware package selection for host tooling; replaces map(None, ...) usage with zip(...); introduces libvirt-version and architecture-specific hostdev detach/reattach logic; switches guest checks to utils_test.ping using persistent sessions and timeout; implements rand_reboot behavior and optional multi-cycle hotplug/hotunplug with interleaved random reboots; updates teardown to persist VM XML and reattach devices when necessary.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant T as Test Runner
  participant HV as Host/Libvirt
  participant VM as Guest

  rect rgb(230,240,255)
    note over T,HV: Init β€” platform, distro, libvirt checks
    T->>HV: Query existing PCI controllers
    HV-->>T: Controller list
    T->>T: Compute next_index, set controller type/model
  end

  rect rgb(240,255,240)
    loop Hotplug/Unplug cycles (N)
      T->>HV: Hotplug PCI hostdev(s)
      T->>VM: Open persistent session
      T->>VM: utils_test.ping(session, timeout)
      alt Detach required (arch/version)
        T->>HV: Detach PCI hostdev(s)
      else
        note right of T: Remain attached
      end
      T->>HV: Unplug if configured
    end
  end

  opt Random reboot enabled
    loop Extra cycles with possible random reboot
      T->>HV: Attach/Detach cycle
      T->>VM: rand_reboot(probabilistic)
      VM-->>T: Session re-established after reboot
    end
  end

  rect rgb(255,245,230)
    note over T,HV: Teardown
    T->>HV: Reattach devices if required
    T->>HV: Persist VM XML
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Pay attention to distro-specific package selection and conditional imports.
  • Review arch- and libvirt-version-dependent detach/reattach logic for correctness and coverage.
  • Verify persistent session usage with utils_test.ping and any session lifecycle handling.
  • Confirm the multi-cycle and randomized reboot logic does not introduce flakiness or resource leaks.

Poem

I twitch my whiskers at hotplug’s tune,
Slots click and sigh beneath the silicon moon.
Twenty-five hops, thirty with surprise,
Random reboots tumble from coded skies.
I hop, I test, I nibble a byte. πŸ‡βœ¨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
βœ… Passed checks (2 passed)
Check name Status Explanation
Description Check βœ… Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check βœ… Passed The title 'PCI PT Hotplug/Hotunplug fixes for arch ppc64' clearly describes the main changes: adding PCI passthrough hotplug/hotunplug test fixes for ppc64 architecture.
✨ Finishing touches
  • [ ] πŸ“ Generate docstrings
πŸ§ͺ Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

πŸ“œ Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

πŸ“₯ Commits

Reviewing files that changed from the base of the PR and between ca1f42388f57db6b5ab80efaf243623efef706ea and 2daa678e76961f28d06db4240efbc7669d88b6e8.

πŸ“’ Files selected for processing (2)
  • libvirt/tests/cfg/passthrough/pci/libvirt_pci_passthrough_hotplug.cfg (1 hunks)
  • libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py (11 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • libvirt/tests/cfg/passthrough/pci/libvirt_pci_passthrough_hotplug.cfg
🧰 Additional context used
🧠 Learnings (4)
πŸ“š Learning: 2025-11-18T01:20:50.873Z
Learnt from: qiankehan
Repo: autotest/tp-libvirt PR: 6654
File: libvirt/tests/src/virtual_network/qemu/boot_with_multiqueue.py:56-58
Timestamp: 2025-11-18T01:20:50.873Z
Learning: In the tp-libvirt project, when calling session command methods and the status code is not needed, prefer using `session.cmd_output()` over `session.cmd_status_output()` to avoid unused variables.

Applied to files:

  • libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py
πŸ“š Learning: 2025-09-24T08:01:27.899Z
Learnt from: hholoubk
Repo: autotest/tp-libvirt PR: 6579
File: libvirt/tests/src/sriov/vIOMMU/iommu_device_lifecycle.py:95-97
Timestamp: 2025-09-24T08:01:27.899Z
Learning: In the libvirt test framework used in tp-libvirt, VM cleanup including destroying running VMs is handled by the teardown method (test_obj.teardown_iommu_test()) called in the finally block, so explicit VM destroy calls on timeout are not necessary according to the maintainers.

Applied to files:

  • libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py
πŸ“š Learning: 2025-11-18T08:47:14.443Z
Learnt from: smitterl
Repo: autotest/tp-libvirt PR: 6072
File: libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py:239-241
Timestamp: 2025-11-18T08:47:14.443Z
Learning: In tp-libvirt tests, `utils_test.ping` is a valid function that exists in the avocado-vt framework (virttest.utils_test module). It is an alias to `utils_net.ping` and supports parameters including dest, count, interface, timeout, and session.

Applied to files:

  • libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py
πŸ“š Learning: 2025-11-03T13:30:16.458Z
Learnt from: smitterl
Repo: autotest/tp-libvirt PR: 6531
File: libvirt/tests/src/virtual_network/qemu/pktgen_burst_mode_test.py:64-67
Timestamp: 2025-11-03T13:30:16.458Z
Learning: In tp-libvirt tests, prefer using a list of cleanup lambda functions that are appended when actions requiring cleanup are performed, then executed in a finally block with individual error handling. Example: `cleanup_actions.append(lambda: undo_action())` followed by iterating through the list in reverse order in the finally block.

Applied to files:

  • libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py
🧬 Code graph analysis (1)
libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py (3)
libvirt/tests/src/virsh_cmd/domain/virsh_attach_device.py (1)
  • Controller (582-612)
libvirt/tests/src/passthrough/robustness/passthrough_robustness.py (1)
  • start (176-225)
libvirt/tests/src/virtual_network/iface_bridge.py (1)
  • ping (83-98)
πŸͺ› Ruff (0.14.5)
libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py

157-157: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


173-173: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


308-308: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python 3.12
  • GitHub Check: Python 3.9
  • GitHub Check: Python 3.11
  • GitHub Check: Python 3.8
πŸ”‡ Additional comments (5)
libvirt/tests/src/passthrough/pci/libvirt_pci_passthrough_hotplug.py (5)

308-308: Static analysis hint is a false positive.

The static analysis tool flags random.random() as unsuitable for cryptographic purposes (S311). This is a false positiveβ€”the random number generator is used for test randomization (probabilistic reboots), not for security-sensitive operations, so random.random() is appropriate here.

The random reboot logic was reviewed and approved by @smitterl in past comments.


330-344: LGTM! New test loops are well-structured.

The two new test variants are clearly separated and correctly implement the stated functionality:

  • multiple_hotplug_hotunplug: Performs repeated hotplug/hotunplug cycles without reboots
  • random_reboot: Performs hotplug/hotunplug cycles with probabilistic reboots between operations

The logging statements provide good visibility into test progress, and the random reboot logic (30% probability per iteration) is sound.


195-200: LGTM! Architecture-specific detach logic is correct.

The conditional device detach logic properly handles the different requirements for ppc64le (detach on newer libvirt) versus other architectures (detach on older libvirt). This was reviewed and approved in past comments by @smitterl.


103-117: LGTM! Dynamic controller indexing is well-implemented.

The logic correctly:

  1. Retrieves existing controllers of the specified type and model
  2. Extracts their index values and finds the maximum
  3. Calculates the next available index as max(indices) + 1
  4. Falls back to the configured index when no controllers exist

The get_controllers method was verified to exist in past review comments.


157-157: The review suggestion cannot be applied - strict=True requires Python 3.10+, but tp-libvirt supports Python 3.6-3.9.

The CI workflow tests Python versions 3.6, 3.7, 3.8, and 3.9. The strict=True parameter for zip() was introduced in Python 3.10, making this suggestion incompatible with the project's minimum supported Python version. Adding strict=True would break compatibility without increasing the minimum Python version requirement.

Likely an incorrect or invalid review comment.

[!TIP]

πŸ“ Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests β€” including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. πŸ“ Description β€” Summarize the main change in 50–60 words, explaining what was done.
  2. πŸ““ References β€” List relevant issues, discussions, documentation, or related PRs.
  3. πŸ“¦ Dependencies & Requirements β€” Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. πŸ“Š Contributor Summary β€” Include a Markdown table showing contributions: | Contributor | Lines Added | Lines Removed | Files Changed |
  5. βœ”οΈ Additional Notes β€” Add any extra reviewer context. Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Sep 17 '25 08:09 coderabbitai[bot]

@chloerh @chunfuwen @dzhengfy @luckyh May I request to review and merge this commit.

hariharants avatar Sep 22 '25 05:09 hariharants

@smitterl as a "weird-arch" expert, could you please take a look at this?

ldoktor avatar Sep 23 '25 07:09 ldoktor

@smitterl Thank a lot for the valuable review comments and your time. Apologies for late response. I have tried incorporating all the review comments as suggested. Requesting you to please have a look and let me know your inputs.

I am adding the test results after taking care of all the review comments. Please find the results below. For rand_reboot test it took 6 rem=boots randomly after fewer iterations.

python3 avocado-setup.py --run-suite guest_pci_passthrough --guest-os RHEL.8.devel.ppc64le --only-filter 'virtio_scsi virtio_net qcow2' --no-download
07:32:05 INFO    : Env Config: /home/tests/config/wrapper/env.conf
07:32:05 INFO    : No Run Config: /home/tests/config/wrapper/no_run_tests.conf
07:32:05 WARNING : Overriding user setting and enabling kvm bootstrap as guest tests are requested
07:32:05 INFO    : Check for environment
07:32:06 INFO    :
07:32:06 INFO    : Running Guest Tests Suite pci_passthrough
07:32:06 INFO    : Running: /usr/local/bin/avocado run --vt-type libvirt --vt-config /home/tests/data/avocado-vt/backends/libvirt/cfg/pci_passthrough.cfg                 --force-job-id 61ffc34bbcbd8324b97395a6d245d74423bc3db2                 --job-results-dir /home/tests/results  --vt-only-filter                                             "virtio_scsi virtio_net qcow2 RHEL.8.devel.ppc64le"
JOB ID     : 61ffc34bbcbd8324b97395a6d245d74423bc3db2
JOB LOG    : /home/tests/results/job-2025-10-07T07.32-61ffc34/job.log
 (1/9) io-github-autotest-qemu.unattended_install.import.import.default_install.aio_native: STARTED
 (1/9) io-github-autotest-qemu.unattended_install.import.import.default_install.aio_native: PASS (41.12 s)
 (2/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.flood_ping.PCI: STARTED
 (2/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.flood_ping.PCI: PASS (85.67 s)
 (3/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.suspend.PCI: STARTED
 (3/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.suspend.PCI: PASS (85.48 s)
 (4/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.reboot.PCI: STARTED
 (4/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.reboot.PCI: PASS (117.26 s)
 (5/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dumpxml.PCI: STARTED
 (5/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dumpxml.PCI: PASS (85.30 s)
 (6/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.stress_test.PCI: STARTED
 (6/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.stress_test.PCI: PASS (84.73 s)
 (7/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dump.PCI: STARTED
 (7/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.virsh_dump.PCI: PASS (90.69 s)
 (8/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_hotunplug.PCI: STARTED
 (8/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_hotunplug.PCI: PASS (1388.70 s)
 (9/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_unplug_with_rand_reboot.PCI: STARTED
 (9/9) type_specific.io-github-autotest-libvirt.libvirt_pci_passthrough_hotplug.multiple_hotplug_unplug_with_rand_reboot.PCI: PASS (1491.24 s)
RESULTS    : PASS 9 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB HTML   : /home/tests/results/job-2025-10-07T07.32-61ffc34/results.html
JOB TIME   : 3501.54 s
08:30:30 INFO    :
08:30:30 INFO    : Summary of test results can be found below:
TestSuite                                                                                TestRun    Summary

guest_pci_passthrough                                                                    Run        Successfully executed
/home/tests/results/job-2025-10-07T07.32-61ffc34/job.log
| PASS 9 || CANCEL 0 || ERRORS 0 || FAILURES 0 || SKIP 0 || WARN 0 || INTERRUPT 0 |

Test suites status:

TOTAL                1
RUN                  1
NOT_RUN              0
CANT_RUN             0

Final count summary for tests run:

TOTAL                9
PASS                 9
CANCEL               0
ERRORS               0
FAILURES             0
SKIP                 0
WARN                 0
INTERRUPT            0
08:30:30 INFO    : Removing temporary mux dir

Thank you -Tasmiya

TasmiyaNalatwad avatar Oct 07 '25 12:10 TasmiyaNalatwad

@smitterl Would you please review the PR and let me know the inputs. Thank you

TasmiyaNalatwad avatar Oct 09 '25 14:10 TasmiyaNalatwad

@smitterl Would you please review the changes made. Thank you!

TasmiyaNalatwad avatar Oct 13 '25 19:10 TasmiyaNalatwad

@smitterl All review comments have been addressed. Please let me know if any further changes are needed. Awaiting the next steps.

TasmiyaNalatwad avatar Oct 30 '25 09:10 TasmiyaNalatwad

@smitterl Thanks for reviewing the patch, @TasmiyaNalatwad addressed all your review comments. May I request you to review the changes. Thanks in advance.

harihare avatar Nov 06 '25 07:11 harihare

@smitterl Thanks a lot for your time and valuable review comments. I have addressed all the review comments mentioned.

Would you please approve and help me in merging this PR. Thanks in Advance.

TasmiyaNalatwad avatar Nov 18 '25 12:11 TasmiyaNalatwad

@smitterl As all the review comments are been addressed, Would you please approve and help me in merging this PR. Thanks in Advance.

TasmiyaNalatwad avatar Nov 20 '25 13:11 TasmiyaNalatwad