
[Do not merge] Add Jetpack performance testing CI infrastructure

Open LiamSarsfield opened this issue 1 month ago • 1 comment

Addresses HOG-438: Create Jetpack Performance Tooling for LCP

Proposed changes:

  • Add performance testing infrastructure under tools/performance/ to measure wp-admin dashboard LCP (Largest Contentful Paint) with Jetpack connected
  • Uses Docker for isolated WordPress environment with simulated WordPress.com connection (fake tokens + mocked API with 200ms latency)
  • Includes CPU throttling calibration for consistent results across different machines
  • Posts metrics to CodeVitals for tracking over time

Other information:

  • [ ] Have you written new tests for your changes, if applicable?
  • [ ] Have you checked the E2E test CI results, and verified that your changes do not break them?
  • [ ] Have you tested your changes on WordPress.com, if applicable (if so, you'll see a generated comment below with a script to run)?

Jetpack product discussion

N/A - internal tooling

Does this pull request change what data or activity we track or use?

No

Testing instructions:

Prerequisites: Docker running, Node 18+, Jetpack built (pnpm jetpack build plugins/jetpack)

cd tools/performance
pnpm install
pnpm exec playwright install chromium
pnpm calibrate
pnpm test --skip-codevitals

Expected output: LCP measurement for wp-admin dashboard with Jetpack connected (simulated)
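Conceptually, the calibrate step amounts to timing a fixed CPU-bound workload and deriving a throttle rate relative to a reference machine; that rate can then be passed to Chrome DevTools Protocol's `Emulation.setCPUThrottlingRate`. A rough sketch (the reference constant, workload, and function names are assumptions, not the actual tool):

```javascript
// Sketch of CPU throttling calibration: time a fixed CPU-bound workload,
// then derive a throttle rate so throttled performance is comparable
// across machines. REFERENCE_MS is an assumed baseline, not a real value.
const REFERENCE_MS = 200; // assumed workload duration on the reference machine

function runWorkload() {
  const start = process.hrtime.bigint();
  let acc = 0;
  for (let i = 0; i < 5_000_000; i++) {
    acc += Math.sqrt(i);
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return { acc, elapsedMs: Number(elapsedNs) / 1e6 };
}

// A faster machine (smaller elapsedMs) gets a higher throttle rate.
// Clamp to 1x so slow machines are never sped up.
function throttleRateFor(elapsedMs) {
  return Math.max(1, REFERENCE_MS / elapsedMs);
}
```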

LiamSarsfield avatar Dec 12 '25 14:12 LiamSarsfield

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • :white_check_mark: Include a description of your PR changes.
  • :white_check_mark: Add a "[Status]" label (In Progress, Needs Review, ...).
  • :white_check_mark: Add a "[Type]" label (Bug, Enhancement, Janitorial, Task).
  • :white_check_mark: Add testing instructions.
  • :white_check_mark: Specify whether this PR includes any changes to data or privacy.
  • :white_check_mark: Add changelog entries to affected projects.

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for your cooperation :robot:


Follow this PR Review Process:

  1. Ensure all required checks appearing at the bottom of this PR are passing.
  2. Make sure to test your changes on all platforms that they apply to. You're responsible for the quality of the code you ship.
  3. You can use GitHub's Reviewers functionality to request a review.
  4. When it's reviewed and merged, you will be pinged in Slack to deploy the changes to WordPress.com Simple once the build is done.

If you have questions about anything, reach out in #jetpack-developers for guidance!

github-actions[bot] avatar Dec 12 '25 14:12 github-actions[bot]

Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.

  • To test on WoA, go to the Plugins menu on a WoA dev site. Click on the "Upload" button and follow the upgrade flow to be able to upload, install, and activate the Jetpack Beta plugin. Once the plugin is active, go to Jetpack > Jetpack Beta, select your plugin (Jetpack or WordPress.com Site Helper), and enable the add/perf-testing-ci-mvp branch.
  • To test on Simple, run the following command on your sandbox:
bin/jetpack-downloader test jetpack add/perf-testing-ci-mvp
bin/jetpack-downloader test jetpack-mu-wpcom-plugin add/perf-testing-ci-mvp

Interested in more tips and information?

  • In your local development environment, use the jetpack rsync command to sync your changes to a WoA dev blog.
  • Read more about our development workflow here: PCYsg-eg0-p2
  • Figure out when your changes will be shipped to customers here: PCYsg-eg5-p2

github-actions[bot] avatar Dec 14 '25 19:12 github-actions[bot]

Code Coverage Summary

This PR did not change code coverage!

That could be good or bad, depending on the situation. Everything covered before, and still is? Great! Nothing was covered before? Not so great. 🤷

Full summary · PHP report · JS report

jp-launch-control[bot] avatar Dec 14 '25 20:12 jp-launch-control[bot]

Hey @anomiex 👋 would you mind taking a look at this when you get a chance?

This is the performance testing infrastructure I built during HACK Week (more details here: pc9hqz-3Rb-p2). It measures wp-admin dashboard LCP for Jetpack and posts results to CodeVitals. It's a big PR, sorry about that. Most of it breaks down into:

  • scripts/ - Node.js orchestration and Playwright measurement code
  • docker/ - Docker Compose setup for WordPress + Jetpack environment
  • docker/mu-plugins/simulate-wpcom-connection.php - mu-plugin that fakes Jetpack connection and mocks WP.com API responses

The mu-plugin is probably the most relevant bit to review from a Jetpack perspective, as it intercepts pre_http_request to return mock responses for various WP.com endpoints.
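The interception logic in the actual mu-plugin is PHP hooked into pre_http_request; the shape of the decision it makes, sketched in JavaScript for illustration (the *.wordpress.com / *.wp.com host patterns are from this thread, everything else is assumed):

```javascript
// JavaScript stand-in for the PHP mu-plugin's interception decision:
// only requests bound for WordPress.com hosts are mocked, and any
// unhandled endpoint is recorded so it can be mocked later.
// Function names and response shapes are illustrative assumptions.
const INTERCEPTED_HOST = /(^|\.)(wordpress\.com|wp\.com)$/;

const unhandled = [];

function shouldIntercept(url) {
  return INTERCEPTED_HOST.test(new URL(url).hostname);
}

function interceptRequest(url, mocks) {
  if (!shouldIntercept(url)) {
    return null; // let the request go through unmodified
  }
  const { pathname } = new URL(url);
  if (pathname in mocks) {
    return { status: 200, body: mocks[pathname] };
  }
  // Fallback: record the miss so new endpoints surface during test runs.
  unhandled.push(url);
  return { status: 200, body: {} };
}
```

Returning `null` for non-WP.com hosts mirrors how returning false from pre_http_request lets WordPress perform the real request.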

Happy to walk through any of it if that's easier.

LiamSarsfield avatar Dec 15 '25 13:12 LiamSarsfield

I'm not sure where it came from; I was shown a notice that macOS's rsync doesn't handle symlinks well, stating I should brew install rsync. It let me proceed, but it failed.

Probably from https://github.com/Automattic/jetpack/blob/55e48ba3aaba88fba0f78577accbeb0cae1c5d85/tools/cli/commands/rsync.js#L45-L93

anomiex avatar Dec 16 '25 21:12 anomiex

Probably from

Yup! Thanks! I scanned the PR but didn't check the existing commands. I wouldn't consider it a blocker for the PR, but it is required to use the fully-leaded version of rsync, not the Mac variant.

kraftbj avatar Dec 16 '25 22:12 kraftbj

Yup! Thanks! I scanned the PR but didn't check the existing commands. I wouldn't consider it a blocker for the PR, but it is required to use the fully-leaded version of rsync, not the Mac variant.

~~Ah, nice catch. I must have already installed rsync previously, hence why I missed it. I've updated the testing instructions accordingly.~~

Actually, this is no longer relevant - I've switched to using jetpack-production (the pre-built mirror) instead of building from the monorepo, so rsync is no longer needed at all. Updated the testing instructions to reflect the simpler prerequisites (just Docker and Node 18+).

LiamSarsfield avatar Dec 18 '25 15:12 LiamSarsfield

Hey @Automattic/jetpack-vulcan 👋

I've been working on performance testing infrastructure for Jetpack (measuring wp-admin LCP with Jetpack connected). Part of this involves an mu-plugin that simulates a WordPress.com connection without actually connecting to WP.com. Could someone from the team review the mock implementation at tools/performance/docker/mu-plugins/simulate-wpcom-connection.php?

It currently:

  • Sets up fake blog/user tokens via Jetpack_Options
  • Intercepts HTTP requests to *.wordpress.com and *.wp.com
  • Returns mock responses for common endpoints (token health, site info, stats, sync, etc.)
  • Adds configurable latency (default 200ms) to simulate real-world conditions

Specifically looking for feedback on:

  1. Are the fake connection tokens set up correctly?
  2. Are we missing any critical API endpoints that get called on wp-admin load?
  3. Any concerns with this approach for performance testing?

LiamSarsfield avatar Dec 18 '25 17:12 LiamSarsfield

Hi @LiamSarsfield ,

Thanks for the ping and working on this!

Are the fake connection tokens set up correctly?

Yes 👍

Are we missing any critical API endpoints that get called on wp-admin load?

This would depend on the Jetpack plugin module configuration. I can confirm the endpoints called by the Connection and Sync packages, but we can't know how each consumer of the Jetpack Connection behaves. I noticed that you only enable modules that don't require a JP connection, but I wonder if we should enable all of them to get a realistic worst-case scenario. One way to check the remote calls to WPCOM on every page load would be to sandbox your environment and monitor debug.log, as we log all sandboxed requests to WPCOM there.

That said, it might make sense to add some logging that would answer this question within the performance testing logic itself? This could be helpful in case more endpoints that we don't handle within the testing infrastructure are added in the future.

Any concerns with this approach for performance testing?

This is not a blocker, but you could consider refactoring get_mock_response to avoid the if/else logic and use e.g. a factory for setting up the fake endpoints. One additional idea I had is around the latency. At the moment, my understanding is that we assume a generic latency for each endpoint. If we extracted each fake endpoint definition into its own class, we could potentially set the latency per endpoint using actual real-world data we have on WPCOM. As an example, the jetpack-sync-actions endpoint has a ~470ms median response time and ~3s for the 95th percentile. We could take it even one step further and define a median and p95 latency per endpoint, and repeat our tests for both cases to simulate a normal scenario and a site under stress.
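A per-endpoint latency registry along these lines could be sketched roughly as follows (the ~470ms/~3s figures for jetpack-sync-actions are from this comment; the registry API and default values are assumptions):

```javascript
// Sketch of a mock-endpoint registry with per-endpoint latency profiles,
// so tests can run in a "median" mode and a "p95" (site-under-stress) mode.
// The registry shape is an illustrative assumption; the sync latency
// numbers come from real WPCOM data cited in the review.
const DEFAULT_LATENCY = { median: 200, p95: 200 };

const registry = new Map();

function registerEndpoint(path, body, latency = DEFAULT_LATENCY) {
  registry.set(path, { body, latency });
}

// Resolve a path to its mock body plus the delay for the chosen mode.
function resolveEndpoint(path, mode = 'median') {
  const entry = registry.get(path);
  if (!entry) return null;
  return { body: entry.body, delayMs: entry.latency[mode] };
}

registerEndpoint('/rest/v1.1/sites/1', { ID: 1 });
registerEndpoint(
  '/rest/v1.1/jetpack-sync-actions',
  { processed: true },
  { median: 470, p95: 3000 }
);
```

Running the same measurement once per mode would then give both a normal and a stressed baseline without duplicating endpoint definitions.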

fgiannar avatar Dec 19 '25 07:12 fgiannar

@fgiannar Thanks for the thorough review!

I noticed that you only enable modules that don't require a JP Connection but I wonder if we should enable all of them to get a realistic worst case scenario.

Good point. I initially took the conservative approach to avoid errors from modules expecting real WP.com responses, but enabling all modules would give us a more realistic measurement. I'll look into expanding the module list and adding mock responses for any additional endpoints they require.

It might make sense to add some logging that would answer this question within the performance testing logic itself?

Great idea. I added logging for any intercepted requests that hit the fallback response. That way it'll catch any unhandled endpoints as they appear, instead of discovering them later.

You could consider refactoring get_mock_response to avoid the if/else logic and use e.g. a factory for setting up the fake endpoints.

Agreed that the current implementation is a bit unwieldy. I'll refactor to a registry/factory pattern, which will also make it easier to add per-endpoint configuration.

We could take it even one step further and define a median and p95 latency per endpoint

Love this idea. Using actual latency data would make the measurements much more representative. I created a follow-up issue to:

  1. Extract endpoint definitions to a registry with configurable latency
  2. Add real-world latency values from WP.com metrics
  3. Consider test modes for median vs p95 scenarios

For now I'll focus on the logging improvement as a quick win. Thanks again for the detailed feedback! 🙏

LiamSarsfield avatar Dec 19 '25 17:12 LiamSarsfield