test: gather baselines for serialize bench
Reason for This PR
Fixes #2967
Description of Changes
Previously we only had baselines for Intel. We can now run this test on all supported CPUs.
- [ ] This functionality can be added in `rust-vmm`.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
PR Checklist
[Author TODO: Meet these criteria.]
[Reviewer TODO: Verify that these criteria are met. Request changes if not]
- [ ] All commits in this PR are signed (`git commit -s`).
- [ ] The issue which led to this PR has a clear conclusion.
- [ ] This PR follows the solution outlined in the related issue.
- [ ] The description of changes is clear and encompassing.
- [ ] Any required documentation changes (code and docs) are included in this PR.
- [ ] Any newly added `unsafe` code is properly documented.
- [ ] Any API changes follow the Runbook for Firecracker API changes.
- [ ] Any user-facing changes are mentioned in `CHANGELOG.md`.
- [ ] All added/changed functionality is tested.
Change looks good to me. Interesting that we have had performance targets configured but not run (automatically) up until now.
Indeed it is a little confusing. I did a git blame on the file and uncovered that this is what the original version of the file looked like: https://github.com/firecracker-microvm/firecracker/pull/2342/files. Looks like a miss really since we already had ARM and AMD support.
Thanks @mattschlebusch and @bchalios for the review! As you can see in the failing test, the AMD deserialize baseline needs some adjustments. I did some digging to see why this happened. I reran the benchmarks on the same kernel image (4.14.268) on which I gathered them initially, and the results still match what I gathered. Then I updated the host to the latest AL2 kernel available for m6a metal (4.14.276) and the results changed, but only for the deserialize benchmark: I am seeing an improvement there from 0.05 to 0.0356. What I need to do now is test this on 5.10 as well and see what the behavior is there. We might need different baselines based on kernel version. Will come back.
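To make the kernel-dependent baseline idea concrete, here is a minimal sketch of how baselines keyed by CPU vendor and host kernel version might be checked. It is not the layout the Firecracker test suite actually uses: the `BASELINES` table, the function names, and the 10% tolerance are all hypothetical. Only the two deserialize values (0.05 and 0.0356) and the kernel versions come from the comment above, and it assumes 0.05 belongs to 4.14.268 and 0.0356 to 4.14.276.

```python
# Hypothetical baseline table keyed by (CPU vendor, host kernel version).
# Only the deserialize values and kernel versions are taken from the thread;
# the structure and names are assumptions made for illustration.
BASELINES = {
    ("amd", "4.14.268"): {"deserialize": 0.05},
    ("amd", "4.14.276"): {"deserialize": 0.0356},
}


def relative_change(baseline, measured):
    """Return the signed relative difference of measured vs. baseline."""
    return (measured - baseline) / baseline


def within_baseline(cpu, kernel, bench, measured, tolerance=0.1):
    """Check a measurement against the baseline for this CPU and kernel.

    The default tolerance of 10% is an assumed value, not the project's.
    """
    target = BASELINES[(cpu, kernel)][bench]
    return abs(relative_change(target, measured)) <= tolerance
```

With these numbers, 0.0356 is roughly 28.8% below 0.05, so a result gathered on the newer kernel would fall outside a 10% band around the older kernel's baseline. That is one way a single per-CPU baseline could fail after a host kernel update.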
@bchalios @mattschlebusch I regathered the baselines and tested them against multiple runs. All seems good now, can you please review again? Thanks