
QA-Board for performance engineering

Open arthur-flam opened this issue 5 years ago • 1 comments

Right now QA-Board focuses on algorithm engineering. Another big area is software performance.

How do people track software performance?

Unit tests are not enough to judge software performance. Some organizations:

  • track their test suite runtime over time. It helps get a trend, but comparisons are hard because the tests keep changing.
  • use acceptance tests that check runtime/memory thresholds, and monitor regressions.
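
As an illustration of the second approach, here is a minimal sketch of an acceptance check with runtime and memory budgets. It is not part of QA-Board: `measure`, `check_budget` and the `build_table` workload are hypothetical names, and `tracemalloc` only sees allocations made through Python's allocator, so native code would need another tool.

```python
import time
import tracemalloc


def measure(fn, *args, repeats=5):
    """Return (best wall-clock seconds, peak traced bytes) for fn(*args)."""
    # Time without tracing, since tracemalloc itself slows the code down.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)

    # One extra traced run to get the peak of Python-level allocations.
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return best, peak


def check_budget(fn, *args, max_seconds, max_bytes):
    """Return a list of human-readable budget violations (empty if none)."""
    seconds, peak = measure(fn, *args)
    failures = []
    if seconds > max_seconds:
        failures.append(f"runtime {seconds:.4f}s exceeds budget {max_seconds}s")
    if peak > max_bytes:
        failures.append(f"peak memory {peak} B exceeds budget {max_bytes} B")
    return failures


def build_table(n):
    """Hypothetical workload standing in for the code under test."""
    return {i: str(i) for i in range(n)}
```

A test would then assert that `check_budget(build_table, 1000, max_seconds=..., max_bytes=...)` returns an empty list. Taking the best of several runs reduces noise, but fixed thresholds still depend on the machine, which is one reason tracking runs over time is attractive.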

On the ops side, if we're talking about applications/services:

  • there are many great products: monitoring like Datadog/New Relic, crash analytics like Sentry...
  • smart monitoring solutions correlate anomalies with commits and feature flags.
  • the "future" is likely tooling based on canary deploys to identify perf regressions on real workflows.

For libraries or products used as dependencies by others, it's not possible to set up those tools. Could QA-Board help "shift left" and identify issues before releases?

Development workflows for performance engineering

  • Engineers doing optimization have a hard time keeping track of all their versions and microbenchmarks. The tooling focuses on the live experience (debugger-like tools, inspecting the assembly) and investigates one version at a time.
  • To keep track, the best tool I've seen to identify issues ahead of time and help during coding is https://perf.rust-lang.org

Software engineers have the same need for "run tracking" as algorithm engineers.

Features needed

  • [x] Examples of integrations with tools such as perf.
  • Visualizations:
  • [ ] Examples of visualizations of metrics like binary size, IPC, time, page faults, gas...
  • [ ] We could add anomaly detection on top to warn about regressions early.
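
To make the anomaly detection idea concrete, here is one possible sketch, assuming a metric where lower is better (runtime, binary size, page faults) and a history of values from earlier runs. It uses the median and the median absolute deviation (MAD) as a robust baseline. The function name, thresholds and minimum history length are all assumptions, not an existing QA-Board feature.

```python
from statistics import median


def is_regression(history, latest, threshold=3.0, min_rel_change=0.02):
    """Flag `latest` when it is clearly above the baseline of past runs.

    history: metric values from previous runs (lower is better).
    threshold: how many robust standard deviations count as an anomaly.
    min_rel_change: ignore changes smaller than this fraction of the baseline.
    """
    if len(history) < 5:
        # Too few points to estimate the noise; do not warn.
        return False
    baseline = median(history)
    mad = median(abs(x - baseline) for x in history)
    # 1.4826 * MAD estimates the standard deviation for normally distributed noise.
    spread = 1.4826 * mad
    delta = latest - baseline
    # Require the change to exceed both the noise band and a relative floor,
    # so a perfectly flat history does not flag tiny fluctuations.
    return delta > max(threshold * spread, min_rel_change * abs(baseline))
```

For example, with a history of `[100, 101, 99, 100, 102, 100, 98]`, a new value of 101 stays within the noise, while 110 is flagged. Improvements (values below the baseline) are never flagged by this one-sided check.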

Reference: perf/profiling tools

arthur-flam Jun 09 '20 18:06

We love Brendan Gregg's flame graphs and integrated Martin Spier's d3-flame-graph.

At a glance, you can check where your code spends its CPU cycles, and use differential flame graphs to debug regressions: https://samsung.github.io/qaboard/docs/visualizations/#flame-graphs
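
The idea behind a differential flame graph can be sketched in a few lines, assuming both profiles are available in the "folded" text format produced by the stackcollapse scripts (one `frame;frame;frame count` entry per line). This is only an illustration of the comparison step, not how QA-Board or d3-flame-graph implement it; `parse_folded` and `diff_folded` are hypothetical helpers.

```python
from collections import Counter


def parse_folded(text):
    """Parse folded stacks ("a;b;c 42" per line) into {stack: sample count}."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The count is the last space-separated field; frames may contain spaces.
        stack, _, count = line.rpartition(" ")
        counts[stack] += int(count)
    return counts


def diff_folded(before, after):
    """Return (stack, delta) pairs, largest absolute change first."""
    a, b = parse_folded(before), parse_folded(after)
    deltas = {stack: b[stack] - a[stack] for stack in a.keys() | b.keys()}
    changed = [(s, d) for s, d in deltas.items() if d != 0]
    return sorted(changed, key=lambda item: (-abs(item[1]), item[0]))
```

Stacks with a positive delta gained samples in the newer profile and are the first candidates when hunting a regression. Raw sample counts are only comparable when both profiles were recorded with similar durations and sampling rates.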

For now we keep the issue open; we may turn it into a thread or "project".

arthur-flam Jun 18 '20 18:06