pgstac

Include performance benchmarking as part of integration testing.

Open sharkinsspatial opened this issue 3 years ago • 6 comments

@philvarner has released https://github.com/stac-utils/stac-api-benchmark to allow benchmarking query consistency and performance across STAC API implementations. New PRs should probably run these benchmarks as part of the integration testing pipeline and compare results against previous branches to identify consistency or performance regressions.

sharkinsspatial avatar May 23 '22 15:05 sharkinsspatial

I'll add a quiet mode today that only outputs a final JSON results file.

Also, the "random queries" test requires (I think?) three queryables that each take a value of 0-100, and I need to make this a bit more flexible.

philvarner avatar May 23 '22 16:05 philvarner

Does this look reasonable for our output format? (The numbers are seconds of runtime; they're really low b/c I only ran a few queries for each benchmark.)

{
  "step": 0.4397075420129113,
  "tnc": 0.4970361669547856,
  "countries_apr_2019": 0.3382589580141939,
  "countries_cloud_cover_asc": 0.39972841599956155,
  "random_queries": 47.06457624997711,
  "repeated": 17.237872541998513,
  "sort_cloud_cover_desc": [
    {
      "sort": "sentinel-2-l2a_properties.eo:cloud_cover_desc",
      "duration": 0.38575045799370855
    }
  ],
  "sort_cloud_cover_asc": [
    {
      "sort": "sentinel-2-l2a_properties.eo:cloud_cover_asc",
      "duration": 0.3100657499744557
    }
  ],
  "sort_datetime_desc": [
    {
      "sort": "sentinel-2-l2a_properties.datetime_desc",
      "duration": 0.054874249966815114
    }
  ],
  "sort_datetime_asc": [
    {
      "sort": "sentinel-2-l2a_properties.datetime_asc",
      "duration": 0.34525220800423995
    }
  ],
  "sort_created_desc": [
    {
      "sort": "sentinel-2-l2a_properties.created_desc",
      "duration": 0.15877308399649337
    }
  ],
  "sort_created_asc": [
    {
      "sort": "sentinel-2-l2a_properties.created_asc",
      "duration": 0.05182845803210512
    }
  ]
}
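
For the comparison against previous branches mentioned at the top of the thread, a results file in this shape could be diffed roughly as follows. This is only a sketch: `flatten`, `find_regressions`, and the 20% `tolerance` are hypothetical names and values, not part of stac-api-benchmark, and with only a few queries per benchmark the timings are probably too noisy for a threshold that tight.

```python
import json
from typing import Any


def flatten(results: dict[str, Any]) -> dict[str, float]:
    """Map every benchmark in a results document to one duration in seconds."""
    flat: dict[str, float] = {}
    for name, value in results.items():
        if isinstance(value, list):
            # Sort benchmarks are lists of {"sort": ..., "duration": ...} entries.
            for entry in value:
                flat[f"{name}/{entry['sort']}"] = float(entry["duration"])
        else:
            # Every other benchmark is a single number of seconds.
            flat[name] = float(value)
    return flat


def find_regressions(
    baseline: dict[str, Any],
    candidate: dict[str, Any],
    tolerance: float = 0.2,  # assumed threshold: flag anything more than 20% slower
) -> dict[str, tuple[float, float]]:
    """Return {benchmark: (old, new)} for benchmarks that slowed past the tolerance."""
    base, cand = flatten(baseline), flatten(candidate)
    regressions: dict[str, tuple[float, float]] = {}
    for name, old in base.items():
        new = cand.get(name)  # benchmarks missing from the candidate are skipped
        if new is not None and new > old * (1 + tolerance):
            regressions[name] = (old, new)
    return regressions


if __name__ == "__main__":
    # Made-up numbers in the same shape as the sample output above.
    baseline = json.loads(
        '{"step": 0.44, "sort_datetime_desc": '
        '[{"sort": "sentinel-2-l2a_properties.datetime_desc", "duration": 0.05}]}'
    )
    candidate = json.loads(
        '{"step": 0.45, "sort_datetime_desc": '
        '[{"sort": "sentinel-2-l2a_properties.datetime_desc", "duration": 0.09}]}'
    )
    for name, (old, new) in find_regressions(baseline, candidate).items():
        print(f"{name}: {old:.3f}s -> {new:.3f}s")
```

A CI job could exit non-zero whenever `find_regressions` returns a non-empty dict.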

philvarner avatar May 23 '22 18:05 philvarner

👍 @philvarner

sharkinsspatial avatar May 23 '22 19:05 sharkinsspatial

Merged to main. The best way to run it is probably to set --verbosity ERROR so that the usual log output doesn't interfere with the results.

philvarner avatar May 23 '22 20:05 philvarner

+1 on performance testing!

> https://github.com/stac-utils/stac-api-benchmark to allow benchmarking query consistency and performance across STAC API implementations

To me it seems that this is a benchmark that should be run in each stac-fastapi backend, which as far as I understand might not always be up to date with pgstac.

I'll try to start something, maybe using pytest-benchmark to test the SQL methods, and then maybe use https://github.com/benchmark-action/github-action-benchmark to make sure we get a report.
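
If the stac-api-benchmark results were fed into the same report, they would need reshaping first. As far as I know, github-action-benchmark can read a JSON array of `name`/`unit`/`value` objects through its `customSmallerIsBetter` tool setting, but that should be checked against its README. The sketch below assumes that format; `to_action_entries` is a hypothetical helper, not something either project provides.

```python
import json
from typing import Any


def to_action_entries(results: dict[str, Any]) -> list[dict[str, Any]]:
    """Reshape a stac-api-benchmark results dict into name/unit/value entries."""
    entries: list[dict[str, Any]] = []
    for name, value in results.items():
        if isinstance(value, list):
            # Sort benchmarks: one entry per {"sort": ..., "duration": ...} item.
            for item in value:
                entries.append(
                    {"name": item["sort"], "unit": "seconds", "value": item["duration"]}
                )
        else:
            # Scalar benchmarks: the key is the name, the number is the duration.
            entries.append({"name": name, "unit": "seconds", "value": value})
    return entries


if __name__ == "__main__":
    # Made-up numbers in the shape of the results posted earlier in the thread.
    sample = {
        "step": 0.44,
        "sort_created_asc": [
            {"sort": "sentinel-2-l2a_properties.created_asc", "duration": 0.05}
        ],
    }
    print(json.dumps(to_action_entries(sample), indent=2))
```

Durations are in seconds, so lower is better, which is why the smaller-is-better variant is the one assumed here.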

vincentsarago avatar Jul 28 '22 09:07 vincentsarago

I've started a really quick demo over at https://github.com/vincentsarago/pgstac-benchmark

I wonder now, which features do we want to benchmark?

vincentsarago avatar Jul 28 '22 16:07 vincentsarago