Phase 2: Optimize gh_covering Performance (2-3× Speedup)
Summary
This PR optimizes gh_covering by eliminating a redundant encode-decode cycle, achieving 2-3× speedup across typical use cases.
Problem
The previous implementation had an inefficient data flow:
Grid → gh_encode → gh_to_sf → gh_to_spdf → gh_to_sp → gh_decode → Build polygons
The encode-decode round-trip was wasteful since we only needed the decoded coordinates to build polygons.
Solution
Streamlined the flow to:
Grid → gh_encode → gh_decode → Build polygons (direct)
Now gh_covering decodes geohashes once and builds SpatialPolygons directly, skipping the intermediate conversions through gh_to_sf → gh_to_spdf → gh_to_sp.
Performance Results
Benchmark: 100 random points, 0.5° spread
| Precision | Old (ms) | New (ms) | Speedup |
|---|---|---|---|
| 4 | 10.9 | 5.0 | 2.17× |
| 5 | 15.9 | 6.6 | 2.41× |
| 6 | 180.5 | 57.8 | 3.13× |
| 7 | 5528.6 | 1802.5 | 3.07× |
Median speedup: ~2.7×
Higher precisions show larger absolute time savings (e.g., precision 7 saves ~3.7 seconds).
Testing
✅ All 48 existing tests pass
✅ No breaking changes to API or behavior
✅ Same output as before, just faster
✅ Benchmark scripts included in benchmarks/
Changes
-
R/gis_tools.R: Refactored
gh_coveringto build polygons directly from decoded coordinates - NEWS.md: Documented performance improvement
- benchmarks/: Added microbenchmark scripts for reproducibility
Verification
# Run benchmarks yourself
source("benchmarks/gh_covering_bench_simple.R")
# Run tests
devtools::test()
Next Steps
This is part of a series of optimization PRs:
- ✅ Phase 1: Quick fixes and code quality (#43)
- ✅ Phase 2: gh_covering optimization (this PR)
- 🔜 Phase 3: Duplicate detection optimization
- 🔜 Phase 4: CI/CD modernization (GitHub Actions)
🤖 Generated with Claude Code
A jj chain should be the easiest way to prevent headache from merge conflicts among this sequence of PRs as review suggestions are incorporated / each branch drifts from the state as filed:
https://github.com/jj-vcs/jj
It seems Claude can basically get the gist of how to use it too even though it's relatively new:
https://claude.ai/share/e53b881a-f456-44dc-ae0c-2d96936cf366