Enable interprocedural optimization
without inlining:
kom@nucat:~/proj/h3/build$ bin/benchmarkH3Api
-- geoToH3: 1.187343 microseconds per iteration (10000 iterations)
-- h3ToGeo: 0.796665 microseconds per iteration (10000 iterations)
-- h3ToGeoBoundary: 2.219408 microseconds per iteration (10000 iterations)
with inlining:
kom@nucat:~/proj/h3/build$ bin/benchmarkH3Api
-- geoToH3: 0.447104 microseconds per iteration (10000 iterations)
-- h3ToGeo: 0.265080 microseconds per iteration (10000 iterations)
-- h3ToGeoBoundary: 1.341663 microseconds per iteration (10000 iterations)
Addresses #325.
Thanks for the PR! Could you check how these flags compare to configuring CMake with -DCMAKE_BUILD_TYPE=Release?
It might be necessary to place these compile flags in the if(NOT WIN32) block starting on the next line if there are incompatabilities with MSVC. Would it make sense to enable these flags only for Release builds?
master on -DCMAKE_BUILD_TYPE=Release:
kom@nucat:~/proj/h3/build$ bin/benchmarkH3Api
-- geoToH3: 0.517386 microseconds per iteration (10000 iterations)
-- h3ToGeo: 0.295042 microseconds per iteration (10000 iterations)
-- h3ToGeoBoundary: 1.496694 microseconds per iteration (10000 iterations)
with this PR:
kom@nucat:~/proj/h3/build$ bin/benchmarkH3Api
-- geoToH3: 0.432304 microseconds per iteration (10000 iterations)
-- h3ToGeo: 0.257581 microseconds per iteration (10000 iterations)
-- h3ToGeoBoundary: 1.376957 microseconds per iteration (10000 iterations)
Things get inlined this way, but with this PR it's somehow even faster. Seems -DCMAKE_BUILD_TYPE=Release is a secret black magic that needs to be at least documented, and defaulted if possible. At least https://github.com/bytesandbrains/h3-pg/issues/23 got hit by it, and likely many others.
Hi @Komzpa - do you plan to debug the test failures in this diff?
@nrabinowitz feel free to take over it. I currently can't allocate a large chunk of attention to give an estimate when I'll look at it next.