Use rdpmc() user-space function for reading Libpfm samples
Reading samples without invoking kernel-space code should be much faster. Thanks @deater for the info!
Yes - there are some caveats to watch out for; the PAPI folks had a talk at the ESPT workshop on how they implemented it. Might want to check out their paper.
On Fri, 17 Nov 2017, David Böhme wrote:
> Yes - there are some caveats to watch out for; the PAPI folks had a talk at the ESPT workshop on how they implemented it. Might want to check out their paper.
Sorry for the long delay in replying to this, but I've posted the ESPT'17 workshop paper (plus a related master's thesis) here: http://web.eece.maine.edu/~vweaver/projects/papi-rdpmc/
As for the code, the best reference is probably the code in PAPI itself, src/components/perf_event/perf_helpers.h in the recent PAPI 5.6 release.
The rdpmc code does give a nice speedup, and the speedup is even better if you're stuck running on a machine with the KPTI patches for the "Meltdown" vulnerability installed.
Vince
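For anyone who wants a feel for the read path before digging into PAPI's perf_helpers.h, here is a minimal sketch. It is not PAPI's code: it follows the read loop described in the comments of `linux/perf_event.h`, and the helper names (`rdpmc_raw`, `combine_count`, `read_user_count`) are made up for illustration. It assumes x86 with GCC or Clang, and it leaves out the caveats mentioned above (multiplexing, time scaling, events that are not currently scheduled on a hardware counter).

```c
#include <assert.h>
#include <stdint.h>
#include <linux/perf_event.h>

/* Compiler barrier so reads of the mmap page are not reordered around the seqlock. */
#define barrier() __asm__ volatile("" ::: "memory")

#if defined(__x86_64__) || defined(__i386__)
/* Execute the rdpmc instruction for the given hardware counter index. */
static inline uint64_t rdpmc_raw(uint32_t counter)
{
    uint32_t lo, hi;
    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(counter));
    return ((uint64_t)hi << 32) | lo;
}
#else
/* Placeholder for non-x86 builds; the reader below would need a different path there. */
static inline uint64_t rdpmc_raw(uint32_t counter) { (void)counter; return 0; }
#endif

/* Sign-extend the raw counter value from `width` bits and add the kernel's offset.
 * Relies on arithmetic right shift of signed values, as GCC and Clang provide. */
static inline int64_t combine_count(int64_t offset, uint64_t raw, unsigned width)
{
    assert(width >= 1 && width <= 64);
    int64_t pmc = (int64_t)(raw << (64 - width)) >> (64 - width);
    return offset + pmc;
}

/* Try to read an event count entirely in user space.
 * `pc` is the first page of the mmap()ed perf_event file descriptor.
 * Returns 0 and stores the count in *out on success, or -1 if the kernel
 * does not allow rdpmc for this event right now (caller should use read()). */
static inline int read_user_count(const volatile struct perf_event_mmap_page *pc,
                                  int64_t *out)
{
    uint32_t seq, index, width;
    int64_t offset;
    uint64_t raw;

    do {
        seq = pc->lock;            /* seqlock: retry if the kernel updates the page */
        barrier();
        index = pc->index;         /* 0 means no hardware counter is assigned */
        if (!pc->cap_user_rdpmc || index == 0)
            return -1;
        offset = pc->offset;
        width = pc->pmc_width;
        raw = rdpmc_raw(index - 1);
        barrier();
    } while (pc->lock != seq);

    *out = combine_count(offset, raw, width);
    return 0;
}
```

In use, `pc` would come from `mmap()`ing the file descriptor returned by `perf_event_open()`, and a caller would fall back to the regular `read()` system call whenever `read_user_count()` returns -1.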