miniAero
MiniAero reaches only 20% of the roofline model prediction
Hi,
I am a student of Computational Engineering at Friedrich-Alexander-University Erlangen-Nuernberg in Germany. This semester I am taking a seminar on benchmarking multicore architectures. My task is to evaluate MiniAero/Kokkos using the roofline model. I ran measurements on a Tesla K40 and on a NUMA machine (OpenMP) with 2x Intel E5-2650v2. On both architectures I barely reach 20% of the roofline when using the input file for the ramp test in the test folder with varying numbers of cells. What could be the reason for such low performance? Or am I using unreasonable problem sizes (e.g. 128x128x32)?
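For context, the roofline bound is usually taken as the minimum of the peak compute rate and the product of memory bandwidth and arithmetic intensity. Below is a minimal sketch of that calculation. The function names and all numbers in it are placeholder assumptions chosen for illustration, not the measured values or hardware specifications from my runs.

```python
def roofline_bound(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Attainable performance in GFLOP/s under the basic roofline model.

    The bound is the lower of the compute ceiling (peak_gflops) and the
    memory ceiling (bandwidth_gbs * intensity_flop_per_byte).
    """
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def fraction_of_roofline(measured_gflops, peak_gflops, bandwidth_gbs,
                         intensity_flop_per_byte):
    """Ratio of measured performance to the roofline bound."""
    bound = roofline_bound(peak_gflops, bandwidth_gbs, intensity_flop_per_byte)
    return measured_gflops / bound


if __name__ == "__main__":
    # Placeholder values (assumptions for illustration only).
    peak = 1000.0      # hypothetical peak compute rate in GFLOP/s
    bandwidth = 200.0  # hypothetical memory bandwidth in GB/s
    intensity = 0.5    # hypothetical arithmetic intensity in FLOP/byte
    measured = 20.0    # hypothetical measured performance in GFLOP/s

    bound = roofline_bound(peak, bandwidth, intensity)
    frac = fraction_of_roofline(measured, peak, bandwidth, intensity)
    print(f"roofline bound: {bound:.1f} GFLOP/s, achieved fraction: {frac:.0%}")
```

With a low arithmetic intensity the memory ceiling sets the bound, so the percentage depends strongly on how the intensity of the code is estimated.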
Best regards, Johannes