egaudry

Results: 19 comments by egaudry

I'd like to second that: an implementation of the GEMMT feature would be great :).

Thanks for the pointer. I still believe this feature could be offered by the BLAS itself (as it is in other BLAS implementations on the market).

Agreed; however, looking at MKL and BLIS, you can see that it is supported there because it enables extra performance on various workloads (the MUMPS direct sparse solver, for instance); it seems like...
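For readers unfamiliar with the routine under discussion: GEMMT, as documented for MKL, computes a general matrix-matrix product but updates only the upper or lower triangle of the result. The following is a minimal, unoptimized sketch of that behaviour in plain Python. The function name `gemmt_reference` and its argument order are hypothetical, transposition options are left out, and it is not the API of any BLAS library.

```python
def gemmt_reference(uplo, alpha, a, b, beta, c):
    """Reference GEMMT sketch: C := alpha*A*B + beta*C on one triangle of C.

    uplo is "U" (upper) or "L" (lower); a is n x k, b is k x n and
    c is n x n, all given as lists of lists. c is updated in place.
    """
    if uplo not in ("U", "L"):
        raise ValueError("uplo must be 'U' or 'L'")
    n = len(c)
    k = len(b)
    for i in range(n):
        # Columns of row i that lie in the requested triangle (diagonal included).
        cols = range(i, n) if uplo == "U" else range(0, i + 1)
        for j in cols:
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            c[i][j] = alpha * acc + beta * c[i][j]
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[100, 100], [100, 100]]
# Only the lower triangle is overwritten; the entry above the diagonal keeps its old value.
print(gemmt_reference("L", 1, a, b, 0, c))
```

Entries outside the chosen triangle are never read or written, so roughly half of the multiply-add work of a full GEMM is skipped when the caller only needs one triangle.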

fair point @martin-frbg; I'll stop arguing as I cannot devote time to help :)

Thanks for your feedback Devin. I understand the decision made regarding benchmark contents. I believe it would be interesting to bench again (or report what was not already published) recent...

@fgvanzee fully understood, keep up the good work :).

I'm just adding a comment regarding the latest release notes for MKL (oneAPI 2022.1): https://www.intel.com/content/www/us/en/developer/articles/system-requirements/oneapi-math-kernel-library-system-requirements.html. They state that they support Intel chips only, with no mention (at all) of others. I...

Thanks Dave. The obsession might be linked to the fact that it has been a go-to solution (for different reasons) for years, which in turn means that one tends to...

For the sake of gathering information, here are some excerpts from the ARM technical reference paper referenced above:
```
A6.1 About the L1 memory system
The Neoverse V1 L1 memory...
```

Retrieved from a running system:
```
/sys/devices/system/cpu/cpu0/cache/index0/allocation_policy:ReadWriteAllocate
/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size:64
/sys/devices/system/cpu/cpu0/cache/index0/level:1
/sys/devices/system/cpu/cpu0/cache/index0/number_of_sets:1
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_list:0
/sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map:00000001
/sys/devices/system/cpu/cpu0/cache/index0/type:Data
/sys/devices/system/cpu/cpu0/cache/index0/write_policy:WriteBack
/sys/devices/system/cpu/cpu0/cache/index1/allocation_policy:ReadAllocate
/sys/devices/system/cpu/cpu0/cache/index1/coherency_line_size:64
/sys/devices/system/cpu/cpu0/cache/index1/level:1
/sys/devices/system/cpu/cpu0/cache/index1/number_of_sets:1
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_list:0
/sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map:00000001
/sys/devices/system/cpu/cpu0/cache/index1/type:Instruction
/sys/devices/system/cpu/cpu0/cache/index1/write_policy:WriteBack
/sys/devices/system/cpu/cpu0/cache/index2/allocation_policy:ReadWriteAllocate
/sys/devices/system/cpu/cpu0/cache/index2/coherency_line_size:64
/sys/devices/system/cpu/cpu0/cache/index2/level:2
/sys/devices/system/cpu/cpu0/cache/index2/number_of_sets:1
/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list:0
/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_map:00000001
/sys/devices/system/cpu/cpu0/cache/index2/type:Unified
/sys/devices/system/cpu/cpu0/cache/index2/write_policy:WriteBack...
```
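A dump in this `path:value` form is easier to compare across machines once it is grouped per cache index. The sketch below is one possible way to do that in Python; the helper name `parse_cache_dump` is hypothetical, and the sample input reuses a few lines from the listing above rather than reading a live system.

```python
from collections import defaultdict


def parse_cache_dump(text):
    """Group 'path:value' lines by their cache index directory (e.g. 'index0')."""
    caches = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        path, value = line.split(":", 1)
        parts = path.split("/")
        if len(parts) < 2:
            continue
        # The last two path components are the index directory and the attribute file.
        index, attribute = parts[-2], parts[-1]
        caches[index][attribute] = value
    return dict(caches)


sample = """\
/sys/devices/system/cpu/cpu0/cache/index0/level:1
/sys/devices/system/cpu/cpu0/cache/index0/type:Data
/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size:64
/sys/devices/system/cpu/cpu0/cache/index2/level:2
/sys/devices/system/cpu/cpu0/cache/index2/type:Unified
"""

for index, attrs in sorted(parse_cache_dump(sample).items()):
    print(index, "level", attrs["level"], attrs["type"])
```

All values are kept as strings, since the attribute files mix numbers, masks and names.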