ptx topic
how-to-optimize-gemm
row-major matmul optimization
ILGPU
ILGPU JIT Compiler for high-performance .NET GPU programs
ptformat
Free software file format parser for Avid Pro Tools sessions
CudaPAD
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of the assembly.
less_slow.cpp
Playing around with "Less Slow" coding practices in C++20, C, CUDA, PTX, and Assembly, from numerics and SIMD to coroutines, ranges, exception handling, networking, and user-space IO
PTXprofiler
A simple profiler that counts NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
MTB
Energinet's Model Testbench. Automates grid compliance studies in PSCAD and PowerFactory.
tornadovm-examples
A set of examples written for hardware acceleration via TornadoVM