nvbench
CUDA Kernel Benchmarking Library
Right now it seems the intended method for benchmarking is simply encapsulating the kernel like so:

```cpp
void my_benchmark(nvbench::state& state) {
  state.exec([](nvbench::launch& launch) {
    my_kernel();
  });
}
NVBENCH_BENCH(my_benchmark);
```

What...
I have some slides for this; it's just a matter of embedding them into the docs and doing a writeup.
This PR adds visualization to the comparison script. Here's an example for:

```cpp
// ...
  .add_int64_power_of_two_axis("elements", nvbench::range(12, 28, 4))
  .add_int64_axis("ratio", {0, 10});
```

When the script is executed, one can specify...
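To make the size of that example concrete, here is a minimal standalone sketch (it does not use NVBench) of which configurations the two axes would produce. It assumes `nvbench::range(12, 28, 4)` is inclusive of both endpoints and that the power-of-two axis interprets each entry as an exponent; the helper names `inclusive_range`, `power_of_two_values`, and `cartesian` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Inclusive arithmetic range: start, start + stride, ..., up to and including end.
// Assumption: this mirrors how nvbench::range(12, 28, 4) is read here.
std::vector<std::int64_t> inclusive_range(std::int64_t start,
                                          std::int64_t end,
                                          std::int64_t stride)
{
  assert(stride > 0);
  std::vector<std::int64_t> out;
  for (std::int64_t v = start; v <= end; v += stride) { out.push_back(v); }
  return out;
}

// Treat every entry as an exponent and return 2^exponent.
std::vector<std::int64_t> power_of_two_values(const std::vector<std::int64_t>& exponents)
{
  std::vector<std::int64_t> out;
  for (std::int64_t e : exponents)
  {
    assert(e >= 0 && e < 63); // keep the shift within int64 range
    out.push_back(std::int64_t{1} << e);
  }
  return out;
}

// Cartesian product of two int64 axes; the second axis varies fastest.
std::vector<std::pair<std::int64_t, std::int64_t>>
cartesian(const std::vector<std::int64_t>& a, const std::vector<std::int64_t>& b)
{
  std::vector<std::pair<std::int64_t, std::int64_t>> out;
  for (std::int64_t x : a)
  {
    for (std::int64_t y : b) { out.emplace_back(x, y); }
  }
  return out;
}
```

Under those assumptions, `cartesian(power_of_two_values(inclusive_range(12, 28, 4)), {0, 10})` yields ten (elements, ratio) pairs, with element counts from 2^12 to 2^28.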
Several users have asked about how to add custom arguments to their benchmarks (e.g. #86). This is done by implementing an application specific `main(argc, argv)` function and linking to the...
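As a rough illustration of the first half of that idea, the sketch below separates application-specific arguments from the ones a benchmark library's parser should see. It is standalone C++ and does not call NVBench; the `--app-` prefix, the `split_args` struct, and both function names are hypothetical, and how the forwarded arguments are handed to NVBench is left out because the excerpt above is truncated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct split_args
{
  std::vector<std::string> app_args;       // consumed by the application itself
  std::vector<std::string> forwarded_args; // left for the benchmark library's parser
};

// Copy argv into a vector of strings for easier handling.
std::vector<std::string> to_vector(int argc, const char* const* argv)
{
  assert(argc >= 0);
  return std::vector<std::string>(argv, argv + argc);
}

// Pull out "--app-<name> <value>" pairs (a hypothetical naming convention)
// and keep every other argument, in order, for forwarding.
split_args split_custom_args(const std::vector<std::string>& args)
{
  const std::string prefix = "--app-";
  split_args result;
  for (std::size_t i = 0; i < args.size(); ++i)
  {
    if (args[i].compare(0, prefix.size(), prefix) == 0)
    {
      result.app_args.push_back(args[i]);
      // Take the following token as the flag's value, if there is one.
      if (i + 1 < args.size()) { result.app_args.push_back(args[++i]); }
    }
    else
    {
      result.forwarded_args.push_back(args[i]);
    }
  }
  return result;
}
```

With an invocation such as `bench --app-input data.bin --benchmark my_benchmark`, the `--app-input data.bin` pair ends up in `app_args` and the remaining three tokens stay in `forwarded_args`.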
We recently had an issue where a benchmark kernel caused an illegal memory access, and the error was asynchronously reported in an unrelated NVBench CUDA API call. Any errors emitted...
Changes in the work include:

- [x] Internally use linear_space for iterating
- [x] Simplify type and value iteration in `state_iterator::build_axis_configs`
- [x] Store the iteration space in `axes_metadata`
-...
Not specifying the **optional** `NVBENCH_DECLARE_ENUM_TYPE_STRINGS` causes compiler errors:

```
error: no instance of overloaded function "std::to_string" matches the argument list
```

This should be easy to reproduce by commenting out...
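The error is consistent with `std::to_string` having overloads only for arithmetic types, so an enum class value cannot be passed to it directly. The sketch below shows one way a generic fallback could coexist with an optional, user-provided string mapping. It is not NVBench's implementation; the enums `scheduling` and `precision` and the function `enum_to_string` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical enums used only for illustration.
enum class scheduling { eager, lazy };
enum class precision { low, high };

// Generic fallback: convert to the underlying integer first, because
// std::to_string accepts arithmetic types but not enum class values.
template <typename Enum>
std::string enum_to_string(Enum value)
{
  static_assert(std::is_enum<Enum>::value, "enum_to_string expects an enum type");
  using underlying = typename std::underlying_type<Enum>::type;
  return std::to_string(static_cast<underlying>(value));
}

// Optional overload with readable names. As a non-template exact match,
// it is preferred over the generic fallback for this enum.
std::string enum_to_string(scheduling value)
{
  switch (value)
  {
    case scheduling::eager: return "eager";
    case scheduling::lazy: return "lazy";
  }
  return "unknown";
}
```

Here `enum_to_string(scheduling::lazy)` uses the readable overload, while `enum_to_string(precision::high)` falls back to the underlying integer value instead of failing to compile.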
It would be nice to be able to specify

```c++
NVBENCH_BENCH_TYPES(my_benchmark, NVBENCH_TYPE_AXES_OPT({my_types, my_optional_types}))
  .set_type_axis_names({"ValueType"});
```

such that `my_bench_executable --benchmark my_benchmark` executes only (the Cartesian product of) the non-optional `my_types`,...
It would enable a nicer (shorter) way of specifying benchmarks in the CLI if one could define groups/sets of benchmarks in the source code. One way of doing this...
As @jrhemstad mentioned [here](https://github.com/NVIDIA/nvbench/issues/82#issuecomment-1091761596), one could add hidden benchmarks by marking them with a "hidden" tag, using the tag feature proposed in #81, to be able to define benchmarks that will not...