
Results 11 Velocity-Bench issues

- Updated all three versions to run the code in main for 50 iterations. - Output average time for the 50 iterations. - This will eliminate the impact of different...
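A minimal sketch of the averaging approach described above, assuming the benchmark body can be wrapped in a callable. The helper name `average_ms` and its signature are hypothetical and not taken from the Velocity-Bench sources. For GPU code, the callable would need to wait for device work to finish before returning, otherwise only the submission time is measured.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of Velocity-Bench): runs `work` a fixed
// number of times and returns the mean wall-clock time per run in
// milliseconds. The default of 50 mirrors the iteration count named above.
double average_ms(const std::function<void()>& work,
                  std::size_t iterations = 50) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    if (iterations == 0) {
        return 0.0;  // nothing to measure; avoid dividing by zero
    }
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        work();  // assumed to block until the run has fully completed
    }
    const std::chrono::duration<double, std::milli> total =
        clock::now() - start;
    return total.count() / static_cast<double>(iterations);
}
```

Calling `average_ms([&] { run_benchmark(); })`, where `run_benchmark` stands in for the existing code in `main`, would report one averaged figure instead of a single noisy measurement.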

- Updated SYCL and versions to use host/device USM instead of shared to improve performance. - Updated SIMD width in SYCL version from 32 to 16 for better performance.

The cudaSift benchmark currently hardcodes a relative path to its assets. See [here](https://github.com/oneapi-src/Velocity-Bench/blob/main/cudaSift/SYCL/mainSift.cpp#L94). This works fine as long as the benchmark is run from the `repo/cudaSift/SYCL/build/` directory, but breaks down when...
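One possible way to avoid depending on the working directory is to let the caller supply the asset directory and fall back to the relative default only when none is given. The sketch below is an illustration under that assumption, not the fix adopted in the repository; the function name `resolve_asset`, its parameters, and the file names in the comments are all hypothetical.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not part of cudaSift): builds the path to an asset.
// `override_dir` would typically come from a command-line argument or from
// std::getenv; a null or empty value means "use the built-in default".
std::filesystem::path resolve_asset(const char* override_dir,
                                    const std::filesystem::path& default_dir,
                                    const std::string& file_name) {
    const bool has_override =
        override_dir != nullptr && override_dir[0] != '\0';
    // Prefer the caller-supplied directory so the binary can be launched
    // from any working directory.
    const std::filesystem::path base =
        has_override ? std::filesystem::path(override_dir) : default_dir;
    return base / file_name;
}
```

With this shape, `resolve_asset(std::getenv("ASSET_DIR"), "../../data", "input.pgm")` keeps the current behaviour when the (hypothetical) `ASSET_DIR` variable is unset and uses the given directory otherwise.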

Use the wrapper function from `infrastructure/SYCL.h` (introduced in #95) to call either `host_task` or native command submission extensions when available. All changes are exclusively in the code paths taken by...

In `copyParticleVault_d2h()`, the memcpy into a local variable was not followed by a wait. This meant the copy might not have completed by the time it is used as an...

Adding end of program to Easywave.cpp

Hello all, We have been working with the Codeplay engineers (@rafbiels) using the Velocity Benchmarking suite to investigate the overhead of oneAPI. When looking at the performance, we...

Improve SYCL performance on CUDA and HIP backends with the two changes below. There is no functional change for Intel backends. #### 1. Add CMake option to use in-order queue...

Store the input file `a9a` only once in the `data` directory instead of storing three copies. Also store only one version of the reference output `a.m` (even if not used...