GPUExample
Threadgroup size calculation incorrect
Your calculation is confusing, but it appears you are not accounting for integer truncation.
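Since the original calculation isn't quoted here, the following is only a sketch of how integer truncation commonly bites in this kind of code, assuming the problem is a division of a total thread count by a per-group size. The helper names `groupsTruncated` and `groupsRoundedUp` are hypothetical, and the numbers are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: plain integer division.
// The remainder is discarded, so any leftover threads get no group.
constexpr std::uint32_t groupsTruncated(std::uint32_t totalThreads,
                                        std::uint32_t threadsPerGroup) {
    return totalThreads / threadsPerGroup;
}

// Hypothetical helper: ceiling division.
// Adds one extra group when the division leaves a remainder.
// Written with % rather than (a + b - 1) / b to avoid overflow near UINT32_MAX.
// Assumes threadsPerGroup > 0.
constexpr std::uint32_t groupsRoundedUp(std::uint32_t totalThreads,
                                        std::uint32_t threadsPerGroup) {
    return totalThreads / threadsPerGroup +
           (totalThreads % threadsPerGroup != 0 ? 1u : 0u);
}

// Illustrative values only: 1000 work items, 64 threads per group.
static_assert(groupsTruncated(1000, 64) == 15, "15 groups cover only 960 items");
static_assert(groupsRoundedUp(1000, 64) == 16, "16 groups cover all 1000 items");
```

With the rounded-up count, the last group extends past the end of the data (16 * 64 = 1024 here), so the kernel would typically need a bounds check on the thread index before reading or writing.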