Theodoros Theodoridis
CUDA functions can be annotated with launch bounds, that is, the maximum number of threads per block (the minimum number of blocks per multiprocessor can also be specified). This information is used...
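For reference, a minimal sketch of what such an annotation looks like in plain CUDA, using the `__launch_bounds__` qualifier. The kernel name `scale`, its parameters, and the bound values 256 and 2 are hypothetical and chosen only for illustration; they are not taken from the project.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (illustration only): declares that it is never launched
// with more than 256 threads per block, and asks the compiler to aim for at
// least 2 resident blocks per multiprocessor.
__global__ void __launch_bounds__(256, 2) scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main() {
    const int n = 1 << 20;
    float *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
        return 1;
    }
    cudaMemset(d, 0, n * sizeof(float));

    const int threads = 256;  // must not exceed the declared bound
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d, 2.0f, n);

    // Launch-configuration errors surface via cudaGetLastError;
    // execution errors surface when synchronizing.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    printf("%s\n", cudaGetErrorString(err));

    cudaFree(d);
    return err == cudaSuccess ? 0 : 1;
}
```

Launching this kernel with a block size above the declared maximum makes the launch fail, so the bound has to match the largest block size actually used.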
Add CUPTI-based profiling functionality to `CudaRTCFun`. There are several performance metrics (listed [here](http://docs.nvidia.com/cuda/cupti/r_main.html#metrics-reference)). Each metric requires measuring (possibly multiple types of) hardware events. Since not all events can be measured...
I noticed that in some cases the first statement after the variable definitions in the kernel is a `__syncthreads();`, which, if I am not mistaken, makes no sense. For example, in...
Hi, I'm trying to build OpenJPEG but symcc (clang 10.0.1) is crashing. I've built the master branch of symcc and `1f1e9682` of OpenJPEG with: `CC=~/symcc/build/symcc CXX=~/symcc/build/sym++ SYMCC_NO_SYMBOLIC_INPUT=1 SYMCC_LIBCXX_PATH=/usr/include/c++/v1 cmake .....
https://godbolt.org/z/71djY498P Given the following code: ```C void foo(void); static int a, c; static int *b = &a, *d = &c, *g = &a; static char e, h; static short f;...
https://godbolt.org/z/z3d5s1EdY Given the following code: ```C void foo(void); static struct { int b; } f; static int c, d = 3; static int *g = &c; static void h(); static...
```cpp void foo(void); static int e, f, h = 1; static int *g = &f; static unsigned i = 4; static char j, k; static char(a)(char b, int c) {...