Wei Wu
Could you please tell me about your environment? Which CUDA version and how many GPUs are you using?
I reproduced your bug and found that the segfault comes from an int overflow. https://github.com/linnanwang/BLASX/blob/master/blas/blasx_dgemm.c#L64 I already created a PR and will merge it into master...
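To illustrate the kind of overflow I mean, here is a minimal sketch. It is not the BLASX code, and the helper names `product_fits_int` and `element_count` are hypothetical. It assumes the matrix dimensions are passed as non-negative `int` values, and shows how a product such as 50000 * 50000 exceeds `INT_MAX` unless the operands are widened before multiplying:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if m * n fits in a signed int, 0 otherwise.
 * The check divides instead of multiplying, so no overflow is ever performed. */
static int product_fits_int(int m, int n) {
    assert(m >= 0 && n >= 0);
    if (m == 0 || n == 0) return 1;
    return m <= INT_MAX / n;
}

/* Hypothetical helper: element count of an m x n matrix.
 * Both operands are widened to size_t before the multiplication. */
static size_t element_count(int m, int n) {
    assert(m >= 0 && n >= 0);
    return (size_t)m * (size_t)n;
}
```

With a 32-bit `int`, `product_fits_int(50000, 50000)` returns 0, while `element_count(50000, 50000)` gives 2500000000 without overflowing. Any index or byte-size computation built from such products would need the same widening.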
I suspect this is another overflow. I ran a 50K*50K sgemm with 1 GPU and it works fine. I do not have a multi-GPU machine right now. Can you try sgemm...
I have a standalone OpenMP program which runs multi-threaded OpenBLAS dgemm under 4 concurrent pthreads, and it achieved the expected FLOPS and produced correct results. I compiled OpenBLAS with `USE_OPENMP=1, NO_AFFINITY=0,...
I am not sure how to create multiple OpenMP runtime instances. The OpenMP specification does not define this. The goal of my reproducer is to make sure that we are able to...
Is it possible to install hip-runtime-nvidia without installing CUDA and the CUDA driver?
> @eddy16112, the cuda packages are currently a dependency for hip-runtime-nvidia and are required for installation. Most of the nvidia backend of hip are header files, so I do not...