BLASX

A heterogeneous multi-GPU level-3 BLAS library

9 BLASX issues

This might be a naive question... The paper mentions that a GPU task can be bound to a CPU thread...? I am wondering whether any references discuss this in more detail...

I am wondering how the GEMM is implemented. Does CPU RAM store all of matrices A and B? Suppose we have 2 GPUs and we send A(i, k)...
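
For readers with the same question, here is a minimal sketch of how a tiled GEMM can be split into per-device work while the full matrices stay in host memory. It is not BLASX's actual scheduler. The tile size, the device count, the round-robin rule in `tile_owner`, and the function `tiled_gemm` are all hypothetical, and the "devices" are only simulated on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values chosen only for illustration. */
enum { TILE = 2, NUM_DEVICES = 2 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Assumed policy: output tiles are handed out round-robin.
 * A real multi-GPU library may schedule tiles very differently. */
static int tile_owner(size_t ti, size_t tj, size_t tiles_per_row)
{
    return (int)((ti * tiles_per_row + tj) % NUM_DEVICES);
}

/* C += A * B for row-major n x n matrices held in host memory.
 * Each output tile C(ti, tj) is one task; tiles_done[d] counts the
 * tasks assigned to simulated device d. */
static void tiled_gemm(size_t n, const double *A, const double *B,
                       double *C, int *tiles_done)
{
    assert(A && B && C && tiles_done);
    size_t nt = (n + TILE - 1) / TILE; /* tiles per dimension, rounded up */

    for (size_t ti = 0; ti < nt; ti++) {
        for (size_t tj = 0; tj < nt; tj++) {
            int dev = tile_owner(ti, tj, nt);
            size_t i_end = min_sz((ti + 1) * TILE, n);
            size_t j_end = min_sz((tj + 1) * TILE, n);

            /* With real GPUs, tiles A(ti, tk) and B(tk, tj) would be
             * copied to device `dev` here before being multiplied. */
            for (size_t tk = 0; tk < nt; tk++) {
                size_t k_end = min_sz((tk + 1) * TILE, n);
                for (size_t i = ti * TILE; i < i_end; i++)
                    for (size_t j = tj * TILE; j < j_end; j++)
                        for (size_t k = tk * TILE; k < k_end; k++)
                            C[i * n + j] += A[i * n + k] * B[k * n + j];
            }
            tiles_done[dev]++;
        }
    }
}
```

Because each output tile accumulates over the whole `tk` loop on one device, no two devices ever write to the same part of C in this sketch.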

Hello, there are still errors when applying the library to a large-matrix GEMM on multiple GPUs. I need to find another library that can replace cublasXt and execute large-scale...

See the changes in the forked version: https://github.com/pseudotensor/BLASX (commit a9b22933ae395607c000bc7bc616783636464108). Otherwise, none of the inlined functions would be found during linking.

Running the testing/gemm.c with only sgemm (commenting out dgemm code) and larger matrices: int loop = 0; for (loop = 1; loop < 2; loop++) { int M = 10000;...

question

Hello! This library looks great, but I was wondering if it has multi-threaded CPU BLAS capabilities. Reading through the code for some of the *gemm files, it almost appears to...

question

I've modified the gemm-example to use dgemm only with matrices of dimension 30000x30000. Using a server with 4 GTX Titan cards the program produces a segfault. It seems that there...

enhancement

Thanks a lot for providing the code. Unfortunately, I have a problem with the compilation. ; LRU.o:/home/burger/BLASX/blas/LRU.c:4: first defined here /usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: blasx_zgemm.o:/home/burger/BLASX/blas/blasx_zgemm.c:3: multiple definition of `cuda_sta'; blasx_sgemm.o:/home/burger/BLASX/blas/blasx_sgemm.c:3: first defined here