Ajay Panyala
@TejaX-Alaghari Thank you for this PR! I have built and successfully tested it on an application containing SYCL code. Just had to adjust the following line in [FindrocBLAS.cmake](https://github.com/TejaX-Alaghari/oneMKL/blob/rocblas_hip_support/cmake/FindrocBLAS.cmake#L61) to ```IMPORTED_LOCATION...
@TejaX-Alaghari @mmeterel I get a link error in the final step.
```
[100%] Linking CXX shared library ../../../../lib/libonemkl_blas_rocblas.so
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: cannot find -lclang_rt.builtins-x86_64
```
Not sure if something needs to be...
> > @TejaX-Alaghari @mmeterel I get a link error in the final step. > > ``` > > [100%] Linking CXX shared library ../../../../lib/libonemkl_blas_rocblas.so > > /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: cannot find -lclang_rt.builtins-x86_64...
I have the same issue with the GPU runs on [NERSC Perlmutter](https://docs.nersc.gov/systems/perlmutter/architecture/). I am running the cosma matrix-multiply miniapp with m=n=k=25000. It fails with OOM errors even on 100 nodes....
What is the default number of VCIs used by the ch4 config in 4.x? If we set it to 1, can we expect the same behavior we used to...
@wavefunction91 @ryanstocks00 After a couple of minor tweaks to the build to enable successful `hipblas` discovery, I was able to build the code on Frontier. I made sure this PR...
No worries, I just realized that regarding sn-K. The regular XC eval works fine. Can I commit the build system changes to this PR?
> @ajaypanyala happy for you to commit changes to this PR - do you have the required permissions to push to the branch in my repo? @ryanstocks00 I do not...
@wavefunction91 Tested with Ubi/DZ (pbe0) on MI250X.
@wavefunction91 Is this ready to go (modulo the merge conflict)?