Akira Naruse comments

Results: 13 comments of Akira Naruse

T3: NVIDIA Single GPU (cuanns_ivfpq)

Thank you for re-running it. I've checked the config and the results. It looks good to me!

Use cuSPARSE Generic API instead of older one documented to be removed

> @anaruse Have you ever heard any cases like [this](https://github.com/cupy/cupy/pull/7052#issuecomment-1275512137) about the performance of cusparseSpSM? (I'm afraid that I don't immediately have naive C/C++ replicator) It is quite slow.. Please...

Use cuSPARSE Generic API instead of older one documented to be removed

I'm still checking, but I took a profile with profiler.time_range (NVTX) and confirmed that the cusparseSpSM_analysis() time is indeed very long. ![image](https://user-images.githubusercontent.com/20274068/195556326-3f71446c-c0a5-47ec-b5e5-3371a986e65a.png)
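For readers who want to take a similar profile, here is a minimal sketch of wrapping separate phases in named NVTX ranges with `cupyx.profiler.time_range`. The phase names and the placeholder workloads are hypothetical stand-ins, not the code from the PR, and the fallback context manager is only there so the snippet runs on a machine without CuPy.

```python
import contextlib

try:
    # cupyx.profiler.time_range marks a named NVTX range that shows up
    # in tools such as Nsight Systems.
    from cupyx.profiler import time_range
except ImportError:
    # Assumption for illustration: a no-op stand-in so the sketch still
    # runs where CuPy is not installed.
    @contextlib.contextmanager
    def time_range(message, color_id=None):
        yield


def run_phases(phases):
    """Run each (label, callable) pair inside its own named range."""
    results = {}
    for color_id, (label, func) in enumerate(phases):
        with time_range(label, color_id=color_id):
            results[label] = func()
    return results


# Hypothetical placeholders for the analysis and solve steps.
phases = [
    ("analysis", lambda: sum(range(1000))),
    ("solve", lambda: max(range(1000))),
]
results = run_phases(phases)
```

Running the script under a profiler that understands NVTX should then show one range per label, which makes it easier to see which phase dominates.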

CI: CUDA Python test is broken

I built CuPy (v11) with the following combinations and tested `cupy_tests/core_tests/test_userkernel.py` and `cupy_tests/cuda_tests/test_texture.py`. The error occurred only with the combination of CUDA 11.5.0 and CUDA Python 11.7.1. For CUDA 11.5, is...

CI: CUDA Python test is broken

The CUDA Python team's investigation has revealed that this is not directly caused by CUDA Python, but essentially by backward-compatibility issues with the texture-related ABIs in...

[QST] How can we use cagra search with both ranked based optimization and distance-based optimization?

Your question is whether we can choose between rank-based optimization and distance-based optimization when optimizing the initial KNN graph to create a search graph. The answer is...

Support for half-precision complex numbers?

You can get better matrix-multiply performance with Tensor Cores, but fp16 complex numbers are not supported by CUDA libraries except [cublasLtMatmul()](https://docs.nvidia.com/cuda/cublas/index.html#cublasLtMatmul) in cuBLASLt. Even with cublasLtMatmul(), all matrices must...

[FEA] Added functionality to ElementwiseKernel

Q27 intermittent failure in nightly automation

[BUG] Low recall with CAGRA when sparsity or dimensionality is high