Encountered an error when using cuSPARSELt with multithreading: CUSPARSE API failed with internal error (7)
Configurations
- cuSPARSELt version: 0.6.2
- Hardware: A10 with 2 cards
- CUDA version: 12.1
- Driver: 550.90.07
Problem
Our team is integrating cuSPARSELt into a custom inference engine to improve quantization performance. We've successfully run Qwen2-7B (a large model structurally similar to Llama 3) on a single GPU. However, when using multiple GPUs, cusparseLtMatmul fails with an internal error. I've found that the issue occurs only with multithreading; multiprocessing works without problems. The problem can be reproduced with the GitHub example matmul_example.cpp.
Error Log
CUSPARSE API failed at line 287 with error: internal error (7)
How to reproduce
Modify the latest cuSPARSELt/matmul/matmul_example.cpp to add multithreading:
    #include <thread>

    int run(int device_id, cudaStream_t stream) {
        CHECK_CUDA(cudaSetDevice(device_id));
        CHECK_CUDA(cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking));
        // <same as original code>
    }

    int main(void) {
        const int numThreads = 2;
        cudaStream_t streams[numThreads];
        std::thread threads[numThreads];
        // Launch one worker thread per GPU.
        for (int i = 0; i < numThreads; i++) {
            threads[i] = std::thread(run, i, streams[i]);
        }
        for (int i = 0; i < numThreads; i++) {
            threads[i].join();
        }
    }
The complete modified C++ file can be downloaded from the forked repo.
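One detail of the snippet above is worth keeping in mind while debugging: std::thread discards the return value of run, so a failure code from a worker is lost unless it is captured explicitly. Below is a minimal sketch of a thread-per-device launcher that collects each worker's status. The name run_on_all_devices and the worker callable are hypothetical, not part of cuSPARSELt or the sample; in the real reproduction the worker would call cudaSetDevice, create its own stream, and run the body of matmul_example.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not part of cuSPARSELt or the sample): starts one
// thread per device index and returns each worker's exit status.
// In the real reproduction, `worker` would call cudaSetDevice(device_id),
// create its own stream, and run the body of matmul_example.cpp.
std::vector<int> run_on_all_devices(int num_devices,
                                    const std::function<int(int)>& worker) {
    assert(num_devices >= 0);
    const std::size_t n = static_cast<std::size_t>(num_devices);
    std::vector<int> status(n, -1);  // -1 means "worker did not finish"
    std::vector<std::thread> threads;
    threads.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Each thread writes only to its own slot, so no locking is needed.
        threads.emplace_back([&status, &worker, i] {
            status[i] = worker(static_cast<int>(i));
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return status;
}
```

Calling run_on_all_devices(2, run_body), where run_body is whatever per-device function is under test, returns one status per device, which makes it easier to see whether one GPU or both hit the internal error.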
Question
Does cuSPARSELt support multithreading? If it supports multithreading, are we implementing it incorrectly in matmul_example.cpp?
@x574chen cusparselt does support multithreading.
I wasn't able to reproduce the failure using your code. Could you try:
- changing line 260 to cudaStream_t streams[1] = {&stream};
- setting the environment variable CUSPARSELT_LOG_LEVEL=5 and sharing the log?
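If exporting the variable in the shell is inconvenient, the logging request above could also be made from inside the test program. This is only a sketch under assumptions: the helper name enable_cusparselt_logging is hypothetical, the 0 to 5 level range and the CUSPARSELT_LOG_FILE variable reflect my reading of the cuSPARSELt logging documentation, and the library is assumed to read these variables when it initializes, so the call would have to come before cusparseLtInit.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: sets the cuSPARSELt logging variables for the
// current process. Uses POSIX setenv, so this assumes Linux or another
// POSIX system. Returns true if both variables were set.
bool enable_cusparselt_logging(int level, const std::string& log_file) {
    assert(level >= 0 && level <= 5);  // assumed valid range of log levels
    const std::string level_str = std::to_string(level);
    // The third argument (1) overwrites any value already in the environment.
    const bool ok_level =
        setenv("CUSPARSELT_LOG_LEVEL", level_str.c_str(), 1) == 0;
    const bool ok_file =
        setenv("CUSPARSELT_LOG_FILE", log_file.c_str(), 1) == 0;
    return ok_level && ok_file;
}
```

For example, calling enable_cusparselt_logging(5, "cusparse_error.log") at the top of main, before any worker thread starts, should direct the log to a file that can be attached to the issue.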
@j4yan Hi, I updated matmul.cpp and got the same error.
Log: cusparse_error.log
Which Docker image are you using? If it is public, I could try it.
@x574chen I was able to reproduce the error in certain environments. I will need more time to look into it.
Hi @x574chen, it turns out cuSPARSELt doesn't work as expected here. We are working on it and hope to fix the issue in a future release.
The latest version solves the multithreading issue. Closing this issue.