compute-runtime

Runtime hangs on DG2 (and Gen12 iGPU maybe?)

Open tazz4843 opened this issue 1 year ago • 3 comments

I'm seeing random hangs in my app during normal use. They began several months ago, around September 2023. A stack trace is attached at the end. While digging, I found this related comment with the exact same stack trace, although it was reported only on DG2 with an unsupported kernel, whereas I can occasionally reproduce the hang on Gen12 iGPUs and on a much more recent kernel. I'm using whisper.cpp with its OpenCL backend to run arbitrary speech-to-text. If one thread hangs, all other runtime threads hang as well, spinning multiple cores at 100%.

I'm very new to all of this so please let me know if there's any information I can supply :)

Host details:

  • GPU: Arc A770
  • Arch Linux w/ kernel 6.7.3-arch1-1.1
  • intel-compute-runtime-23.48.27912.11-1

backtrace.txt

tazz4843, Feb 09 '24 16:02

Looking at the backtrace:

  • 8 "tokio-runtime-w" threads have yielded their execution in NEO::CommandStreamReceiver::baseWaitFunction()
  • 1 "scripty_stt_ser" thread is the close worker, futex-waiting in NEO::DrmGemCloseWorker::worker()
  • 1 "scripty_stt_ser" thread is Tokio Rust code hanging directly in the futex_wait() syscall

eero-t, Feb 13 '24 10:02

I have the same issue when running OpenVINO Model Server.

geekboood, Feb 21 '24 16:02

Sorry this took me so long to get back to.

Looking at the backtrace:

  • 1 "scripty_stt_ser" thread is Tokio Rust code hanging directly in the futex_wait() syscall

From what I can tell from the code, this runtime worker is waiting for a call into the compute runtime to return, which makes me think the compute runtime is where the hang originates. Disabling the OpenCL runtime and falling back to CPU makes the issue disappear completely, even after weeks of uptime. With the OpenCL integration enabled, the app usually runs for at most a week before it locks up and starts spinning on the CPU.
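For anyone who wants to at least detect this kind of hang instead of waiting on it forever, here is a minimal sketch of guarding a blocking call with a timeout. It is not the whisper.cpp or compute-runtime API: `run_with_timeout`, `Guarded`, and `fake_transcribe` are hypothetical names made up for illustration, and the sleep stands in for a real inference call. It only reports the hang; the stuck thread is left running.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Outcome of a guarded call (hypothetical helper type, not from any library).
#[derive(Debug, PartialEq)]
enum Guarded<T> {
    Finished(T),
    TimedOut,
    WorkerDied,
}

/// Runs `work` on its own OS thread and waits at most `limit` for the result.
/// On timeout the worker is NOT cancelled; it stays detached, so this only
/// detects a hang and does not recover from it.
fn run_with_timeout<T, F>(limit: Duration, work: F) -> Guarded<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails harmlessly if the caller has already given up.
        let _ = tx.send(work());
    });
    match rx.recv_timeout(limit) {
        Ok(value) => Guarded::Finished(value),
        Err(RecvTimeoutError::Timeout) => Guarded::TimedOut,
        // The sender was dropped without sending, e.g. the closure panicked.
        Err(RecvTimeoutError::Disconnected) => Guarded::WorkerDied,
    }
}

/// Hypothetical stand-in for a blocking speech-to-text call.
fn fake_transcribe(delay: Duration) -> String {
    thread::sleep(delay);
    String::from("transcript")
}

fn main() {
    // Completes well within the limit.
    let fast = run_with_timeout(Duration::from_millis(2000), || {
        fake_transcribe(Duration::from_millis(10))
    });
    println!("{:?}", fast);

    // Simulates a call that does not return in time.
    let slow = run_with_timeout(Duration::from_millis(50), || {
        fake_transcribe(Duration::from_millis(1500))
    });
    println!("{:?}", slow);
}
```

In a Tokio app the same idea would probably be expressed with the runtime's own blocking-task and timeout facilities. Either way, a timed-out call still occupies its thread, so this helps with logging and restarting the process rather than fixing the underlying hang.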

tazz4843, Feb 21 '24 16:02