Possible deadlock involving m_patternMutex in AutomationPattern
It seems I found a deadlock in AutomationPattern. Or rather, it found me, because I wasn't looking for it.
The setup was: TripleOscillator set to noise, with volume and cutoff envelopes active, playing in a BB pattern. While it was playing, I was holding the left mouse button on PianoView and wildly moving the cursor around, randomly spamming a lot of notes. After a while everything froze. There was only one empty (unused) automation track in the project.
I attached gdb to the process, but I had never tried to find the cause of a deadlock in code I don't know much about, so it took me a few hours (partly spent re-learning gdb) to even realize which thread was the one truly making everything stuck. Even then, I can't see any way the deadlock could happen. The mutex in question is always locked through QMutexLocker and it is recursive, so the same thread can lock it repeatedly.
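To make the "recursive" point concrete, here is a minimal sketch using `std::recursive_mutex` as a stand-in for the Qt recursive mutex (an assumption made so the example is self-contained; the function names are hypothetical and this is not LMMS code). It shows that nested locking from one thread succeeds, while a second thread still cannot take the lock as long as the first one holds it.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Stand-in for a recursive pattern mutex (std, not Qt; hypothetical name).
std::recursive_mutex patternMutex;

// Lock the mutex twice from the same thread. Both guards succeed because
// the mutex is recursive. Returns the nesting depth that was reached.
int lockTwiceSameThread()
{
    std::lock_guard<std::recursive_mutex> outer(patternMutex);
    std::lock_guard<std::recursive_mutex> inner(patternMutex);
    return 2;
}

// While the calling thread holds the mutex, let a second thread try to
// take it. Returns true if the second thread managed to acquire the lock.
bool otherThreadCanLockWhileHeld()
{
    std::lock_guard<std::recursive_mutex> held(patternMutex);
    bool acquired = false;
    std::thread other([&acquired] {
        if (patternMutex.try_lock()) {
            acquired = true;
            patternMutex.unlock();
        }
    });
    other.join();
    return acquired;
}
```

So recursion only rules out a thread blocking on its own lock. For a thread to hang on such a mutex, some other thread must have held it at that moment, or the mutex state itself must have been corrupted.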
Worse still, so far I haven't been able to reproduce the problem with the same build and the same setup, so at this point I'm willing to accept it may just have been caused by a cosmic ray or something.
If someone has a clue what could be happening and wants to dive into it, be my guest, I give up. Here are backtraces of the more interesting threads and some notes. Otherwise this may be closed just as a note for the future, that this is a thing that could potentially happen and we don't really know why.
thread 1: "lmms" (user interface)
#0 0x00007f9e841a8db5 in futex_wait_cancelable (private=0, expected=0, futex_word=0x55fff5d966c4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1 0x00007f9e841a8db5 in __pthread_cond_wait_common (abstime=0x0, mutex=0x55fff5d96670, cond=0x55fff5d96698) at pthread_cond_wait.c:502
#2 0x00007f9e841a8db5 in __pthread_cond_wait (cond=0x55fff5d96698, mutex=0x55fff5d96670) at pthread_cond_wait.c:655
#3 0x00007f9e81cab21b in QWaitCondition::wait(QMutex*, unsigned long) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#4 0x000055fff389b3bd in Mixer::requestChangeInModel() ()
#5 0x000055fff3a0ee89 in InstrumentTrack::processInEvent(MidiEvent const&, TimePos const&, int) ()
#6 0x000055fff395cd0a in PianoView::mouseMoveEvent(QMouseEvent*) ()
#7 0x00007f9e83ce04d8 in QWidget::event(QEvent*) () at /lib/x86_64-linux-gnu/libQt5Widgets.so.5
...
- Mixer::requestChangeInModel() is waiting in m_changesRequestCondition.wait( &m_waitChangesMutex );
- i.e. waiting for the current buffer to finish processing
- InstrumentTrack::processInEvent() did not lock anything before that
- neither did PianoView::mouseMoveEvent()
- the rest is Qt stuff
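The dependency chain implied by these notes can be sketched as follows. This is a simplified model under stated assumptions, not the actual Mixer code: all names are hypothetical, standard-library primitives replace the Qt ones, and the wait uses a timeout (the real wait shown in the backtrace has none) so that the stalled case returns instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical model of "GUI waits until the audio thread finishes a buffer".
struct ChangeRequest {
    std::mutex waitMutex;
    std::condition_variable bufferDone;
    bool done = false;

    // Audio side: called once the current buffer has been rendered.
    void signalBufferDone()
    {
        {
            std::lock_guard<std::mutex> lock(waitMutex);
            done = true;
        }
        bufferDone.notify_all();
    }

    // GUI side: wait for the buffer, giving up after `timeout`.
    // Returns true if the audio side signalled in time.
    bool waitForBuffer(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(waitMutex);
        return bufferDone.wait_for(lock, timeout, [this] { return done; });
    }
};

// Simulate one GUI request. If audioThreadStuck is true, the audio thread
// blocks on a mutex that is already held and cannot signal in time.
bool guiRequestSucceeds(bool audioThreadStuck)
{
    ChangeRequest request;
    std::mutex patternMutex; // stand-in for the contended pattern mutex
    std::unique_lock<std::mutex> holder(patternMutex, std::defer_lock);
    if (audioThreadStuck) {
        holder.lock(); // someone else is holding the pattern mutex
    }

    std::thread audio([&] {
        // Stand-in for the lock taken while rendering the buffer.
        std::lock_guard<std::mutex> lock(patternMutex);
        request.signalBufferDone();
    });

    const bool ok = request.waitForBuffer(std::chrono::milliseconds(200));
    if (audioThreadStuck) {
        holder.unlock(); // release so the audio thread can finish and be joined
    }
    audio.join();
    return ok;
}
```

In this model `guiRequestSucceeds(false)` returns true and `guiRequestSucceeds(true)` returns false. With an untimed wait, the second case would freeze the caller, which matches thread 1 being a victim of the stall rather than its cause.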
threads 7, 8, 9: mixer workers
#0 0x00007f9e841a8db5 in futex_wait_cancelable (private=0, expected=0, futex_word=0x55fff5dcdc10) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1 0x00007f9e841a8db5 in __pthread_cond_wait_common (abstime=0x0, mutex=0x55fff5dcdbc0, cond=0x55fff5dcdbe8) at pthread_cond_wait.c:502
#2 0x00007f9e841a8db5 in __pthread_cond_wait (cond=0x55fff5dcdbe8, mutex=0x55fff5dcdbc0) at pthread_cond_wait.c:655
#3 0x00007f9e81cab21b in QWaitCondition::wait(QMutex*, unsigned long) () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#4 0x000055fff38a0eb7 in MixerWorkerThread::run() ()
...
- waiting for something to do at queueReadyWaitCond->wait( &m );
thread 11: mixer output
#0 0x00007f9e819350b9 in syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1 0x00007f9e81c9d9d5 in QBasicMutex::lockInternal() () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#2 0x00007f9e81c9dcd3 in () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#3 0x000055fff382d536 in AutomationPattern::valueAt(TimePos const&) const ()
#4 0x000055fff38a525e in NotePlayHandle::processTimePos(TimePos const&) ()
#5 0x000055fff3a0d295 in InstrumentTrack::play(TimePos const&, short, int, int) ()
#6 0x000055fff383babd in BBTrackContainer::play(TimePos, short, int, int) ()
#7 0x000055fff3a08ec7 in BBTrack::play(TimePos const&, short, int, int) ()
#8 0x000055fff38d819f in Song::processNextBuffer() ()
#9 0x000055fff389df98 in Mixer::renderNextBuffer() ()
#10 0x000055fff389e494 in Mixer::fifoWriter::run() ()
...
- AutomationPattern::valueAt() is stuck at QMutexLocker m(&m_patternMutex);
- and there is no apparent reason why it shouldn't get the lock..
- so someone else must have it.. but the lock is private..
- and AutomationPattern isn't mentioned in any other thread backtrace..
- and the mutex is QMutex::Recursive, so thread 11 can lock it multiple times..
:man_shrugging:
Here is a core dump if anyone feels especially adventurous. core.7936.gz (~1.3 GB when extracted) EDIT: Ah, did not realize the debug symbols are in the binary: lmms.gz (Still, not sure if it can be easily loaded on another machine?)
I can't get any useful information from your core dump. Do you have the full backtrace with all threads?
I will try to export the backtrace once I can make some space to extract the core dump again.. :smile: (Yes, I'm hopeless, I'm almost always on full RAM and full disk, only out-of-space errors can usually force me to sacrifice some files. I should probably get a new drive...)
Here are the backtraces. If I can pull from it anything else that could help, let me know. (The most interesting information would be probably the owner(s) and other internal variables of the problematic mutex, but I wasn't able to get to it.) backtrace.txt
@sakertooth found something similar on Discord while looking at the VST automation issue.