NETProvider

Fixed potential deadlock in FbConnectionPoolManager

Open Howaner opened this issue 3 years ago • 6 comments

Our ASP.NET Core application has been crashing about once a week lately with a lot of open threads and a deadlock when using the FbConnection.OpenAsync() function.

I now found the following deadlock:

  1. In Pool.PrunePool(), the call Parallel.ForEach(release, x => x.Release()); is the main cause of the deadlock.
  2. Release() can call the Pool.ReleaseConnection() method, and this method takes the same _syncRoot lock.
  3. So PrunePool() already holds the _syncRoot lock while ReleaseConnection(), running on a Parallel.ForEach worker thread, waits to acquire it.
  4. But PrunePool() is waiting for Release() to complete => Deadlock

The FbConnectionInternal.Disconnect() method is already called in different places without locking the _syncRoot lock, so I assume it can be called outside the lock.
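
To illustrate the idea, here is a minimal sketch of the "decide under the lock, release outside the lock" shape. It is not the actual patch and not the real FbConnectionPoolManager code: SketchPool, PooledItem, NoteReleased() and the counters are hypothetical names made up for this example. NoteReleased() stands in for ReleaseConnection() in that it needs the same _syncRoot lock as PrunePool().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical, simplified pool. Only the locking shape mirrors the report.
sealed class SketchPool
{
    private readonly object _syncRoot = new object();
    private readonly List<PooledItem> _available = new List<PooledItem>();
    private int _released;

    public void Add(PooledItem item)
    {
        lock (_syncRoot) { _available.Add(item); }
    }

    public int AvailableCount { get { lock (_syncRoot) { return _available.Count; } } }
    public int ReleasedCount { get { lock (_syncRoot) { return _released; } } }

    // Stand-in for ReleaseConnection(): it needs the same lock as PrunePool().
    public void NoteReleased()
    {
        lock (_syncRoot) { _released++; }
    }

    public void PrunePool()
    {
        List<PooledItem> release;
        lock (_syncRoot)
        {
            // Only pick the items to drop while holding the lock.
            release = _available.ToList();
            _available.Clear();
        }

        // The parallel loop runs after the lock has been left, so a worker
        // thread that needs _syncRoot is never blocked by the thread that
        // is waiting for the loop to finish.
        Parallel.ForEach(release, x => x.Release());
    }
}

// Hypothetical stand-in for a pooled connection.
sealed class PooledItem
{
    private readonly SketchPool _owner;
    public PooledItem(SketchPool owner) { _owner = owner; }

    // Calls back into the pool, which takes _syncRoot.
    public void Release() => _owner.NoteReleased();
}

static class Program
{
    static void Main()
    {
        var pool = new SketchPool();
        for (var i = 0; i < 16; i++) pool.Add(new PooledItem(pool));
        pool.PrunePool();
        Console.WriteLine($"available={pool.AvailableCount}, released={pool.ReleasedCount}");
    }
}
```

If the Parallel.ForEach call in this sketch were moved inside the lock block, any iteration scheduled on a worker thread would block in NoteReleased() while the calling thread waits for the loop to finish, which is the situation described in steps 1 to 4.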

Howaner avatar Oct 14 '22 12:10 Howaner

@cincuranet Is there a chance to integrate this into upstream? Is there anything else I need to do?

Howaner avatar Nov 10 '22 13:11 Howaner

I need to review the PR first. Currently busy.

cincuranet avatar Nov 10 '22 13:11 cincuranet

@Howaner do you have a reproducible test case which shows the problem?

Maybe you could add a test case to this PR?

fdcastel avatar Jul 11 '23 16:07 fdcastel

@fdcastel It is a race condition and therefore quite difficult to trigger. I tried to reproduce it in a test project today but I failed. The race condition occurs when using the Firebird library in an application where many things happen simultaneously. Our application crashed every few weeks due to this bug, and since we changed these few lines, the deadlock has no longer occurred.

I'm really surprised that the bug doesn't occur much more often for other people. Probably only a few people make many simultaneous requests with Firebird.

Howaner avatar Jul 12 '23 16:07 Howaner

I understand. Race conditions surely are hell to reproduce.

Thank you for your feedback. Much appreciated.

fdcastel avatar Jul 12 '23 16:07 fdcastel

I too have started hitting this issue in our test projects after updating to version 9.1.1.0. I'll try and see if I can write a minimal reproducible test case.

willibrandon avatar Jul 27 '23 17:07 willibrandon