deadlock detected

Open jwoertink opened this issue 5 years ago • 6 comments

I went to run specs on my app locally, and I got a ton of deadlock errors:

Unhandled exception in spawn: deadlock detected (PQ::PQError)
  from lib/pg/src/pq/connection.cr:203:7 in 'handle_error'
  from lib/pg/src/pq/connection.cr:186:7 in 'handle_async_frames'
  from lib/pg/src/pq/connection.cr:162:7 in 'read'
  from lib/pg/src/pq/connection.cr:414:18 in 'expect_frame'
  from lib/pg/src/pq/connection.cr:398:9 in 'read_next_row_start'
  from lib/pg/src/pg/result_set.cr:39:8 in 'move_next'
  from lib/db/src/db/result_set.cr:39:13 in 'from_rs'
  from lib/avram/src/avram/save_operation.cr:367:17 in 'insert'
  from lib/avram/src/avram/save_operation.cr:349:7 in 'insert_or_update'
  from lib/avram/src/avram/save_operation.cr:297:9 in 'save'
  from lib/avram/src/avram/save_operation.cr:321:8 in 'save!'
  from lib/breeze/src/breeze/operations/save_breeze_sql_statement.cr:1:1 in 'create!:breeze_request_id:statement:args:model:elapsed_text'
  from lib/breeze/src/breeze.cr:14:3 in '->'
  from /usr/local/Cellar/crystal/0.36.1_2/src/primitives.cr:255:3 in 'run'
  from /usr/local/Cellar/crystal/0.36.1_2/src/fiber.cr:92:34 in '->'

I have it set to only enable while in development, so in theory breeze should be skipped during tests... I'll try to dig in more to see what specifically is causing this.

jwoertink • Mar 24 '21 18:03

Ok, my issue may not actually be directly related to breeze... However, it does worry me that it was so easy to hit this error. I'll leave it open for now so we can track it. Maybe someone can think of a fix.

jwoertink • Mar 24 '21 18:03

Ok, I just ran into something similar, and this time I was doing something completely different:

Unhandled exception in spawn:  (DB::ConnectionRefused)
  from Exception::CallStack::unwind:Array(Pointer(Void))
  from Exception::CallStack#initialize:Array(Pointer(Void))
  from Exception::CallStack::new:Exception::CallStack
  from raise<DB::ConnectionRefused>:NoReturn
  from PG::Connection#initialize<DB::Database>:Bool
  from PG::Connection::new<DB::Database>:PG::Connection
  from PG::Driver#build_connection<DB::Database>:PG::Connection
  from ~procProc(DB::Connection)@lib/db/src/db/database.cr:56
  from DB::Pool(DB::Connection+)@DB::Pool(T)#build_resource:DB::Connection+
  from DB::Pool(DB::Connection+)@DB::Pool(T)#checkout:DB::Connection+
  from DB::Database#checkout:DB::Connection+
  from Breeze::SaveBreezeSqlStatement@Avram::SaveOperation(T)#save:Bool
  from Breeze::SaveBreezeSqlStatement@Avram::SaveOperation(T)#save!:Breeze::BreezeSqlStatement
  from Breeze::SaveBreezeSqlStatement::create!:breeze_request_id:statement:args:model:elapsed_text<(Int64 | Nil), String, (String | Nil), (String | Nil), String>:Breeze::BreezeSqlStatement
  from ~procProc(Nil)@lib/breeze/src/breeze.cr:14
  from Fiber#run:(IO::FileDescriptor | Nil)
  from ~proc2Proc(Fiber, (IO::FileDescriptor | Nil))@/usr/local/Cellar/crystal/0.36.1_2/src/fiber.cr:92
Caused by: no PostgreSQL user name specified in startup packet (PQ::PQError)
  from Exception::CallStack::unwind:Array(Pointer(Void))
  from Exception::CallStack#initialize:Array(Pointer(Void))
  from Exception::CallStack::new:Exception::CallStack
  from raise<PQ::PQError>:NoReturn
  from PQ::Connection#handle_error<PQ::Frame::ErrorResponse>:NoReturn
  from PQ::Connection#handle_async_frames<(PQ::Frame+ | PQ::Frame::Unknown)>:Bool
  from PQ::Connection#read<(Char | Nil)>:(PQ::Frame+ | PQ::Frame::Unknown)
  from PQ::Connection#read:(PQ::Frame+ | PQ::Frame::Unknown)
  from PQ::Connection#expect_frame<PQ::Frame::Authentication.class, Nil>:PQ::Frame::Authentication
  from PQ::Connection#expect_frame<PQ::Frame::Authentication.class>:PQ::Frame::Authentication
  from PQ::Connection#connect:Bool
  from PG::Connection#initialize<DB::Database>:Bool
  from PG::Connection::new<DB::Database>:PG::Connection
  from PG::Driver#build_connection<DB::Database>:PG::Connection
  from ~procProc(DB::Connection)@lib/db/src/db/database.cr:56
  from DB::Pool(DB::Connection+)@DB::Pool(T)#build_reUnhandled exception in spawn:  (DB::ConnectionRefused)
  from Exception::CallStack::unwind:Array(Pointer(Void))
  from Exception::CallStack#initialize:Array(Pointer(Void))
  from Exception::CallStack::new:Exception::CallStack
  from raise<DB::ConnectionRefused>:NoReturn
  from PG::Connection#initialize<DB::Database>:Bool
  from PG::Connection::new<DB::Database>:PG::Connection
  from PG::Driver#build_connection<DB::Database>:PG::Connection
  from ~procProc(DB::Connection)@lib/db/src/db/database.cr:56
  from DB::Pool(DB::Connection+)@DB::Pool(T)#build_resource:DB::Connection+
  from DB::Pool(DB::Connection+)@DB::Pool(T)#checkout:DB::Connection+
  from DB::Database#checkout:DB::Connection+
  from Breeze::SaveBreezeSqlStatement@Avram::SaveOperation(T)#save:Bool
  from Breeze::SaveBreezeSqlStatement@Avram::SaveOperation(T)#save!:Breeze::BreezeSqlStatement
  from Breeze::SaveBreezeSqlStatement::create!:breeze_request_id:statement:args:model:elapsed_text<(Int64 | Nil), String, (String | Nil), (String | Nil), String>:Breeze::BreezeSqlStatement
  from ~procProc(Nil)@lib/breeze/src/breeze.cr:14
  from Fiber#run:(IO::FileDescriptor | Nil)
  from ~proc2Proc(Fiber, (IO::FileDescriptor | Nil))@/usr/local/Cellar/crystal/0.36.1_2/src/fiber.cr:92

It seems it's pretty easy to hit DB errors with this. In this case I was calling this code:

File.read_lines(filename).each do |domain|
  SaveRestrictedDomain.create(text: domain.strip) do |_o, _d|
    # ignore if it fails
  end
end

This code was in a task that I was running locally, and filename is a file with about 120,000 lines in it. Looks like it's coming from this file. It doesn't really matter where the subscribe is; it's basically global and will run from anywhere once it's been defined. If I'm blasting my database, and it has to run this block on every query, I'm assuming that the spawned fibers are just backing up and dogpiling because I'm pushing queries faster than this block can run. Too many spawn calls...
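To illustrate the pile-up described above, here is a minimal sketch of one way to bound it: a fixed number of worker fibers fed through a small buffered channel, so the producer blocks when the workers fall behind instead of spawning a new fiber per query. This is not how Breeze is implemented. The method name log_with_bounded_workers and the persist block are hypothetical stand-ins for the real database write.

```crystal
# Hypothetical sketch: cap how many statement-logging fibers run at once.
# `persist` stands in for a database write such as the Breeze save;
# it is an assumption for illustration, not part of Breeze or Avram.
def log_with_bounded_workers(statements : Array(String), workers : Int32, &persist : String ->) : Int32
  queue = Channel(String).new(workers) # small buffer; send blocks when full
  done = Channel(Int32).new

  workers.times do
    spawn do
      handled = 0
      # receive? returns nil once the channel is closed and drained
      while item = queue.receive?
        persist.call(item)
        handled += 1
      end
      done.send(handled)
    end
  end

  # The producer blocks here whenever the workers fall behind,
  # so pending work never exceeds the buffer size.
  statements.each { |statement| queue.send(statement) }
  queue.close

  # Wait for every worker to finish and add up what they handled.
  total = 0
  workers.times { total += done.receive }
  total
end

statements = Array.new(1_000) { |i| "statement #{i}" }
saved = log_with_bounded_workers(statements, workers: 4) do |_statement|
  Fiber.yield # pretend this is a slow database round trip
end
puts saved # prints 1000
```

With this shape, at most `workers` writes are in flight at any time, so the number of connections checked out of the pool stays bounded no matter how fast queries arrive.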

The error in the original post came from me running specs where I was essentially doing the same thing: pushing more queries than could be handled. These are probably edge cases, but the fact that they are preventing me from doing what I need to do is an issue.

For now, I was able to get around the first part because when I ran specs earlier, it thought I was in development. For this case I am in development, but we have a Lucky::Env.task? method that I can use to disable breeze in tasks.

jwoertink • Mar 24 '21 23:03

So are you saying that you think it's because we are saving things to the database using spawn?

https://github.com/luckyframework/breeze/blob/5a98f72cd0d5282b6aeb71d1e02fe70ae7c06456/src/breeze/actions/mixins/action_helpers.cr#L16-L25

matthewmcgarvey • Mar 24 '21 23:03

That's my assumption, yeah. It seems like it's kicking off spawns faster than it can finish the saves, and they are piling up. I guess since it's easy to reproduce, I can take them out of the spawn to see if it still does it. I'll give that a shot tomorrow and see if that makes a difference.

jwoertink • Mar 25 '21 00:03

Here's at least one thing we tie to the current fiber: https://github.com/luckyframework/avram/blob/5e29f75371dca5a2bc16858163dc3790cabfbfcb/src/avram/database.cr#L6

matthewmcgarvey • Mar 25 '21 01:03

I'm not too familiar with doing multi-threading stuff. I wonder if that holds us back in some way by restricting to a single thread 🤔

jwoertink • Mar 25 '21 16:03