
Fix server-side deadlock in filesystem activity handling

Open ds84182 opened this issue 7 years ago • 4 comments

We've managed to hit a reproducible deadlock concerning filesystems. The exact setup is unknown, but using finer lock scopes should fix the issue.

I'm not able to test this change at the moment (I don't have MCP set up), and I can't verify that the change is valid Scala (but it seems like it should be fine).

ds84182 avatar Dec 28 '18 17:12 ds84182

Where/how exactly would the previous setup deadlock? I can't really see how using two locks instead of one would fix the issue. This at least ensures that there is only one event scheduled at a time.

Vexatos avatar Dec 28 '18 18:12 Vexatos

-snip- <same information as ds84182 posted below>

notcake avatar Dec 28 '18 19:12 notcake

@Vexatos Essentially, we have a deadlock between 3 threads: The server thread, and two OC computer threads.

The server thread was in the middle of starting a computer and sending a computer.start network message. This attempts to lock a Filesystem component for network message delivery, but the Filesystem component is currently locked by the first OC thread.

The first OC thread was in the middle of attempting to send a filesystem activity packet, in sendFileSystemActivity. However, the fileSystemAccessTimeouts map is locked by the second OC thread, which was also in the middle of sendFileSystemActivity, for a different Filesystem.

The second OC thread, which was also in the middle of attempting to send a filesystem activity packet, entered FileSystemAccessHandler via the Forge eventbus. This caused a chain reaction where calling markChanged on the associated server rack caused a cascade that eventually got to Machine.isRunning, which tries to lock the state of the machine being started on the server thread.
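To make the cycle easier to see, here is a minimal sketch that models the three waits described above as a wait-for graph and checks that following the edges leads back to the starting thread. The object name, the thread labels, and the `inCycle` helper are illustrative only and are not part of OpenComputers.

```scala
object WaitForSketch {
  // Wait-for edges taken from the description above: each thread is blocked
  // on a lock currently held by the thread it points to.
  val waitsFor: Map[String, String] = Map(
    "server thread"    -> "first OC thread",  // wants the Filesystem component lock
    "first OC thread"  -> "second OC thread", // wants the fileSystemAccessTimeouts lock
    "second OC thread" -> "server thread"     // wants the machine state lock
  )

  // Follow the edges from `start`; if we arrive back at `start`, the
  // thread is part of a cycle and can never make progress.
  def inCycle(start: String): Boolean = {
    @annotation.tailrec
    def walk(current: String, seen: Set[String]): Boolean =
      waitsFor.get(current) match {
        case Some(next) if next == start => true
        case Some(next) if !seen(next)   => walk(next, seen + next)
        case _                           => false
      }
    walk(start, Set(start))
  }
}
```

Removing any one edge breaks the cycle, which is why shortening the time a single lock is held can be enough to resolve the deadlock.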

Stack trace dumps available here:

Server thread

First OpenComputers thread

Second OpenComputers thread

The main issue comes from the lock on fileSystemAccessTimeouts.
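A rough sketch of what a finer lock scope could look like, not the actual patch: the lock on the map is held only while the timeout is checked and recorded, and the event is dispatched after the lock is released. Everything here other than the names `fileSystemAccessTimeouts` and `sendFileSystemActivity` is an assumption for illustration, including the `String` key, the `timeoutMs` value, and the `dispatch` hook standing in for the Forge event bus.

```scala
import scala.collection.mutable

object ActivitySketch {
  // Simplified stand-in for the real map; keyed by a filesystem address here.
  private val fileSystemAccessTimeouts = mutable.Map.empty[String, Long]

  // Assumed throttle interval, in milliseconds.
  private val timeoutMs = 500L

  // Hypothetical stand-in for posting to the event bus. In the scenario
  // above, this is the call that can re-enter code taking other locks.
  var dispatch: String => Unit = _ => ()

  def sendFileSystemActivity(address: String, now: Long): Boolean = {
    // Hold the map lock only long enough to decide and record the timeout.
    val shouldSend = fileSystemAccessTimeouts.synchronized {
      val due = fileSystemAccessTimeouts.get(address).forall(_ <= now)
      if (due) fileSystemAccessTimeouts(address) = now + timeoutMs
      due
    }
    // Dispatch outside the lock, so a slow or blocked handler cannot stall
    // other threads that only need the map.
    if (shouldSend) dispatch(address)
    shouldSend
  }
}
```

With this shape, a handler that ends up waiting on the machine state lock no longer keeps `fileSystemAccessTimeouts` locked while it waits, so a second thread reporting activity for a different filesystem is not blocked behind it.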

ds84182 avatar Dec 28 '18 19:12 ds84182

Uh?

asiekierka avatar Jun 29 '19 07:06 asiekierka

This has been open for long enough.

asiekierka avatar Jun 03 '23 14:06 asiekierka