ssh2 icon indicating copy to clipboard operation
ssh2 copied to clipboard

Fix thread-safety issues in native bindings replacing `Nan::Persistent` with `v8::Eternal`

Open rmnilin opened this issue 1 year ago • 3 comments

This PR addresses thread-safety issues in the native bindings of the library. It replaces Nan::Persistent with v8::Eternal for constructor storage in cipher and decipher classes.

The previous implementation using Nan::Persistent was not thread-safe, leading to crashes and undefined behavior when these classes were accessed from multiple threads simultaneously. This was particularly problematic when ssh2 was used in multi-threaded environments, such as when imported in worker threads.

Key changes:

  1. Modified the constructor method to return a v8::Eternal instead of a Nan::Persistent.
  2. Updated the init function to use the Set method of the Eternal handle.

These changes ensure thread-safe storage of constructor handles without altering the intended functionality of the modules.

This PR addresses issue #1393 and resolves the reported crashes and segmentation faults related to multi-threaded use of the ssh2 library.

rmnilin avatar Sep 07 '24 13:09 rmnilin

I don't understand how using a v8::Eternal fixes a thread safety issue. AFAIK the only real difference between v8::Eternal and v8::Persistent is the lifetime of the value. v8::Eternal will always last the lifetime of the isolate whereas v8::Persistent will not necessarily.

mscdex avatar Sep 07 '24 14:09 mscdex

Is the underlying reason for the worker-related issues that constructor() returns a static v8::Persistent?

mscdex avatar Sep 07 '24 15:09 mscdex

Thank you for the feedback!

Yes, the underlying issue with worker threads is related to the use of a static Nan::Persistent. This static handle could lead to race conditions and undefined behavior in a multi-threaded environment. By switching to v8::Eternal, we ensure that the constructor handle remains stable and safe across multiple threads, as v8::Eternal ties the handle's lifetime to the isolate, avoiding the problems associated with the static storage duration of Nan::Persistent.

While the primary distinction between Nan::Persistent and v8::Eternal is indeed the lifetime, v8::Eternal provides a more reliable guarantee for thread safety in multi-threaded environments, addressing the described issues.

I've also created rmnilin/ssh2-1393-repro to test both the current implementation and this fix.

rmnilin avatar Sep 07 '24 15:09 rmnilin

Hello, any update on this PR ? We can't restart dynamically a worker because of this bug.

MrMagne avatar Oct 10 '24 11:10 MrMagne

I guess @mscdex just doesn't have enough time. So, a gentle reminder 🙏

rmnilin avatar Oct 11 '24 11:10 rmnilin

Thank you so much for fixing this <3 It crashed our servers

abuaboud avatar Dec 04 '24 21:12 abuaboud

just ran into this issue as well... really need to merge these changes in if ssh2 is to be used in serious production environments

0xAl3xH avatar Feb 01 '25 17:02 0xAl3xH

@rmnilin Can you include a test in this PR? Other than that, I am fine with these changes I guess.

mscdex avatar Feb 01 '25 19:02 mscdex

@mscdex Added the test 🤝

rmnilin avatar Feb 02 '25 02:02 rmnilin

@rmnilin It looks like the new test will need to be skipped for older node versions. I think the best way to do that is to wrap the require('worker_threads') in a try-catch and exit early if ex.code === 'MODULE_NOT_FOUND' otherwise re-throw.

EDIT: There is also a lint error that needs to be fixed

mscdex avatar Feb 02 '25 04:02 mscdex

@mscdex I hadn't considered Node.js versions without the worker threads module. Thanks for pointing that out. I've also fixed the lint errors, and everything should be in order now.

rmnilin avatar Feb 05 '25 17:02 rmnilin

Missed another one lint issue, fixed again...

rmnilin avatar Feb 05 '25 18:02 rmnilin

Thanks, landed in dd5510c0888476e9eec205273732473fcce5adc6.

mscdex avatar Feb 05 '25 19:02 mscdex