Fix thread-safety issues in native bindings replacing `Nan::Persistent` with `v8::Eternal`
This PR addresses thread-safety issues in the native bindings of the library. It replaces Nan::Persistent with v8::Eternal for constructor storage in cipher and decipher classes.
The previous implementation using Nan::Persistent was not thread-safe, leading to crashes and undefined behavior when these classes were accessed from multiple threads simultaneously. This was particularly problematic when ssh2 was used in multi-threaded environments, such as when imported in worker threads.
Key changes:
- Modified the constructor method to return a
v8::Eternalinstead of aNan::Persistent. - Updated the init function to use the
Setmethod of theEternalhandle.
These changes ensure thread-safe storage of constructor handles without altering the intended functionality of the modules.
This PR addresses issue #1393 and resolves the reported crashes and segmentation faults related to multi-threaded use of the ssh2 library.
I don't understand how using a v8::Eternal fixes a thread safety issue. AFAIK the only real difference between v8::Eternal and v8::Persistent is the lifetime of the value. v8::Eternal will always last the lifetime of the isolate whereas v8::Persistent will not necessarily.
Is the underlying reason for the worker-related issues that constructor() returns a static v8::Persistent?
Thank you for the feedback!
Yes, the underlying issue with worker threads is related to the use of a static Nan::Persistent. This static handle could lead to race conditions and undefined behavior in a multi-threaded environment. By switching to v8::Eternal, we ensure that the constructor handle remains stable and safe across multiple threads, as v8::Eternal ties the handle's lifetime to the isolate, avoiding the problems associated with the static storage duration of Nan::Persistent.
While the primary distinction between Nan::Persistent and v8::Eternal is indeed the lifetime, v8::Eternal provides a more reliable guarantee for thread safety in multi-threaded environments, addressing the described issues.
I've also created rmnilin/ssh2-1393-repro to test both the current implementation and this fix.
Hello, any update on this PR ? We can't restart dynamically a worker because of this bug.
I guess @mscdex just doesn't have enough time. So, a gentle reminder 🙏
Thank you so much for fixing this <3 It crashed our servers
just ran into this issue as well... really need to merge these changes in if ssh2 is to be used in serious production environments
@rmnilin Can you include a test in this PR? Other than that, I am fine with these changes I guess.
@mscdex Added the test 🤝
@rmnilin It looks like the new test will need to be skipped for older node versions. I think the best way to do that is to wrap the require('worker_threads') in a try-catch and exit early if ex.code === 'MODULE_NOT_FOUND' otherwise re-throw.
EDIT: There is also a lint error that needs to be fixed
@mscdex I hadn't considered Node.js versions without the worker threads module. Thanks for pointing that out. I've also fixed the lint errors, and everything should be in order now.
Missed another one lint issue, fixed again...
Thanks, landed in dd5510c0888476e9eec205273732473fcce5adc6.