
Edition 2027: Consider replacing `std::sync::Mutex` and other locks with `std::sync::nonpoison` versions

Open tgross35 opened this issue 1 month ago • 23 comments

This serves as a place to discuss changing default lock types to their nonpoison versions, which has come up a number of times before but has not yet (to my knowledge) been discussed in depth.

@davepacheco brings up a good point at https://github.com/rust-lang/rust/issues/134645#issuecomment-3583159102 that having poisoning locks tends to be a safer default, and that the .unwrap() annoyance could be smoothed with panicking methods rather than changing out the type.

I'm not sure whether or not an RFC would be required but considering the impact, it doesn't seem like a bad idea (see also https://github.com/rust-lang/rfcs/pull/3550 for a similar change, though that involved lang).

Related:

  • Tracking issue for adding the types: https://github.com/rust-lang/rust/issues/134645
  • Tracking issue for the modules: https://github.com/rust-lang/rust/issues/134646

tgross35 avatar Nov 26 '25 21:11 tgross35

One adjacent topic that IMO is worth considering is panic=abort. I work on a codebase that's largely compiled with panic=abort so the poisoning API doesn't offer any value to those programs. Given https://smallcultfollowing.com/babysteps/blog/2024/05/02/unwind-considered-harmful/ and other discussions I wonder whether it's worth considering the frequency of poisoned locks even being observable in users' programs in practice.

anp avatar Nov 26 '25 21:11 anp

and that the .unwrap() annoyance could be smoothed with panicking methods rather than changing out the type.

I would be 100% in favor of this! If anything, I'd rather make .lock() panic on poison by default (with a different lock method returning a result that you can handle) than remove poison from the default mutex type. Without poisoning you are way too likely to keep limping around without making any forward progress ever again when an essential background thread panicked.
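
For illustration only, here is a minimal sketch of what a panicking lock method could look like on today's stable Rust. The `LockOrPanic` trait and the `lock_or_panic` name are hypothetical, not a std API; the existing `Result`-returning `lock()` plays the role of the "different lock method".

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical extension trait; `lock_or_panic` is NOT a std API.
trait LockOrPanic<T> {
    fn lock_or_panic(&self) -> MutexGuard<'_, T>;
}

impl<T> LockOrPanic<T> for Mutex<T> {
    fn lock_or_panic(&self) -> MutexGuard<'_, T> {
        // Same behavior as writing `.lock().unwrap()` at every call site.
        self.lock().expect("mutex poisoned by an earlier panic")
    }
}

/// Locks the counter, increments it, and returns the new value.
fn bump(counter: &Mutex<u32>) -> u32 {
    let mut guard = counter.lock_or_panic();
    *guard += 1;
    *guard
}

fn main() {
    let counter = Mutex::new(0);
    bump(&counter);
    println!("counter = {}", bump(&counter));
    // Callers that want to handle poison can still use the `Result` API.
    println!("poisoned? {}", counter.lock().is_err());
}
```

The poisoning state itself is untouched here; only the call-site ergonomics change.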

bjorn3 avatar Nov 26 '25 22:11 bjorn3

As a quick example, here's code which would panic with the current behavior, but would be UB (dereferencing a null pointer) with non-poisoning mutexes:

use std::sync::Mutex;

/// The mutex protects the invariant that the pointer is always valid
struct PointerWrapper(Mutex<*mut u32>);

const SHOULD_PANIC: bool = false;

fn poke_pointer(p: &PointerWrapper) {
    let mut p = p.0.lock().unwrap();
    // Briefly break invariants to manipulate the pointer
    let prev = std::mem::replace(&mut *p, std::ptr::null_mut());
    let next = do_something_that_could_panic(prev);
    *p = next;
}

fn do_something_that_could_panic(ptr: *mut u32) -> *mut u32 {
    if SHOULD_PANIC {
        panic!("oh no");
    }
    ptr
}

fn main() {
    let mut i = 123u32;
    let p = PointerWrapper(Mutex::new(&mut i));
    let _ = std::panic::catch_unwind(|| poke_pointer(&p));
    
    let value = p.0.lock().unwrap();
    println!("{}", unsafe { **value });
}

This is obviously a contrived example – and one could argue that updating to remove the .unwrap() would force people to consider the edition changes – but in a large codebase, the path of least resistance would be for people to remove the unwrap() and go along with their day, unaware that the std::sync::Mutex behavior has changed underneath them.

(FWIW, I also like the idea of making .lock() panic on poisoning, which could happen at an edition boundary!)

mkeeter avatar Nov 26 '25 22:11 mkeeter

It's worth noting that if lock() is changed to always panic on poisoning at an edition boundary, it should probably only do so if !std::thread::panicking(); I've seen plenty of code where PoisonErrors are always unwrapped except in a Drop impl, to avoid recursive panics causing the program to immediately abort.

One downside of making the API unconditionally panic rather than return a Result is how it affects this case, actually. If poisoning is ignored when the thread is already panicking, then the Drop implementation can witness the potentially-violated invariants. I've seen code where Drop implementations that do something not required for memory safety (i.e. updating timing data or counters or similar that are inside of a lock) only do that action when lock() returns Ok, and skip it if the lock has already been poisoned and the Drop impl was called in the midst of unwinding. Changing the lock() API to unconditionally panic instead does make that behavior inexpressible.
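
A rough sketch of the Drop pattern described above, assuming a hypothetical `Timing` guard that bumps a counter behind a lock. The bookkeeping only runs when `lock()` returns `Ok`, so a poisoned lock during unwinding is skipped instead of causing a second panic.

```rust
use std::panic::catch_unwind;
use std::sync::Mutex;

/// Hypothetical guard that records a "finished" count when dropped.
struct Timing<'a> {
    finished: &'a Mutex<u32>,
}

impl Drop for Timing<'_> {
    fn drop(&mut self) {
        // Best-effort bookkeeping: skip it if the lock is poisoned rather
        // than unwrapping and risking a panic while already unwinding.
        if let Ok(mut n) = self.finished.lock() {
            *n += 1;
        }
    }
}

/// One clean run, then one run that panics while holding the lock.
fn finished_count_after_panic() -> u32 {
    let finished = Mutex::new(0);
    {
        let _t = Timing { finished: &finished };
    } // clean drop: the count becomes 1

    let _ = catch_unwind(|| {
        let _t = Timing { finished: &finished };
        let _held = finished.lock().unwrap();
        panic!("poison the lock");
        // Unwinding drops `_held` first (poisoning the mutex), then `_t`,
        // whose Drop sees `Err` and skips the update.
    });

    let n = *finished.lock().unwrap_or_else(|e| e.into_inner());
    n
}

fn main() {
    println!("finished = {}", finished_count_after_panic());
}
```

With a `lock()` that always panics on poison, the `if let Ok(..)` branch has no direct equivalent, which is the expressiveness concern raised here.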

hawkw avatar Nov 26 '25 22:11 hawkw

Changing the lock() API to unconditionally panic instead does make that behavior inexpressible.

We should still keep the non-panicking variant of the lock method around IMO. Just with a different name.

bjorn3 avatar Nov 26 '25 22:11 bjorn3

This will need to be discussed by the team at some point, may as well nominate it now.

@rustbot label +I-libs-api-nominated

tgross35 avatar Nov 27 '25 07:11 tgross35

it should probably only do so if !std::thread::panicking();

The situation is a bit more complicated. See https://github.com/rust-lang/rust/issues/143612

theemathas avatar Nov 27 '25 09:11 theemathas

Without poisoning you are way too likely to keep limping around without making any forward progress ever again when an essential background thread panicked.

I've also heard of (haven't personally seen) situations where poisoning does the opposite, putting the program into a permanent bad state when it would've been fine otherwise. Consider a service that spawns a thread for each connection and never joins them. If one request panics while it happens to hold an important lock, poisoning could cause ~all requests after that to panic. Then the service would be failing all requests until some watchdog happened to notice and reboot it. I think @mitsuhiko had a story about this somewhere.
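
A small sketch of that failure mode, under simplifying assumptions: the "requests" are plain threads, and they are joined here only so the outcome can be counted (the service described above never joins them). The function name is made up for the example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// The first "request" panics while holding the shared lock. Returns how
/// many of the following `requests` fail because the lock is poisoned.
fn failed_after_panic(requests: usize) -> usize {
    let state = Arc::new(Mutex::new(0u32));

    // One handler hits a bug while holding the lock, poisoning it.
    let s = Arc::clone(&state);
    let _ = thread::spawn(move || {
        let _guard = s.lock().unwrap();
        panic!("request handler bug");
    })
    .join();

    // Every later handler that does `.lock().unwrap()` now panics as well.
    (0..requests)
        .map(|_| {
            let s = Arc::clone(&state);
            thread::spawn(move || {
                *s.lock().unwrap() += 1;
            })
            .join()
        })
        .filter(|result| result.is_err())
        .count()
}

fn main() {
    println!("failed requests: {}", failed_after_panic(3));
}
```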

oconnor663 avatar Nov 28 '25 01:11 oconnor663

There are some existing discussions about the future of UnwindSafe which is related to poisoning. IIUC, part of the reason why poisoning exists is to make Mutex: RefUnwindSafe, see docs of UnwindSafe. XRef: https://github.com/rust-lang/libs-team/issues/273 (cc @m-ou-se) and https://github.com/rust-lang/rfcs/pull/2871 (cc @Diggsey)

Also I noticed that the current RefCell impl, which is akin to a single-threaded Mutex[^1], does not have poisoning, and thus cannot "protect" users from forgetting that unwinding exists during a temporary invariant violation. I have not found any users complaining about that, which is definitely fewer than the number of users complaining about the poison-by-default behavior of Mutex.

[^1]: RwLock, to be accurate.
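
To make the RefCell comparison concrete, a minimal sketch: a panic interrupts a two-step update, the borrow is released during unwinding, and the half-updated value can be read afterwards with no poison flag in the way. `AssertUnwindSafe` is needed because RefCell is not `RefUnwindSafe`.

```rust
use std::cell::RefCell;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Returns the value left behind after a panic interrupts an update.
fn observe_after_panic() -> i32 {
    let cell = RefCell::new(0);

    let _ = catch_unwind(AssertUnwindSafe(|| {
        let mut v = cell.borrow_mut();
        *v += 1; // invariant temporarily broken
        panic!("interrupted");
        // A matching `*v -= 1` would restore the invariant, but never runs.
    }));

    // The mutable borrow was released while unwinding; nothing is poisoned.
    let seen = *cell.borrow();
    seen
}

fn main() {
    println!("value after panic: {}", observe_after_panic());
}
```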

@mkeeter

As a quick example, here's code which would panic with the current behavior, but would be UB (dereferencing a null pointer) with non-poisoning mutexes:

It's not "would be" but "is already" UB. It is mentioned in #143471 that the poison behavior is on a best-effort basis and must NOT be relied on for safety purposes. To prove it's unsound, we can practically make it segfault from a totally safe caller, by using a method similar to the one given in that issue: playground.

oxalica avatar Dec 04 '25 01:12 oxalica

I feel like the choice between sync::poison::Mutex and sync::nonpoison::Mutex is ultimately down to development philosophy. If a consumer of the mutex panics, is handling the aftermath the responsibility of the panicking consumer (use non-poisoning mutex) or the other consumers (use poisoning mutex)?

FWIW, I personally prefer the option where it is the responsibility of the panicking consumer. I feel like that leads to better isolation between the consumers and more clearly defined error boundaries.
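
One way to picture "the panicking consumer cleans up after itself" is a rollback guard. This is only a sketch: `Rollback` and `lock_ignoring_poison` are hypothetical helpers, and ignoring the poison flag on std::sync::Mutex stands in for a non-poisoning mutex, since sync::nonpoison is not stable.

```rust
use std::panic::catch_unwind;
use std::sync::{Mutex, MutexGuard, PoisonError};

/// Stand-in for a non-poisoning lock on stable Rust: ignore the poison flag.
fn lock_ignoring_poison<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(PoisonError::into_inner)
}

/// Hypothetical rollback guard: restores `saved` unless the update commits.
struct Rollback<'a> {
    target: &'a mut u32,
    saved: u32,
    committed: bool,
}

impl Drop for Rollback<'_> {
    fn drop(&mut self) {
        if !self.committed {
            *self.target = self.saved;
        }
    }
}

/// Increments the value, passing through a temporarily invalid state.
fn update(m: &Mutex<u32>, fail: bool) {
    let mut guard = lock_ignoring_poison(m);
    let saved = *guard;
    let mut tx = Rollback { target: &mut *guard, saved, committed: false };
    *tx.target = 0; // invariant temporarily broken
    if fail {
        panic!("mid-update failure");
    }
    *tx.target = saved + 1;
    tx.committed = true;
    // `tx` drops before `guard`, so any rollback happens under the lock.
}

/// Returns the value other consumers see after a failed update.
fn value_after_failed_update() -> u32 {
    let m = Mutex::new(7);
    let _ = catch_unwind(|| update(&m, true));
    let v = *lock_ignoring_poison(&m);
    v
}

fn main() {
    println!("value after failed update: {}", value_after_failed_update());
}
```

This only works when the consumer knows how to restore the state, which is exactly the discipline the poisoning side of the thread argues cannot be assumed.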

FeldrinH avatar Dec 04 '25 02:12 FeldrinH

Also I noticed that the current RefCell impl, which is akin to a single-threaded Mutex, does not have poisoning, ... I have not found any user who complains about that,

That doesn't seem too surprising to me? I would expect most RefCell users have single-threaded/non-async programs that just die on panic. All the mentions of carrying on with a non-poisoning Mutex I've seen are in the context of things like web frameworks where you at least have tasks if not multiple threads. This seems more like evidence catch_unwind() is relatively rare outside frameworks?

Your search about mutex poisoning complaints is very confusing to me. The top results are issues around this, followed by an issue about lock ordering, https://github.com/rust-lang/libs-team/issues/273, a RwLock bug, etc. Not complaints about mutex poisoning? But I wouldn't expect to see those issues against rust-lang/rust anyway. @dap linked the 2020 poisoning survey results in the other thread before this issue was created. There, many responses were along the lines of "I use non-std locking primitives for performance" (before Rust used futex), majority "I want the program to terminate". The results read to me like .lock().unwrap() is annoying (yes!), that people would like .lock() to unwrap on poison (me too!), but that the population that wants non-poisoning mutexes is a (large) minority.

From #143471:

Separately, we are also happy to just consider poisoning "best effort". It should therefore not be relied on by unsafe code for soundness. The documentation will have to be updated to reflect that.

Well, there are some four year old functions in repos I no longer have commit bit in that probably need updating. I assume this would apply to a Poison<T> as well. It seems to me by not using resume_unwind and only barely using catch_unwind I've come to expect stronger guarantees about mutexes and invariants than Rust can offer. I'm now not really sure how to correctly, in general, maintain invariants in a Mutex<T> used in a library.


Separately, I ended up doing a bit of a survey of mutex semantics in the face of exceptions/panic/etc in other languages. I'd written this in #134645 to emphasize how nonpoison seems easy to misuse, but I think it was interpreted as an argument against nonpoison-as-std::sync. Maybe it's a bit of both :) But in case it's interesting here:

unlocking on panic/exception is the standard behavior of mutexes in most other languages

Java is the example I cited in comparison to nonpoison. In C#, Python, and Go, these are not the standard library semantics. with lock is substantially different than just "what threading.Lock() returns". Some examples, but they get off the Rust topic so I've details'd:

C# from MSDN:

If a thread terminates while owning a mutex, the mutex is said to be abandoned. The state of the mutex is set to signaled, and the next waiting thread gets ownership. Beginning in version 2.0 of the .NET Framework, an AbandonedMutexException is thrown in the next thread that acquires the abandoned mutex. Before version 2.0 of the .NET Framework, no exception was thrown.

This is much closer to poisoning than nonpoison. Unlike nonpoison, there is a very explicit notification that the mutex was abandoned. The following paragraph gets to the point made by several people:

An abandoned mutex often indicates a serious error in the code. When a thread exits without releasing the mutex, the data structures protected by the mutex might not be in a consistent state. The next thread to request ownership of the mutex can handle this exception and proceed, if the integrity of the data structures can be verified.

Python

import threading

lock = threading.Lock()

try:
    print("Acquired? {}".format(lock.acquire()))
    raise Exception("boom")
except Exception as e:
    print(f"caught exception: {e}")
    print("is the mutex still locked? {}".format(lock.locked()))

print("and after the try block it is locked? {}".format(lock.locked()))
lock.release()

with lock:
    print("with automatically locks and unlocks the lock?")
print(lock.acquire())

when run, prints

Acquired? True
caught exception: boom
is the mutex still locked? True
and after the try block it is locked? True
with automatically locks and unlocks the lock?
True

so if your point is that lock has __enter__ and can be used with with, then yeah I guess? But with lock is a very different construct than lock.acquire()/lock.release(). It's much easier to tell how critically to read the enclosing function.

Go

https://go.dev/play/p/iIhnFGOw2tQ or, for reference,

package main

import (
	"fmt"
	"sync"
)

func main() {
	panicful()
}

func panicful() {
	var l sync.Mutex

	l.Lock()

	defer func() {
		if err := recover(); err != nil {
			var locked = l.TryLock()
			fmt.Printf("Recovered from panic. Could I lock the mutex again? %t", locked)
		}
	}()

	fmt.Println("Locked before panic..")

	panic("bad thing happened")
}

Again, you can defer l.Unlock() but that is at least a bit confusing and in the discussion about that many people explicitly say do not recover from panics, and the corresponding argument in Rust would be do not recover in catch_unwind. That avoids the problem, yes! And it does not relate to the state of the mutex at time of panic, because you're taking down the program.

C++

C++ guidance includes "Never call unknown code while holding a lock" (which is more about protecting from calling code which locks yourself and deadlocks, but still)

More relevant is that C++ has different primitives and tools to handle these circumstances. ERR56-CPP. Guarantee exception safety would suggest that among other things you should try/catch if your critical section may involve exceptions. CON51-CPP says to ensure that locks are released given exceptions, but I assume it is understood that the intersection of ERR56 and CON51 would say "you must not use lock_guard if an exception could leave data in an inconsistent state".

I would argue that keeping the mutex locked on panic is just a worse version of the current poisoning behavior

I agree. I think it is safer than unlocking the mutex with protected data in unknown state. Especially since there is no way to discover the mutex was "abandoned", to borrow the C# language. I don't think either option is good. I just want std::sync::nonpoison::* to be clear to readers what they must consider when using these primitives.

I fully empathize with .lock().unwrap() being cumbersome, ...

I feel like this statement is indicative of some crossed wires in this discussion

No, it is indicative of mixed messaging about the motivation for nonpoison and the desired resting state for std::sync::Mutex. The author of the ACP asserts that the need for poison detection is low, but opens mentioning that the (undesirable) option is to unwrap. I understand your mention of annoyance as meaning unwrap() is annoying, not that you regularly call PoisonError::into_inner as well. If std::sync::Mutex::lock() did not return Result<>, I don't think there would be much interest in std::sync::nonpoison.

allow other consumers to keep working

This is essentially the point of several people here. If your server continues returning 200 OK, but a database connection has a half-completed transaction in-flight and is responding incorrectly, would you consider that working? Is returning incorrect results better than aborting the process and letting whatever monitor restart the server?

iximeow avatar Dec 04 '25 10:12 iximeow

Just want to add one point about the comparison of search numbers for "remove poisoning" vs ".lock().unwrap() is annoying".

As I see it, there are a few groups of people:

  1. don't know or forgot about poisoning
  2. heard about poisoning but don't understand its impact
  3. understand poisoning or got bitten by it
    1. want it removed as it is not worth/possible to handle it
    2. try to handle (gracefully or some other way) all occurrences of poisoning

From my experience I'd say groups 1 and 2 are the majority of developers. The fact that the vast majority of code examples out there just use .lock().unwrap() without any graceful handling whatsoever doesn't help here either. For them unwrap is just an annoyance until they get unexpected poison-related problems. And let's be frank, these problems are not that common.

I'm not advocating for 3.1 or 3.2 here, just want to point out that we can't just make assumptions from the search results alone.

MatrixDev avatar Dec 04 '25 15:12 MatrixDev

If you want to look for degradation of services specifically that involve poisoning you can look for places that combine a) catch_unwind and b) allocate a mutex with .lock().unwrap() on some shared resource such as some server state, config object, connection pool etc.

You might also have some success unearthing some cases by looking for PoisonError specifically in GitHub issues and see what the failure condition is.

As for things that use catch_unwind it's basically request handlers in web frameworks etc.

mitsuhiko avatar Dec 04 '25 21:12 mitsuhiko

@MatrixDev I apologize in advance if I've misunderstood your comment but I'm not sure how you're accounting for people who understand poisoning, always use lock().unwrap(), and want the resulting panic propagation behavior. That is essentially what the current docs explicitly say to do (and it matches what a lot of folks I know do).


I don't think it's controversial for the Rust standard library to provide both types of mutexes. I understand the decision point to be which one is std::sync::Mutex. This affects two things: (1) who has to migrate a bunch of code if/when the breaking change happens, and (2) what behavior people get when they're not paying attention. Right?

In terms of (1): if the default is changed, everyone who currently cares about poisoning has to migrate their code (and has no way to ensure that their dependencies -- who may not wade into this or care at all -- do the same). If the default is not changed, nobody has to migrate any code because people who want non-poisoning mutexes are already using one (or a workaround).

Much has already been said about (2). To me, mutex poisoning feels a lot like borrow checking or array bounds checking. The borrow checker is an annoyance at first and array bounds checking has a runtime cost, but Rust embraces these things because they make programs safer out-of-the-box. In terms of getting rid of mutex poisoning by default: it feels like a step backwards to allow people to introduce this particular class of state corruption bug without them realizing it. I do get that folks have run into the opposite problem where poisoning is the behavior that breaks the program. So the conclusion is that with either choice, a program can become broken-but-alive after a panic. But explicitly producing a clear error is a much, much safer way for the program to fail than to read invalid state and drive on!

davepacheco avatar Dec 05 '25 06:12 davepacheco

Much has already been said about (2). To me, mutex poisoning feels a lot like borrow checking or array bounds checking. The borrow checker is an annoyance at first and array bounds checking has a runtime cost, but Rust embraces these things because they make programs safer out-of-the-box.

I'd argue that poisoning is nothing like borrow checking or array bounds checking, both of which are mandatory safety checks. Poisoning is a best-effort warning system to catch user errors or traps, as I explained above. I think a better analogy here is overflow checking.

The logic errors that poisoning is trying to catch exist whether or not the poisoning mechanism exists. Wrapping in a Poisoning<T> will not automatically fix anything; it only makes the alert louder. This can indeed be better for diagnostics, but I'm concerned that a dedicated Poisoning<T> may give users the incorrect assumption that they are protected by poisoning and so can simply ignore the unwinding path altogether. No, they cannot, and there seem to be many people in the discussion already assuming it. Personally, I treat a mechanism itself giving an incorrect impression on what it solves as a bigger trap, because then everyone would use it incorrectly without even knowing. [^1]

[^1]: If poisoning were guaranteed to work, I'd be happier to have a Poisoning<T>. But there seem to be technical difficulties with its reliability. See #143471

oxalica avatar Dec 05 '25 08:12 oxalica

giving an incorrect impression on what it solves, a bigger trap

It's worth noting that until @purplesyringa (thank you!) adjusted the documentation on std::sync::Mutex in the last six months, those docs are what gave the incorrect impression. For the last ~eleven years I've used Rust the docs read as a much stronger promise. While I have worked with foreign exceptions unwinding through Rust, it wasn't while std::sync was involved. I don't think it's a misunderstanding derived solely from lock() -> Result.

And to continue the overflow analogy, lock() implicitly unwrapping if poisoning is detected is not too dissimilar from panic-in-debug/wrap-in-release, so I still think that that's a better API for everyone. Particularly if nonpoison is also available.

but I'm concerned that a dedicated Poisoning<T> may give users the incorrect assumption

part of the solution here does have to be "write down the contract and trust users will read it" though - I'd say that std::sync::nonpoison may give users an incorrect impression that if the mutex is unlocked then their critical sections ran to completion and the protected data is in a known state. Additionally, "Poison<T> works as long as these conditions are upheld" or "std::sync::Mutex will poison on unwrap as long as ..." could be very useful for making statements about soundness, but this seems to be a stronger stance on those primitives than Rust might take right now.

iximeow avatar Dec 05 '25 10:12 iximeow

In terms of (1): if the default is changed, everyone who currently cares about poisoning has to migrate their code (and has no way to ensure that their dependencies -- who may not wade into this or care at all -- do the same).

I've seen dependencies mentioned a couple of times in this discussion, but I'm not sure I understand what people are talking about. Are you talking about dependencies that expose the standard library mutex API as part of their public API? If so, do you have any examples of this? If not, why do you care if the dependencies use lock poisoning internally?

FeldrinH avatar Dec 05 '25 13:12 FeldrinH

In terms of (1): if the default is changed, everyone who currently cares about poisoning has to migrate their code (and has no way to ensure that their dependencies -- who may not wade into this or care at all -- do the same).

I've seen dependencies mentioned a couple of times in this discussion, but I'm not sure I understand what people are talking about. Are you talking about dependencies that expose the standard library mutex API as part of their public API? If so, do you have any examples of this? If not, why do you care if the dependencies use lock poisoning internally?

I'm talking about internal mutexes. If a dependency intentionally chooses non-poisoning mutexes, that's one thing. But if the dependency author doesn't care (as we've been assuming most Rust developers don't), and they originally wrote code using lock().unwrap(), then they do the easy thing across an edition change (drop the unwrap()), then I do care that it gets switched from using poisoning to not using poisoning. That's for all the reasons stated in many places in this thread: in my view, poisoning helps constrain the damage that can arise from bugs that cause a program to panic. My dependencies have bugs, too.

In this comment on std switching to non-poisoning, part of the discussion was "we are careful to maintain invariants when we may panic while holding a lock". The author who doesn't care about any of this has not been similarly careful.

davepacheco avatar Dec 05 '25 14:12 davepacheco

I'm talking about internal mutexes. If a dependency intentionally chooses non-poisoning mutexes, that's one thing. But if the dependency author doesn't care (as we've been assuming most Rust developers don't), and they originally wrote code using lock().unwrap(), then they do the easy thing across an edition change (drop the unwrap()), then I do care that it gets switched from using poisoning to not using poisoning.

I don't know. If I trust someone enough to use their dependency, then I trust them to use the correct kind of mutex for their use case. If I don't trust someone to use the correct kind of mutex, then I don't trust them enough to use their dependency.

FeldrinH avatar Dec 06 '25 00:12 FeldrinH

I'm talking about internal mutexes. If a dependency intentionally chooses non-poisoning mutexes, that's one thing. But if the dependency author doesn't care (as we've been assuming most Rust developers don't), and they originally wrote code using lock().unwrap(), then they do the easy thing across an edition change (drop the unwrap()), then I do care that it gets switched from using poisoning to not using poisoning.

I don't know. If I trust someone enough to use their dependency, then I trust them to use the correct kind of mutex for their use case. If I don't trust someone to use the correct kind of mutex, then I don't trust them enough to use their dependency.

This seems inconsistent with the rest of this discussion. If we trust dependency authors to use the correct kind of mutex for their use case, does it not follow that anybody using std::sync::Mutex has determined that a poisoning mutex is the correct mutex for their use case? Then it would make no sense to change std::sync::Mutex to be non-poisoning, as that could only break people, not help anybody.

davepacheco avatar Dec 06 '25 01:12 davepacheco

If we trust dependency authors to use the correct kind of mutex for their use case, does it not follow that anybody using std::sync::Mutex has determined that a poisoning mutex is the correct mutex for their use case? Then it would make no sense to change std::sync::Mutex to be non-poisoning, as that could only break people, not help anybody.

As of right now sync::nonpoison::Mutex isn't even stable, so I suspect most people are currently using the poisoning mutex simply because it is the only option in the standard library on stable Rust. I don't think it's possible to make these kinds of conclusions just yet.

FeldrinH avatar Dec 06 '25 02:12 FeldrinH

Worth noting that stable has access to non-poisoning locks: parking_lot's Mutex/RwLock unlock on panic.

DianaNites avatar Dec 06 '25 02:12 DianaNites

That's not a fair comparison, parking_lot is not std. People may choose Mutex over parking_lot Mutex just because they do not want to add a dependency or even because they never learned about the difference on poisoning outside the std (and simply use whatever std gives them.)

slanterns avatar Dec 06 '25 16:12 slanterns

This is a terrible idea. Non-poisoning locks that provide &mut to their data are UNSOUND relative to preserving program invariants.

Consider the classic case of the bank account transfer:

fn transfer(bank: &mut Bank, from: Account, to: Account, amount: Num) {
  bank.amounts[from] -= amount;
  call_something();
  bank.amounts[to] += amount;
}

In normal code, this method guarantees that the total amount in the bank accounts will be preserved, because that obviously happens if call_something does not panic, and if it does panic, then bank is guaranteed to be no longer accessible.

But with non-poisoning mutexes giving &mut (or the other misfeature called AssertUnwindSafe, as well as into_inner() for poisoning mutex errors) then it's possible that call_something() panics, but the bank is still accessible because it was stored in a non-poisoning mutex!
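
A runnable variant of the transfer example, as a sketch under stated assumptions: the `fail` flag stands in for call_something() panicking, accounts are plain indices, and `into_inner()` on the PoisonError stands in for what a non-poisoning mutex would allow.

```rust
use std::panic::catch_unwind;
use std::sync::Mutex;

struct Bank {
    amounts: Vec<i64>,
}

/// `fail` is a stand-in for `call_something()` panicking mid-transfer.
fn transfer(bank: &mut Bank, from: usize, to: usize, amount: i64, fail: bool) {
    bank.amounts[from] -= amount;
    if fail {
        panic!("call_something failed");
    }
    bank.amounts[to] += amount;
}

/// Returns the bank total observed after a transfer panics halfway.
fn total_after_failed_transfer() -> i64 {
    let bank = Mutex::new(Bank { amounts: vec![100, 100] });

    let _ = catch_unwind(|| {
        let mut guard = bank.lock().unwrap();
        transfer(&mut guard, 0, 1, 30, true);
    });

    // Bypassing the poison flag exposes the half-finished transfer.
    let guard = bank.lock().unwrap_or_else(|e| e.into_inner());
    let total: i64 = guard.amounts.iter().sum();
    total
}

fn main() {
    // The bank started with a total of 200.
    println!("total after failed transfer: {}", total_after_failed_transfer());
}
```

With a plain `.lock().unwrap()` in place of the `into_inner()` line, the second lock would panic instead of returning the inconsistent total.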

Instead, non-poisoning mutexes methods giving &mut and all those misfeatures need to be deprecated and linted against, and eventually removed. The non-poisoning mutex code itself is still useful, but only after removing the methods giving out the &muts.

If performance optimization is desired, then a mutex that doesn't unlock on panic, thus safely resulting in a deadlock on any future lock taking attempt, is the correct solution since it both removes the poisoning overhead and preserves soundness relative to preserving invariants.
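
As a rough illustration of "a mutex that doesn't unlock on panic", here is a sketch built on top of std::sync::Mutex. `StickyGuard` is a hypothetical wrapper, not an existing type: it leaks the inner guard when dropped during unwinding, so the lock stays held. `try_lock` is used for the check so the example itself does not deadlock.

```rust
use std::ops::{Deref, DerefMut};
use std::panic::catch_unwind;
use std::sync::{Mutex, MutexGuard, TryLockError};

/// Hypothetical guard wrapper: leaks the inner guard if dropped mid-panic.
struct StickyGuard<'a, T>(Option<MutexGuard<'a, T>>);

impl<T> Deref for StickyGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.0.as_deref().unwrap()
    }
}

impl<T> DerefMut for StickyGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        self.0.as_deref_mut().unwrap()
    }
}

impl<T> Drop for StickyGuard<'_, T> {
    fn drop(&mut self) {
        if std::thread::panicking() {
            // Forgetting the guard skips its destructor: never unlocked.
            std::mem::forget(self.0.take());
        }
        // Otherwise the inner guard drops normally and unlocks.
    }
}

/// Returns true if the mutex is still held after a panic mid-update.
fn still_locked_after_panic() -> bool {
    let m = Mutex::new(0);
    let _ = catch_unwind(|| {
        let mut g = StickyGuard(Some(m.lock().unwrap()));
        *g += 1;
        panic!("mid-update");
    });
    let locked = matches!(m.try_lock(), Err(TryLockError::WouldBlock));
    locked
}

fn main() {
    println!("still locked: {}", still_locked_after_panic());
}
```

A blocking `lock()` on `m` after the panic would wait forever, which is the deadlock behavior described above.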

If making multithreaded programs more resilient is desired, then the proper solution is to avoid holding a shared Mutex over things that may panic. If that's not possible, then a non-poisoning Mutex that only provides an & to its inner data can be used for the synchronization, and then a nested poisoning mutex or RefCell can be locked for the actual mutations. This will not prevent the race condition, but the danger will be clear, because the code will have a lock around individual fine-grained mutations, making it clear that a panic in the middle is something to consider.

So the only non-poisoning mutexes that are sound with respect to preserving program invariants are those that only provide an & and not an &mut to their inner data, and due to that weaker API they cannot replace the current mutexes.

However, the poisoning mutexes can be reimplemented in terms of non-poisoning mutexes giving & to their contents plus a RefCell inside (which must be changed to poison on panic) to provide the poisoning inside.

lyphyser avatar Dec 08 '25 12:12 lyphyser

You’d also need to remove scoped threads:

fn main() {
    let mut x = 0;
    std::thread::scope(|s| {
        _ = s.spawn(|| {
            x += 1;
            panic!();
            x -= 1;
        }).join();
    });
    dbg!(x); // 1
}

Rust doesn’t really track unwind safety. UnwindSafe is inherently just a lint, even if you remove AssertUnwindSafe. Actual unwind safety would be a whole separate project that would require much more than mutex poisoning.

GoldsteinE avatar Dec 08 '25 13:12 GoldsteinE

@lyphyser I believe https://github.com/rust-lang/rust/issues/149359#issuecomment-3609527026 also addresses this (in addition to what @GoldsteinE said). So the only way to make everything perfectly sound is to just remove a bunch of useful things (which I'm sure we do not want) or bake it into the type system somehow ☹️.

connortsui20 avatar Dec 08 '25 13:12 connortsui20

You’d also need to remove scoped threads:

fn main() {
    let mut x = 0;
    std::thread::scope(|s| {
        _ = s.spawn(|| {
            x += 1;
            panic!();
            x -= 1;
        }).join();
    });
    dbg!(x); // 1
}

Rust doesn’t really track unwind safety. UnwindSafe is inherently just a lint, even if you remove AssertUnwindSafe. Actual unwind safety would be a whole separate project that would require much more than mutex poisoning.

Good point. Scoped threads also need to be fixed (via new methods + deprecation) so that std::thread::scope panics at the end if any of the scoped threads panicked.
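
For reference, a sketch of how std::thread::scope behaves today when the handle is not joined manually: as I understand the std docs, the scope joins the thread itself and panics if it panicked. It is the explicit .join() in the quoted example that swallows the failure. The function name is made up for the example, and catch_unwind is used only to observe the propagated panic.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::thread;

/// Returns true if the scope propagated the scoped thread's panic and the
/// half-finished update to `x` is still what the caller sees afterwards.
fn scope_propagates_panic() -> bool {
    let mut x = 0;

    let result = catch_unwind(AssertUnwindSafe(|| {
        thread::scope(|s| {
            // Not joined manually: `scope` joins it and re-raises the failure.
            s.spawn(|| {
                x += 1;
                panic!("scoped thread failed");
            });
        });
    }));

    result.is_err() && x == 1
}

fn main() {
    println!("panic propagated: {}", scope_propagates_panic());
}
```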

catch_unwind can be used to prevent the propagation, and then unwind safety will prevent that construct.

It seems to me that Rust is already mostly unwind safe and making it fully is just a matter of fixing relatively few issues like those with suitable deprecations, and obviously not doing non-poisoning &mut mutexes which goes in the opposite direction.

lyphyser avatar Dec 08 '25 14:12 lyphyser

In fact, the whole point of having panics vs Result and ? is not having to worry about exception safety everywhere like in C++, which is a source of program incorrectness, but only where a ? symbol is present in the code.

If one needs to worry about panics breaking invariants everywhere, then there is no point in having ? to mark possible exception points, and one might as well use implicit exception propagation for all error cases like C++, via either implicit ? on Result or exception unwinding (which are semantically the same, but have a different performance tradeoff).

lyphyser avatar Dec 08 '25 14:12 lyphyser

It seems to me that Rust is already mostly unwind safe and making it fully is just a matter of fixing relatively few issues like those with suitable deprecations, and obviously not doing non-poisoning &mut mutexes which goes in the opposite direction.

This would be a very significant, semver-breaking change. You are also underestimating the complexity: if you want to keep scoped threads or catch_unwind at least in some way, you need effects, which is non-trivial to integrate into an existing language. It is unclear to me why you believe changing so much about Rust is a better option than making mutexes consistent with the rest of the system that hasn't been changed since 1.0 and has always allowed non-poisoning behaviors. It might be a good idea for a new programming language or Rust 2.0, but I don't see a path from the current Rust to unwind-safe Rust.

purplesyringa avatar Dec 08 '25 15:12 purplesyringa

also FYI: https://github.com/rust-lang/rust/issues/134645#issuecomment-3590942770

slanterns avatar Dec 08 '25 16:12 slanterns