cats-effect icon indicating copy to clipboard operation
cats-effect copied to clipboard

Allow overriding how fatal errors are printed

Open fredfp opened this issue 1 year ago • 7 comments

The handling of some uncaught fatal errors can be customized by overriding cats.effect.IOApp.reportFailure.

However, such customization is currently not possible in many cases (all usages of java.lang.Throwable.printStackTrace(), for example here).

Would it be possible to handle similar situations like in the fs2 reactive-streams integration here i.e., by using (if defined) java thread's default UncaughtExceptionHandler?

This would allow one to redirect uncaught errors to a custom logger, so that it gets properly ingested in a logging stack/system.

fredfp avatar Feb 06 '24 14:02 fredfp

I think there are only a couple areas (at the edge of the world, basically) where we forcibly print the stack trace. In most cases, we rely on ExecutionContext#reportFailure, which defaults to delegating to IOApp#reportFailure, which can be overridden to whatever you want. I think it would be totally reasonable to swap these last printStackTrace()s for reportFailure on the IORuntime#compute executor, which would link all of the machinery back together again.

djspiewak avatar Feb 06 '24 15:02 djspiewak

Usual failure reporter (i.e., ExecutionContext#reportFailure) have the following signature: Throwable => Unit, whereas for IOApp#reportFailure it is Throwable => IO[Unit]. Did you have that in mind in your comment above?

I also wouldn't rely on (or require) an IOApp instance to be available and in scope.

Wouldn't it be better to do it the other way round? i.e., rely on an UncaughtExceptionHandler, and call it from everywhere (via ExecutionContext#reportFailure) and from IOApp#reportFailure's default implementation?

fredfp avatar Feb 06 '24 18:02 fredfp

Usual failure reporter (i.e., ExecutionContext#reportFailure) have the following signature: Throwable => Unit, whereas for IOApp#reportFailure it is Throwable => IO[Unit]. Did you have that in mind in your comment above?

I was thinking of runtime.compute.reportFailure, which has signature Throwable => Unit. That should work for both of the IO call sites which print stack traces. Dispatcher might be another interesting question, but in that case we do actually have a running IOApp somewhere, so we can probably work around it by just submitting things back to the worker via Dispatcher#unsafeRunAndForget.

Wouldn't it be better to do it the other way round? i.e., rely on an UncaughtExceptionHandler, and call it from everywhere (via ExecutionContext#reportFailure) and from IOApp#reportFailure's default implementation?

What thread though? One of the problems here is the fact that this callback will be invoked on a different thread than you might be expecting, and likely one which is under the control of the IORuntime not under user control. In that kind of scenario it simply isn't worth delegating through the Thread machinery when the ExecutionContext machinery is already in hand and more efficient.

djspiewak avatar Feb 06 '24 21:02 djspiewak

I've seen quite a few cases where an IORuntime, and Dispatchers are used but there's no IOApp anywhere. With that in mind, would it make sense to assume we'll rely on failure reporters of IORuntime (i.e., runtime.compute.reportFailure), and make sure those are configurable somehow (when an IOApp is used, IOApp#reportFailure is used to configure the failure reporter of the IORuntime somehow)?

fredfp avatar Feb 14 '24 10:02 fredfp

The reportFailure parameter of the IORuntime constructor takes a function Throwable => Unit. This is what IOApp is wiring up to the IOApp#reportFailure method.

I think that all that really needs to be changed here is the hard-coded printStackTrace should be replaced with a runtime.compute.reportFailure. We should probably also check that Dispatcher has the equivalent change.

djspiewak avatar Feb 14 '24 16:02 djspiewak

Please have a look at this proposal, it is minimal in that:

  • it assumes usage of an IOApp (i.e., didn't check yet if the same could be achieved for those creating an IORuntime oustside of an IOApp)
  • doesn't work for anything outside of the IOApp.run (especially error handling related to what's pulled from cats.effect.IOApp.queue

There are also a few usages of System.err for error reporting, I'd be tempted to find a similar way to allow redirecting them to another facility, thoughts on that?

fredfp avatar Mar 19 '24 12:03 fredfp

Also, the proposed fix would need to be slightly adapted if https://github.com/typelevel/cats-effect/pull/4010 is accepted

fredfp avatar Mar 19 '24 12:03 fredfp