flame icon indicating copy to clipboard operation
flame copied to clipboard

Make the `FLAME.Pool` handle backend.init/1 errors

Open mentels opened this issue 1 year ago • 0 comments

Even though the FLAME.Backend allows for the backend.init/1 to return an error:

@callback init(opts :: Keyword.t()) :: {:ok, state :: term()} | {:error, term()}

The FLAME.Pool would not handle it and get into an infinite restart loop. This behavior can be tested with FlameTest.Application.call in the https://github.com/mentels/flame_test that uses a custom backend which init/1 returns {:error, just_fail}.

If the pool is configured with min: 0, after the call we will keep getting:

iex(1)> FlameTest.Application.call
:ok

13:35:17.334 [error] Task #PID<0.394.0> started from FlameTest.Pool terminating
** (MatchError) no match of right hand side value: {:error, :just_fail}
    (flame 0.5.1) lib/flame/pool.ex:696: FLAME.Pool.start_child_runner/2
    (elixir 1.17.2) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
    (elixir 1.17.2) lib/task/supervised.ex:36: Task.Supervised.reply/4
Function: #Function<1.81824075/0 in FLAME.Pool.async_boot_runner/1>
    Args: []

13:35:18.353 [error] Task #PID<0.396.0> started from FlameTest.Pool terminating
** (MatchError) no match of right hand side value: {:error, :just_fail}
    (flame 0.5.1) lib/flame/pool.ex:696: FLAME.Pool.start_child_runner/2
    (elixir 1.17.2) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
    (elixir 1.17.2) lib/task/supervised.ex:36: Task.Supervised.reply/4
Function: #Function<1.81824075/0 in FLAME.Pool.async_boot_runner/1>
    Args: []

13:35:19.354 [error] Task #PID<0.398.0> started from FlameTest.Pool terminating
** (MatchError) no match of right hand side value: {:error, :just_fail}
    (flame 0.5.1) lib/flame/pool.ex:696: FLAME.Pool.start_child_runner/2
    (elixir 1.17.2) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
    (elixir 1.17.2) lib/task/supervised.ex:36: Task.Supervised.reply/4
Function: #Function<1.81824075/0 in FLAME.Pool.async_boot_runner/1>
    Args: []
...

While if the runners are started at the application start-up (e.g.: min: 1), the application will not start at all, reporting an error:

13:37:00.209 [notice] Application flame exited: :stopped
** (Mix) Could not start application flame_test: FlameTest.Application.start(:normal, []) returned an error: shutdown: failed to start child: {FLAME.Pool, FlameTest.Pool}
    ** (EXIT) shutdown: failed to start child: {FLAME.Pool, FlameTest.Pool}
        ** (EXIT) an exception was raised:
            ** (MatchError) no match of right hand side value: {:error, :just_fail}
                (flame 0.5.1) lib/flame/pool.ex:696: FLAME.Pool.start_child_runner/2
                (elixir 1.17.2) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
                (elixir 1.17.2) lib/task/supervised.ex:36: Task.Supervised.reply/4

This commit attempts to fix the error by:

  • preventing the MatchError resulting from the {:ok, pid} = DynamicSupervisor.start_child(state.runner_sup, spec) in the start_child_runner/2
  • adding more explicit error handling in the start_child_runner/2
  • catch an exit signal possibly resulting from the DynamicSupervisor.start_child/2 or Runner.remote_boot/2
    • the previous catch clause with {:exit, reason} was not catching the exit signal from the DynamicSupervisor.start_child/2 ```elixir iex(1)> try do throw({:exit, :ha}) catch {:exit, :ha} -> :haha end :haha iex(2)> try do exit(:ha) catch {:exit, :ha} -> :haha end ** (exit) :ha iex:2: (file) iex(2)> ````
  • adding a new handle_info/2 clause to handle the {:error, reason} Task result
  • adding new clauses to the fun in boot_runners/2
    • the previous {:exit, reason} clause would never match since the Pool GenServer is not trapping exists and thus the tasks spawn with Task.async_stream/3 would never return {:exit, reason} as mentioned in the https://hexdocs.pm/elixir/Task.html#async_stream/5

I think that the Runner.remote_boot/2 might have the same problem but I will stop here and see if this work lands anywhere - maybe I'm missing the point :)

mentels avatar Oct 31 '24 12:10 mentels