Signal to self error in Workflow code
Expected Behavior
A workflow should be able to signal any workflow using a workflow ID, including itself.
Actual Behavior
Throws a UnknownExternalWorkflowExecution when attempting to signal itself.
Steps to Reproduce the Problem
- Try the following workflow code:
func TestWorkflow(ctx workflow.Context) error {
we := workflow.GetInfo(ctx).WorkflowExecution
err := workflow.SignalExternalWorkflow(ctx, we.ID, we.RunID, "messages", "hello").Get(ctx, nil)
if err != nil {
panic(fmt.Errorf("could not signal external workflow: %v", err))
}
return nil
}
-
Error shows in temporal web history:

-
The worker logs the error:
2020/08/18 21:29:57 ERROR Workflow panic. Namespace default TaskQueue test WorkerID 465773@beast-ubuntu@ WorkflowID TestWorkflow RunID 254db1e2-ebcc-489a-9835-1f5e2e3fb8f8 PanicError could not signal external workflow: UnknownExternalWorkflowExecution PanicStack coroutine root [panic]:
myproject.com/domain/root.TestWorkflow(0x1003500, 0xc000498a00, 0x0, 0x0)
/home/paul/myproject.com/domain/root/testworkflow.go:41 +0x17a
reflect.Value.call(0xd69c40, 0xefbd48, 0x13, 0xeaf2fa, 0x4, 0xc000486de0, 0x1, 0x1, 0xfd2270, 0xe8e420, ...)
/usr/local/go/src/reflect/value.go:460 +0x8ab
reflect.Value.Call(0xd69c40, 0xefbd48, 0x13, 0xc000486de0, 0x1, 0x1, 0xfd2270, 0xe8e420, 0xc0004b4640)
/usr/local/go/src/reflect/value.go:321 +0xb4
go.temporal.io/sdk/internal.(*workflowEnvironmentInterceptor).ExecuteWorkflow(0xc0004b4640, 0x1003500, 0xc000498a00, 0xc0000362c0, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/home/paul/go-sdk/internal/workflow.go:370 +0x2b2
go.temporal.io/sdk/internal.(*workflowExecutor).Execute(0xc000498940, 0x1003500, 0xc000498a00, 0x0, 0xc0004f3738, 0xc68596, 0x0)
/home/paul/go-sdk/internal/internal_worker.go:759 +0x334
go.temporal.io/sdk/internal.(*syncWorkflowDefinition).Execute.func1(0x10036c0, 0xc00009b0e0)
/home/paul/go-sdk/internal/internal_workflow.go:491 +0xf3
^C2020/08/18 21:30:00 INFO Stopped Worker Namespace default TaskQueue test WorkerID 465773@beast-ubuntu@
Specifications
- Version: 0.28.0
- Platform: Helm-chart
This is by design. Server has special code to handle this case: https://github.com/temporalio/temporal/blob/0ae59f62c4d0282aa5b7821706540a7f049473a4/service/history/transferQueueActiveTaskExecutor.go#L456. You may also check: https://github.com/uber/cadence/pull/539 and https://github.com/uber/cadence/issues/531.
You'll have to forgive me as I'm not very familiar with how signals are handled under the hood. IMHO, from an end-user perspective, it would be nice if we could SignalWorkflow + SignalWorkflowWithStart from within workflow code and not have to worry about making sure the target workflow is external. Is there a reason we have a special SignalExternalWorkflow instead of just SignalWorkflow(withStart bool) for workflow context?
Note: I'm aware of the performance implications when sending a signal to self rather than just directly placing in some internal queue. That's a decision I'd like to make for myself as an end-user. If I'm enforcing a signal-driven architecture a top temporal, than I'd like all tasks to be driven by actual Signals that are forced through temporal. Circumventing this feels like a hidden side effect and will not show in monitoring / middleware that I might utilize in the future.
Temporal server does not have support for signal command which tries to send a signal to itself. Implementation of signal command processing is written in a way where it holds a lock on the source execution while it is writing a signal to target execution. If source and target are the same then it results in a deadlock and transfer task processing for that signal command will never succeed. This is the reason server has a protection in place to not allow sending signal to itself. I agree that it is a side effect of particular implementation choice made for processing of signal command. But I don't think this is a bug. I will convert this into a feature request which we could potentially consider to support in the future.
I still think that a workflow sending a signal to itself does not make much sense as it could simplify send a message through a channel. I don't see why it requires a full round trip back from server just to deliver a message on signal channel.
Probably we should improve error message there though. Report something like WorkflowCantSignalItself instead of UnknownExternalWorkflowExecution we currently have.
@samarabbas that makes sense. I think the only reason would be to have clear metrics. For instance, Signals per minute in Grafana would not be a true representation of a signal driven architecture if a significant amount of self signals are not being sensed.
I have things working for now using a local workflow channel, and piping into that from Signal channel and local self signals.
I would appreciate having this as an enhancement, but if it’s a serious anti pattern to the way things are working under the hood I definitely would not make sacrifices for it.
Thanks.
Understood. I have converted this to a future enhancement which we can re-evaluate later. For now looks like you have a reasonable workaround which is working.
Hi @samarabbas, any movement on this? I've just encountered the same problem, and would love to help work towards a solution here