Matching service optimization - Do not persist sticky workflow task
Sticky workflow tasks usually have a short timeout, on the order of seconds.
When a sticky workflow task cannot be sync-matched, the matching service currently persists the task to the DB and waits for the SDK worker with the local cache to pick it up.
If a sticky workflow task cannot be sync-matched, it (sometimes? usually?) means the SDK worker is unavailable, and the history service will time out the task a few seconds later. When that happens, DB IOPS are wasted: the matching service persists the task and may also read it back.
The matching service should not persist sticky workflow tasks. Possible approaches:
- when unable to sync-match, return an error and let the history service retry a few times before giving up
- or use a longer sync-match timeout and, if still unable to sync-match, give up immediately
- change the history service to dispatch the task preferably to the SDK queue (the SDK worker that has the local cache), with the normal queue as the default fallback; when sync match times out on the SDK queue, put the task on the normal queue
NOTE: the history service will time out the sticky workflow task and create a normal workflow task, so the workflow will NOT get stuck.