eventing icon indicating copy to clipboard operation
eventing copied to clipboard

Avoid event loops when Triggers is configured to a Sink that returns new events

Open odacremolbap opened this issue 3 years ago • 4 comments

Problem

A sink receiving a CloudEvent has the choice of replying with a new event, which enables scenarios that query external systems or do transformations. It is expected that the new event type will be different than the incoming type:

In

type: math.operation
id: 1234abcd
data: "1 + 1 = 2"

Out

type: math.operation.english
id: 1234abcd
data: "one plus one equals two"

When connected to a Broker a Trigger for this sink should look like this to prevent the sink from consuming its own generated events :

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: englishfy
spec:
  broker: default
  filter:
    attributes:
      type: math.operation
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: englishfy

A modified version of that Trigger that lacks the filter element would lead to event loops and most probably to high CPU consumption at the sink and broker dispatcher.

An extra option at filters (set by default) could prevent users from creating those loops inadvertently.

If the solution were based on the configured filter it could go:

spec:
  filter:
    allowRepliesMatchingFilter: {true|false}
    attributes:
      type: math.operation

I would prefer a different solution where the Broker is made aware that a returned event should not be delivered back to the sink producing it but could be delivered to other subscribers, which would work even for cases when the Trigger is filter-less, but I think that would be technically challenging.

spec:
  triggerOptions:
     avoidOwnReplies: {true|false}

Persona: Knative user

Exit Criteria Users that forget to add, or are not experienced with defining proper filtering, won't get event loops by default when creating Triggers to Sinks that generate responses.

If the first solution in this post is chosen, and returned events are discarded if they match the configured filters, we should have some clear notification mechanism ( logger.Error ? if we don't come up with something better ) to inform users about the issue.

Time Estimate (optional): NA

Additional context (optional) NA

@evankanderson ^

odacremolbap avatar May 18 '22 17:05 odacremolbap

I worry about avoidOwnReplies still being able to loop if you have two repliers on the same type but different Triggers. Having to explicitly indicate that you've thought about loops feels a bit like a modal "Really delete Thesis.docx?" prompt.

Internally, it should be possible for implementations to do "avoid own" by using a non-CE message attribute to store the UID of the trigger associated with the reply.

evankanderson avatar May 18 '22 17:05 evankanderson

The math example is great, because you might deliberately do recursive delivery if you've framed the math as an RPN expression, and the function resolves one frame and emits a reply.

This might be a good candidate for a 1.1 spec update, but let's try it first.

evankanderson avatar May 18 '22 17:05 evankanderson

IIRC that's the main reason why the broker ingress add/check the TTL CE attribute.

lionelvillard avatar May 23 '22 18:05 lionelvillard

Yes, we implemented a TTL on this, but that behavior seems at least as surprising (your event loops, until it doesn't, plus it's a CE attribute, IIRC, so that it would be echoed if you copy the original event).

I think it's worth discussing whether any of these extensions would be more developer-friendly, even if the answer is "no" -- maybe something to get @csantanapr and the UX WG in to do a user study on if we have the capability (I know it can be hard to run such a study).

evankanderson avatar May 24 '22 21:05 evankanderson

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Aug 23 '22 01:08 github-actions[bot]