Description

This is probably just me not understanding how things are supposed to work.

I have created a user-defined source based on the async source example. It sets up a REST API that accepts requests, executes database queries, and generates Numaflow messages for the pipeline to process.

I am not sure what the read_handler function should return when there aren't any results to pass on (this could be just because we are waiting for another REST request).

I tried just breaking out of the iterator, but that resulted in a "Readiness probe" failure, so K8s restarts the pod.
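
For illustration, the setup described above might look roughly like the following sketch. It is not the pynumaflow API: `PendingResults`, `submit`, and `read_batch` are hypothetical names, and a plain `asyncio.Queue` stands in for whatever the REST handler uses to hand rows to the source. It shows a single read call that waits a bounded time for rows and then returns whatever it has, possibly nothing, rather than blocking indefinitely. Whether Numaflow accepts such an empty read without failing the readiness probe is exactly the open question here.

```python
import asyncio


class PendingResults:
    """Hypothetical buffer between a REST handler and a source's read call."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, rows: list) -> None:
        # Called by the (hypothetical) REST handler after a database query.
        for row in rows:
            await self._queue.put(row)

    async def read_batch(self, num_records: int, timeout_s: float) -> list:
        # Collect up to num_records rows, waiting at most timeout_s in total.
        loop = asyncio.get_running_loop()
        deadline = loop.time() + timeout_s
        batch = []
        while len(batch) < num_records:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(self._queue.get(), remaining))
            except asyncio.TimeoutError:
                # Nothing arrived in time: hand back what we have, even if empty.
                break
        return batch


async def demo() -> tuple:
    pending = PendingResults()
    # No REST request has arrived yet, so this read comes back empty.
    empty = await pending.read_batch(num_records=5, timeout_s=0.05)
    await pending.submit([b"row-1", b"row-2"])
    # Two rows are waiting, fewer than the five requested.
    filled = await pending.read_batch(num_records=5, timeout_s=0.05)
    return empty, filled


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real source, the rows from `read_batch` would be wrapped in Numaflow messages inside read_handler; that part is left out because it depends on the SDK version.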

To Reproduce

Steps to reproduce the behavior:

Modify the async-source example.py so that the read_handler returns after some number of messages, rather than running forever.

Quick and dirty:

From:

for x in range(datum.num_records):

To:

for x in range(self.read_idx, datum.num_records):

Build the image
Deploy the pipeline
Monitor the deployment (k9s)
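
To show why the modified loop in the steps above makes the handler run dry, here is a small stand-in that models only the loop bounds and the `read_idx` counter. It assumes that `read_idx` is incremented once per emitted message; `FakeSource` and the plain integer payloads are hypothetical, and no pynumaflow types are used.

```python
import asyncio


class FakeSource:
    """Stand-in for the example source; only the loop bounds are modelled."""

    def __init__(self) -> None:
        self.read_idx = 0

    async def read_handler(self, num_records: int):
        # Modified loop: starts at read_idx instead of 0.
        for _ in range(self.read_idx, num_records):
            yield self.read_idx
            self.read_idx += 1


async def demo() -> list:
    source = FakeSource()
    batches = []
    # Call the handler three times, as the platform would on successive reads.
    for _ in range(3):
        batches.append([m async for m in source.read_handler(num_records=5)])
    return batches


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With `num_records=5`, the first call emits five values and every later call returns immediately without emitting anything.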

Expected behavior

I thought that the source would stop producing messages so the pipeline would flush all the queues and then wait for more work (which will never come in this test case, but could in the REST API scenario described above).

Environment

Kubernetes: v1.27.6+k3s1
Numaflow: quay.io/numaproj/numaflow:v1.1.1
Numalogic: unknown (please advise where I might find this information)
Numaflow-python: 0.6.0

Dec 27 '23 16:12 tolmanam

Is the expected behavior for the read_handler to run forever and just block while there is no data to pass along? I always worry about waiting for things indefinitely.

Dec 27 '23 16:12 tolmanam

FWIW -

I also see this same "Readiness probe failed" if the read_handler takes too long to respond.

To reproduce this, rather than limiting the number of responses as described above, you can just add a long sleep (longer than the readiness probe allows) inside the loop.

Dec 27 '23 17:12 tolmanam

Hey @tolmanam, I was trying to replicate the issue with the steps you provided and had a quick question: were you seeing a pipeline deletion due to pods autoscaling down to 0 because of no traffic, or did you see a crash on your end?

Jan 12 '24 19:01 kohlisid

I believe it was Kubernetes killing the pod because it failed the "Readiness probe".

Consider the use case that you want to run a database query that generates X number of messages every 10 minutes. You wouldn't want autoscaling to drop the vertex.

FWIW - I swapped out the user-defined source for the built-in HTTP source, and it runs happily without adding any messages to the pipeline until it receives a POST. So the behavior I would like is compatible with Numaflow; I just don't appear to know how to build a user-defined source that does the same.

Jan 14 '24 09:01 tolmanam

User Defined Async Source - "Readiness probe failed" when there are no more messages