[Bug]: GCS to PubSub losing messages

Open Wuerike opened this issue 1 year ago • 0 comments

Related Template(s)

GCS_Text_to_Cloud_PubSub

Template Version

2024-09-19-00_rc00

What happened?

I'm using the GCS to PubSub batch template to publish and consume messages from JSONL files.

I've noticed I'm losing messages in the process. I'm not sure of the ratio of missing messages, but I have to run the job two to three times to successfully receive a few thousand messages (something like 10k to 20k).

I'm worried because I'll have to run this process for billions of messages, and I definitely would not like to process that amount of data multiple times to make sure I won't lose anything.

Is there a known bug about this? Or is there something I can do to force the job to be more precise?

Relevant log output

No response

Wuerike, Sep 26 '24 00:09