
Hive partitioning isn't working: the DD part of YYYY-MM-DD is rendered as day of year (133)

Open brooks-j opened this issue 3 years ago • 1 comments

Related Template(s)

Cloud_PubSub_to_Avro

What happened?

This is using Google's Dataflow.

Here are the relevant excerpts of the Terraform code. This is the template being used:

template_gcs_path = "gs://dataflow-templates/latest/Cloud_PubSub_to_Avro"

And here is the output directory with the date pattern:

outputDirectory = "gs://${var.gcs_path}/${var.topic_name}/dt=YYYY-MM-DD"

Instead of a date such as dt=2022-05-13, I'm seeing this: dt=2022-05-133

Note that the last component is the day of the year (133), not the day of the month (13).
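
To make the symptom concrete, here is a minimal Python sketch that reproduces the same string for May 13, 2022. It only illustrates the day-of-year versus day-of-month mix-up; it is not the template's code. The helper name render_partition is hypothetical, and Python's strftime codes (%j, %d) stand in for whatever pattern letters the template maps DD to internally.

```python
from datetime import date


def render_partition(day: date, day_directive: str) -> str:
    """Render a Hive-style dt= partition value.

    day_directive is a strftime code: "%j" is day of year,
    "%d" is day of month. (Hypothetical helper for illustration.)
    """
    return day.strftime(f"dt=%Y-%m-{day_directive}")


# The date the issue was filed: May 13, 2022.
issue_date = date(2022, 5, 13)

# Day of year: 31 + 28 + 31 + 30 + 13 = 133, matching the reported output.
day_of_year = render_partition(issue_date, "%j")

# Day of month: what a Hive partition on dt is expected to contain.
day_of_month = render_partition(issue_date, "%d")

print(day_of_year)   # dt=2022-05-133
print(day_of_month)  # dt=2022-05-13
```

The three-digit suffix breaks Hive partition discovery because dt=2022-05-133 is not a valid calendar date.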

Beam Version

2.35.0

Relevant log output

No response

brooks-j, May 13 '22 21:05

I ran into the same problem. It seems to be caused by an error in WindowedFilenamePolicy in the common package. I created a PR to fix it.

kwoz, May 25 '22 18:05

Fixed

pranavbhandari24, Nov 28 '22 14:11