
BigQuery to Elasticsearch Dataflow template indexes all columns as strings in Elasticsearch

Open elangoshifts opened this issue 4 years ago • 2 comments

We supply a custom query to the Dataflow template; the query is a nested SELECT statement.

select col1,col2,col3 from tablename;

When we run the query in the BigQuery console, we see integer values, and the column has the INTEGER data type.

After we run the Dataflow template, the attributes in Elasticsearch show up as string values.

Does the Dataflow template handle the data types of custom queries?

elangoshifts, Nov 29 '21 08:11

You probably want to create the index and define the mappings ahead of time. I'm going to be testing this myself this week.
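A minimal sketch of what defining the mappings ahead of time could look like, using only the Python standard library. The cluster URL, the index name `bq-export`, and the column types are assumptions chosen for illustration (the thread only says the columns are integers in BigQuery); the request body follows the Elasticsearch create-index API (`PUT /<index>` with a `mappings.properties` object). The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical values: adjust to your cluster, index, and column types.
ES_URL = "http://localhost:9200"
INDEX_NAME = "bq-export"


def build_create_index_request(es_url, index_name, column_types):
    """Build (but do not send) a PUT /<index> request with explicit mappings."""
    body = {
        "mappings": {
            "properties": {
                name: {"type": es_type} for name, es_type in column_types.items()
            }
        }
    }
    return urllib.request.Request(
        url=f"{es_url}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed types: BigQuery INTEGER is 64-bit, so "long" is the natural match.
request = build_create_index_request(
    ES_URL, INDEX_NAME, {"col1": "long", "col2": "long", "col3": "long"}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually create the index against a running cluster:
# urllib.request.urlopen(request)
```

With an explicit `long` mapping in place before the template runs, Elasticsearch should index the fields as numbers even if the values arrive quoted, because numeric fields coerce numeric strings by default; the stored `_source` would still show the value as it was sent.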

shadiramadan, Feb 02 '22 06:02

I can confirm that even with the newest version of Apache Beam (SDK for Java 2.39.0), all types are converted to strings in Elasticsearch. This is not ideal for the getting-started experience, IMO. I assume that the Beam implementation for Elasticsearch puts all values in quotes and disregards their types. Any chance we could fix this behavior?
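To illustrate why quoting would produce this result (assuming the quoting hypothesis above is right, which this thread does not confirm), here is a small sketch. `default_dynamic_type` is a hypothetical helper that mimics a simplified subset of Elasticsearch's default dynamic mapping rules for scalar JSON values; it ignores date detection, objects, arrays, and nulls.

```python
import json


def default_dynamic_type(value):
    """Simplified guess at the field type dynamic mapping would pick by default."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        # Numeric detection is off by default, so "42" is treated as text.
        return "text"
    raise TypeError(f"unsupported value in this sketch: {value!r}")


# Made-up row for illustration; the column names come from the query above.
typed_row = {"col1": 42, "col2": 7}
quoted_row = {name: str(value) for name, value in typed_row.items()}

for label, row in (("typed", typed_row), ("quoted", quoted_row)):
    inferred = {name: default_dynamic_type(value) for name, value in row.items()}
    print(label, json.dumps(row), inferred)
```

If the documents reach Elasticsearch in the quoted form and no mapping exists yet, the fields end up as text rather than numbers, which matches the behavior reported here.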

MarxDimitri, Aug 23 '22 13:08