
Add support for EXTRACT syntax and convert it to the appropriate Pinot expression

Open tanmesh opened this issue 3 years ago • 4 comments

Description

This PR adds support for the EXTRACT syntax and converts it to the corresponding Pinot expression.

This PR will solve the following issue -- https://github.com/apache/pinot/issues/9075

Testing

Verified the desired behavior locally by running CalciteSqlCompilerTest

tanmesh avatar Aug 09 '22 16:08 tanmesh

Thanks, @Jackie-Jiang, for providing context. Very helpful.

I am trying to understand how ExtractTransformFunction, as you suggested, will be used in the overall flow (write path and read path). A few follow-up questions from my investigation:

  • Do we store the transformed data (after applying the transformations defined here to the raw data) in segments, or do we store only raw data in segments and apply the transformations on the fly during Pinot query execution?

tanmesh avatar Aug 15 '22 17:08 tanmesh

ExtractTransformFunction will only be applied at query time. If we want to support ingestion-time transforms, we need to add a ScalarFunction for extract (you may take a look at the DateTimeFunctions class).

To support this feature, we need two sets of changes:

  1. Calcite parser change to support extract syntax and parse the query into extract(field, expression)
  2. Add ExtractTransformFunction or ScalarFunction to support query/ingestion time transform
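
As a rough illustration of what part 2 has to compute, here is a minimal, self-contained sketch of evaluating extract(field, expression). It is not Pinot's implementation: the class name ExtractSketch, the set of supported fields, and the assumption that the expression evaluates to an epoch-millisecond timestamp interpreted in UTC are all hypothetical and chosen only for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Locale;

public class ExtractSketch {
    // Hypothetical helper, not a Pinot API: evaluates extract(field, expression)
    // where the expression is assumed to be an epoch-millisecond timestamp in UTC.
    public static int extract(String field, long epochMillis) {
        ZonedDateTime ts = Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
        switch (field.toUpperCase(Locale.ROOT)) {
            case "YEAR":
                return ts.getYear();
            case "MONTH":
                return ts.getMonthValue();
            case "DAY":
                return ts.getDayOfMonth();
            case "HOUR":
                return ts.getHour();
            case "MINUTE":
                return ts.getMinute();
            case "SECOND":
                return ts.getSecond();
            default:
                // Fields outside this illustrative set are rejected explicitly.
                throw new IllegalArgumentException("Unsupported field: " + field);
        }
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2022-08-15T17:08:30Z").toEpochMilli();
        System.out.println(extract("YEAR", ts));   // prints 2022
        System.out.println(extract("MONTH", ts));  // prints 8
        System.out.println(extract("HOUR", ts));   // prints 17
    }
}
```

Under these assumptions, a query fragment such as EXTRACT(YEAR FROM ts) would first be parsed into extract(YEAR, ts) (part 1), and a function along these lines would then be evaluated per row, either at query time or at ingestion time (part 2).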

Jackie-Jiang avatar Aug 15 '22 17:08 Jackie-Jiang

Hey @Jackie-Jiang, I have completed part 1 of the task.

For part 2: could you please point me to where DateTimeTransformFunction (or the other query-time transform functions defined here) is applied? I was able to locate TransformFunctionFactory, but couldn't find where the transformations are actually applied.

tanmesh avatar Aug 17 '22 18:08 tanmesh

PTAL @Jackie-Jiang when you get a chance. Thanks!

tanmesh avatar Aug 19 '22 18:08 tanmesh