fix(starrocks): exp.Unnest transpilation
Fixes #3962
This PR adds support for the following:
- Default initialize
exp.UNNESTto include "unnest" as the default table alias if not specified - Added expression type of
exp.Inline - use arrays_zip to merge multiple Lateral views
- Added some ut to validate unnest transpile
I tried to add the logic of setting default value when there is no table alias, PTAL @georgesittas @VaggelisD
I need to transpile the starrocks sql:
SELECT student, score, unnest FROM tests CROSS JOIN LATERAL UNNEST(scores)
Expected spark sql:
SELECT student, score, unnest FROM tests LATERAL VIEW EXPLODE(scores) unnest AS unnest
I found that when unnest has multiple outputs, it will be transpiled into multiple Lateral view statements (actually incorrect), as shown in the following SQL of starrocks:
SELECT id, t.type, t.scores FROM example_table CROSS JOIN LATERAL unnest(split(type, ";"), scores) AS t(type,scores)
Transpiled into spark, the output is:
SELECT id, t.type, t.scores FROM example_table LATERAL VIEW EXPLODE(SPLIT(type, CONCAT('\\Q', ';'))) t AS type LATERAL VIEW EXPLODE(scores) t AS scores
Incorrect reason:
The output of unnest is the combined value of multiple columns, and LATERAL VIEW will be expanded multiple times and cross joined.
We can use arrays_zip to merge multiple Lateral views.
In fact, our users often use the syntax combination LATERAL VIEW INLINE(arrays_zip(col1, col2)).
table function inline doc as following:
databricks: https://docs.databricks.com/en/sql/language-manual/functions/inline.html spark: https://spark.apache.org/docs/latest/api/sql/#inline hive:https://stackoverflow.com/questions/25088488/use-inlinearraystruct-struct-in-hive
Expected output:
SELECT id, t.type, t.score FROM example_table LATERAL VIEW INLINE(ARRAYS_ZIP(SPLIT(type, CONCAT('\\Q', ';')), scores)) t AS type, score
But function arrays_zip may not be supported by some engines, such as hive. I added a parameter unnest_using_arrays_zip to avoid hive being affected.
I added a commit. Will be open to any suggestions =)
@VaggelisD Added new commit, PTAL
@VaggelisD Good evening, I made some change, please take a look at it when you have time
@georgesittas @VaggelisD Added new commit, PTAL
@georgesittas Agree. Thanks for the previous advice, it helped me a lot =)
Sorry, I cleaned up my sqlglot repository because it was a mess. Wait a minute and I'll make another commit. There won't be any new code since our last commit.