[CALCITE-6437]For druid sql JSON_OBJECT() function results in RUNTIME_FAILURE when querying INFORMATION_SCHEMA.COLUMNS
Fix the druid json_object issue.
Description
jira: CALCITE-6437 druid issue: https://github.com/apache/druid/issues/16356
- Now in druid, we will construct the SqlFunction instance as following:
private static final SqlFunction SQL_FUNCTION = OperatorConversions
.operatorBuilder(FUNCTION_NAME)
.operandTypeChecker(OperandTypes.variadic(SqlOperandCountRanges.from(1)))
.operandTypeInference((callBinding, returnType, operandTypes) -> {
RelDataTypeFactory typeFactory = callBinding.getTypeFactory();
for (int i = 0; i < operandTypes.length; i++) {
if (i % 2 == 0) {
operandTypes[i] = typeFactory.createSqlType(SqlTypeName.VARCHAR);
continue;
}
operandTypes[i] = typeFactory.createTypeWithNullability(
typeFactory.createSqlType(SqlTypeName.ANY),
true
);
}
})
.returnTypeInference(NESTED_RETURN_TYPE_INFERENCE)
.functionCategory(SqlFunctionCategory.SYSTEM)
.build();
- We try to get SqlJsonObjectFunction as following:
public @Nullable RexCallImplementor get(final SqlOperator operator) {
if (operator instanceof SqlUserDefinedFunction) {
org.apache.calcite.schema.Function udf =
((SqlUserDefinedFunction) operator).getFunction();
if (!(udf instanceof ImplementableFunction)) {
throw new IllegalStateException("User defined function " + operator
+ " must implement ImplementableFunction");
}
CallImplementor implementor =
((ImplementableFunction) udf).getImplementor();
return wrapAsRexCallImplementor(implementor);
} else if (operator instanceof SqlTypeConstructorFunction) {
return map.get(SqlStdOperatorTable.ROW);
}
return map.get(operator);
}
- Here is issue, the operator is SqlFunction, but in the map, the key instance is SqlJsonObjectFunction, the type is not equals, so we can naver get from the map
- So i try to fix this in this cr as following in SqlToRelConverter. We can overwrite to be SqlJsonObjectFunction
- And also probably there is other way to fix: add one more construction of SqlJsonObjectFunction as following
public SqlJsonObjectFunction(SqlFunction baseFunction)
and update JsonObjectOperatorConversion in druid as following:
SQL_FUNCTION = SqlJsonObjectFunction(OperatorConversions
.operatorBuilder(FUNCTION_NAME)
.operandTypeChecker(OperandTypes.variadic(SqlOperandCountRanges.from(1)))
.operandTypeInference((callBinding, returnType, operandTypes) -> {
RelDataTypeFactory typeFactory = callBinding.getTypeFactory();
for (int i = 0; i < operandTypes.length; i++) {
if (i % 2 == 0) {
operandTypes[i] = typeFactory.createSqlType(SqlTypeName.VARCHAR);
continue;
}
operandTypes[i] = typeFactory.createTypeWithNullability(
typeFactory.createSqlType(SqlTypeName.ANY),
true
);
}
})
.returnTypeInference(NESTED_RETURN_TYPE_INFERENCE)
.functionCategory(SqlFunctionCategory.SYSTEM)
.build());
- I am not sure which one is better, could you guy help to look, or better solution?
@AlbericByte Please create a new issue in JIRA.
@AlbericByte Please create a new issue in JIRA.
add the jira in the description : CALCITE-6437 Thanks for notification
In order to make it easier to keep track of the correspondence of issues in JIRA and github the issue title should match exactly the JIRA title. The commit that fixes the issue should also have the same message.
Please add a unit test that fails before the fix. The file SqlOperatorTest may be the right place for it.
In order to make it easier to keep track of the correspondence of issues in JIRA and github the issue title should match exactly the JIRA title. The commit that fixes the issue should also have the same message.
Got it, thanks for the tips
Please add a unit test that fails before the fix. The file SqlOperatorTest may be the right place for it. @mihaibudiu and @NobiGo added more test case, thanks for help
the issue arises from the fact that Druid tries to implement JSON_OBJECT with its own SqlFunction
Comparing against SqlStdFunctions.X enforces users of Calcite to use those functions. I think instead of resorting to String comparision it would be more straightforward to compare them based on their SqlKind instead.
So I think another way to solve this is to introduce SqlKind.JSON_OBJECT (and possibly do that for all SqlStdOperators ? )
How does that sound?
@kgyrtkirk
- I have tried this solution to add SqlKind.JSON_OBJECT, and add this as following:
public static final Set<SqlKind> FUNCTION = EnumSet.of(OTHER_FUNCTION, ROW, TRIM, LTRIM, RTRIM, CAST, REVERSE, JDBC_FN, JSON_OBJECT, JSON_ARRAY POSITION, CONVERT);
- But after modification, the test case will be failed
expr("json_arrayagg(json_array(\"column\") format json)") .ok("JSON_ARRAYAGG(JSON_ARRAY(column ABSENT ON NULL) FORMAT JSON ABSENT ON NULL)");
The expected result will be
"JSON_ARRAYAGG((JSON_ARRAY(column ABSENT ON NULL)) FORMAT JSON ABSENT ON NULL)"
- i need time to figure why there is one more parentheses, still need @NobiGo 'suggestion, does this solution is good or not?
Quality Gate passed
Issues
4 New issues
0 Accepted issues
Measures
0 Security Hotspots
85.7% Coverage on New Code
0.0% Duplication on New Code
@NobiGo @kgyrtkirk @mihaibudiu in Druid the return type of json_object function is ExpressionType.NESTED_DATA, but in Calcite, the return type of json_object is VARCHAR(2000).
From current calcite logic, if can not find the sqlfunction in RexImpTable, calcite will throw exception. And actually in druid, we have implement json_object in JsonObjectExprMacro. is there any api to register new SqlJsonObjectFunction with NESTED_DATA in druid? so i can register a new SqlJsonObjectFunction with NESTED_DATA output.
i tried to reused SqlStdOperatorTable.JSON_OBJECT in druid, and also add cast or json_query to wrap the json_object, but it will fail because the SqlStdOperatorTable.JSON_OBJECT return varchar, or cast donot support NESTED_DATA.
seems i can not find better solution outside calcite, could you please give me a guidance. Thanks
What is the status of this PR? Is it ready or more work is needed?
There are many functions in Calcite which are "overloaded", which behave differently depending on the SQL dialect. You can define a new function, with the same name, but a different implementation for Druid.
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.