Fix the druid json_object issue
Fixes #16356.
Description
the dependence calcite cr need to be review the root case is that the operator is SqlFunction, but in the map, the key instance is SqlJsonObjectFunction, the type is not equals, so we can naver get from the map in the following code:
public @Nullable RexCallImplementor get(final SqlOperator operator) {
if (operator instanceof SqlUserDefinedFunction) {
org.apache.calcite.schema.Function udf =
((SqlUserDefinedFunction) operator).getFunction();
if (!(udf instanceof ImplementableFunction)) {
throw new IllegalStateException("User defined function " + operator
+ " must implement ImplementableFunction");
}
CallImplementor implementor =
((ImplementableFunction) udf).getImplementor();
return wrapAsRexCallImplementor(implementor);
} else if (operator instanceof SqlTypeConstructorFunction) {
return map.get(SqlStdOperatorTable.ROW);
}
return map.get(operator);
}
Fixed the bug ...
Release note
Fix the SQL JSON_OBJECT() function results in RUNTIME_FAILURE when querying INFORMATION_SCHEMA.COLUMNS
Key changed/added classes in this PR
-
NestedDataOperatorConversions.java
This PR has:
- [x] been self-reviewed.
- [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
So to resolve this problem, we need to update the calcite to a release that includes your fix?
can we add a test case?
@FrankChen021, @kgyrtkirk and @cryptoe i am creating the test case in calcite first, still working on this. After, i will create the test case in druid
So you want all functions to work with equalsIgnoreCase ?
select fruit, sum(price) from foo group by 1 = select fruit, sUm(price) from foo group by 1 ?
I wonder why can't we simply use the SqlStdOperatorTable.JSON_OBJECT instead of declaring a new sqlfunction in NestedDataOperatorConversions.
public static class JsonObjectOperatorConversion implements SqlOperatorConversion
{
private static final SqlFunction SQL_FUNCTION = SqlStdOperatorTable.JSON_OBJECT;
[...]
I've done a quick check with a testcase in CalciteSysQueryTest
@Test
public void testJsonObject()
{
testBuilder()
.sql(
"select JSON_OBJECT('name': COLUMN_NAME, 'type': \"DATA_TYPE\")"
+ "from INFORMATION_SCHEMA.COLUMNS order by COLUMN_NAME limit 1"
)
.expectedResults(
ImmutableList.of(
new Object[] {"{\"name\":\"CATALOG_NAME\",\"type\":\"VARCHAR\"}"}
)
)
.run();
}
I'm not sure if other tests will pass with this change...but SqlToRel was using a direct comparision check on the builtin operator; we have a few of those for the JSON_ functions...maybe the best would be to use the SqlStdOperator versions for the ones SqlToRel does special handling?
...other way around could be to somehow get rid of the Bindables mode - so that this query gets executed the same way as regular Druid queries
Bindablesmode hI @kgyrtkirk Thanks for your suggestion, I have switch to SqlStdOperatorTable.JSON_OBJECT
May i know what is your idea of SqlStdOperator versions solution?
@kgyrtkirk i have try to add some calcite rule for plan, but it still need to get sqlfunction from rextableimp, i do not have any other work around now, i am open if someone can provide idea or continue to work on this, i can monitor and learn.
I think that getting rid of the bindables mode might just make this and possibly other issues go away...
I was looking around a bit and I think you will get similar issues for all the functions inside NestedDataOperatorConversions
to address that I think the following might be considered:
- fix these
SqlFunction-s in NestedDataOperatorConversions ; to be usable bindable functions- make the builder return a SqlUserDefinedFunction
- which will need an ImplementableFunction
- preferably that function should utilize the the original udf implementatiion behind the scenes to process the data thru the CallImplementor
- make the
bindablesgo away- the reason these are around is due to the fact that some tables are only provided in a way the
bindablewants them here - fixing that would be to make sure that all tables can be unwrapped to DruidTable
- easiest option would be to provide them as an
InlineTable; however that would mean all those values will appear in the plans - so a new subclass ofDruidTablewould be needed - which could accessinformation_schemastuff
- the reason these are around is due to the fact that some tables are only provided in a way the
I would preffer (2) as I have a feeling that (1) might be quite tricky.... Removing bindables mode have further benefits; it will make those queries work the same as regular tables with joins/etc; and it would reduce the complexity a bit.
I'll try to think about further alternatives/ideas/etc
cc: @gianm
@kgyrtkirk thanks let me know your solution, i will try this week.