sql icon indicating copy to clipboard operation
sql copied to clipboard

Use Calcite's validation system for type checking & coercion

Open yuancu opened this issue 2 months ago • 4 comments

Description

Previously, we have implemented our own type checking mechanism due to a lacking of SQL validation phase, which relies on SqlNode, and compatibility issues with user-defined types.

With this PR, I try to solve these problems and migrate to Calcite's type checking mechanism by introducing a SqlNode layer. This is the Approach 1 described in #3865

It has the following advantages:

  • A full-frame type checking mechanism. We used to support only chains of family operand type checking. Calcite's built-in type checkers additionally supports variable-length, same-operand, or-relationship type checkers.
  • Omitting unnecessary function implementations by re-writing the SqlNode. E.g. we can replace a call to sqrt(x) by replacing it with pow(x, 0.5) in the SqlNode layer.
  • Automatic type casting when necessary

Please feel free to ask why do you change this and why does that no longer match during review!

Work items:

  • [x] Re-introduce SqlNode: rel node -> sql node -validation&coercion-> sql node -> rel node
  • [x] Enable type casting
  • [x] Support UDT: Calcite's type checking relies on enum SqlTypeName, therefore there is no space for UDT there. I did the trick by replacing the operand types of a function call before type checking and restore the types after type checking. E.g. EXPR_TIME -> SqlTypeName.TIME (1090/1599 ITs passed)
  • [x] UDT compatibility in a deeper level: we converted UDT to sql type to use clacite's type checking system (e.g. SqlTypeName.TIMESTAMP). However, when the return type is related to the argument type, this will lead the return type to be sql type instead of UDTs, leaving to an incorrect plan and validation failures. We need to find a way that only use UDT at the top level, using SQL types in intermediate steps.
  • [x] Cross type comparison: comparison between different datetime types (e.g. date v.s. type) is not supported in SQL:
    SELECT * FROM EMPS WHERE JOINEDAT=TIME(12,0,0);
    Error: Cannot apply '=' to arguments of type '<DATE> = <TIME(0)>'. Supported form(s): '<COMPARABLE_TYPE> = <COMPARABLE_TYPE>' (state=,code=0)
    
    We have to replicate #1196 to manually add conversion. Fixed by overriding commonTypeForBinaryComparison (1240/1599 ITs passed)
  • [x] There is no IP type in calcite. For the purpose of type checking, I may have to use an existing sql type to fake it.
  • [x] Support overloads of operators
    • [x] Overload IP comparison: convert call to ip comparator when converting sql node to rel node if either operand is IP.
    • [x] ATAN function has a one-operator and a two-operator version: define a new operator that rewrites sql call conditionally
    • [x] ADD can be string concatenation and number addition - fixed by resolve call conditionally in FuncImpTable
  • [x] Null direction is not preserved in SQL node
  • [x] Rewritten sort orders are not preserved
  • [x] ANTI and SEMI joins are not supported
  • [x] IT Fixes
    • [x] CalcitePPLDateTimeBuiltinFunctionIT
      • [x] semantics of interval mismatch in sql and ppl
    • [x] CalcitePPLBuiltinFunctionIT
      • [x] LOG(a, b)
      • [x] sarg deserialization bug (in testRand)
    • [x] CalciteArrayFunctionIT
      • [x] type checkers for all array functions are not defined
      • [x] reduce function's type inference is incomplete -- it will derive return types in both SqlNode and RelNode level, ensuring that they are the same. The current implementation works only at the RelNode level
    • [x] CalciteMathematicalFunctionIT
    • [x] CalcitePPLAggregationIT
    • [x] CalciteBinCommandIT -- Fixed by ignoring bin-on-timestamp as it is not implemented
      • [x] Problem 1: Cannot apply '-' to arguments of type '<TIMESTAMP(0)> - <TIMESTAMP(0)>'. Supported form(s): '<NUMERIC> - <NUMERIC>'\n'<DATETIME_INTERVAL> - <DATETIME_INTERVAL>'\n'<DATETIME> - <DATETIME_INTERVAL>'" in IT testStatsWithBinsOnTimeAndTermField_Avg and testStatsWithBinsOnTimeAndTermField_Count
      • [x] Problem 2: Windowed aggregate expression is illegal in GROUP BY clause in IT testStatsWithBinsOnTimeField_Count and testStatsWithBinsOnTimeField_Avg
    • [x] CalciteStreamstatsCommandIT
      • [x] Assisting columns in sort by are project
      • [x] Does not handle bucket_nullable correctly
    • [x] CalcitePPLJsonBuiltinFunctionIT: string-ify not correct
  • [x] No longer tolerant malformatted string numbers: Fixed by replacing CAST with SAFE_CAST
  • [x] Nullability attribute changes after SAFE_CAST so validation breaks: fixed by resetting nullability after coercion with SAFE_CAST
  • [x] RelHint on LogicalAggregate is lost during rel node -> sql node -> rel node round-trip conversion: fixed by adding SqlHint after converting LogicalAggregation
  • [x] Many push-down relies on specified sequences of logical operations. SqlNode -> RelNode changes the plan. Need to re-activate them.
  • [ ] Add unit & integration tests

Related Issues

Resolves #4636, resolves #3865

Check List

  • [ ] New functionality includes testing.
  • [ ] New functionality has been documented.
  • [ ] New functionality has javadoc added.
  • [ ] New functionality has a user manual doc added.
  • [ ] New PPL command checklist all confirmed.
  • [ ] API changes companion pull request created.
  • [ ] Commits are signed per the DCO using --signoff or -s.
  • [ ] Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

yuancu avatar Dec 02 '25 09:12 yuancu

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced type coercion system for improved implicit type conversion in queries
    • Extended support for Spark SQL dialect compatibility
    • Improved IP type handling and validation
  • Bug Fixes

    • Fixed type casting issues in filter and aggregation operations
    • Corrected timestamp and datetime type handling in expressions
    • Resolved operand type validation for various SQL functions
  • Documentation

    • Updated expected query plan outputs to reflect improved type handling

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Introduces a Calcite-backed validation and implicit type coercion pipeline for PPL by converting RelNode↔SqlNode around validation. Adds PPL-specific validator, coercion rules, converters/shuttles, and convertlets. Refactors type utilities/factories, operand metadata, and function registry. Updates numerous tests (JSON→YAML expectations) and docs. Adjusts window collation embedding and arithmetic/type derivation.

Changes

Cohort / File(s) Change Summary / Review Focus
Validation & Coercion Pipeline
core/.../QueryService.java, core/.../calcite/CalcitePlanContext.java, core/.../validate/PplValidator.java, core/.../validate/PplTypeCoercion.java, core/.../validate/PplTypeCoercionRule.java, core/.../validate/converters/PplRelToSqlNodeConverter.java, core/.../validate/converters/PplSqlToRelConverter.java, core/.../validate/shuttles/SqlRewriteShuttle.java, core/.../validate/shuttles/SkipRelValidationShuttle.java, core/.../validate/shuttles/PplRelToSqlRelShuttle.java, core/.../validate/OpenSearchSparkSqlDialect.java, core/.../validate/SqlOperatorTableProvider.java, core/.../validate/ValidationUtils.java
Adds end-to-end Rel→Sql validation with implicit casts; new validator/coercion rule sets; converters for SEMI/ANTI joins; shuttles for rewrites and literal normalization; dialect tweaks (liberal mode, IP cast). Review control flow, error handling (tolerant exceptions), and join handling.
Convertlets
core/.../validate/PplConvertletTable.java
Custom convertlets; IP-aware comparisons and ATAN remap. Validate coverage and operator mapping precedence.
Type Utilities & Factory
core/.../utils/OpenSearchTypeUtil.java, core/.../utils/OpenSearchTypeFactory.java, core/.../calcite/ExtendedRexBuilder.java, core/.../calcite/utils/CalciteToolsHelper.java
New utility for UDT/numeric/datetime checks; leastRestrictive override; arithmetic type derivation for mixed string/number; error chaining for WIDTH_BUCKET. Verify UDT conversions and nullability propagation.
Rel/Rex Visitors
core/.../calcite/CalciteRelNodeVisitor.java, core/.../calcite/CalciteRexNodeVisitor.java
Embed existing collations into window OVER clauses; simplify window function building. Check ordering semantics and planner determinism.
Function Registry & Operand Metadata Refactor
core/.../expression/function/PPLFuncImpTable.java, core/.../expression/function/UDFOperandMetadata.java, core/.../expression/function/PPLBuiltinOperators.java, core/.../calcite/utils/PPLOperandTypes.java
Switch to single-implementation registry (no per-signature lists); unify on OperandTypes and custom checkers; many operators converted to SqlFunction. Validate registration, arity/rules, and backward compatibility.
Removed Legacy Type Checking/Coercion
core/.../expression/function/PPLTypeChecker.java (removed), core/.../expression/function/CoercionUtils.java (removed), core/.../expression/function/CalciteFuncSignature.java (removed), core/.../test/.../CoercionUtilsTest.java (removed)
Eliminate bespoke type-coercion/signature layer in favor of Calcite validation. Confirm no remaining dependencies.
UDF Implementations (Collections)
core/.../CollectionUDF/*
Define operand metadata (ARRAY/MAP/FUNCTION/VARIADIC), rework type inference (e.g., Reduce/Transform). Verify bindings on SqlCallBinding and optional operands.
UDF Implementations (IP)
core/.../udf/ip/*
Custom operand checkers for IP functions (IP, CIDRMATCH, comparisons). Ensure OpenSearchTypeUtil.isIp coverage including OTHER acceptance where applicable.
Width Bucket/Binning
core/.../utils/binning/BinnableField.java, core/.../udf/binning/WidthBucketFunction.java
Switch to OpenSearchTypeUtil checks; return-type inference adjusted. Confirm timestamp/date handling.
API/Test Adjustments
api/.../UnifiedQueryTranspilerTest.java
Update dialect import and expected SQL (SAFE_CAST→equality). Aligns with new coercion.
Integration Tests (Java)
integ-test/src/test/java/...
New/updated tests (e.g., mixed-type BETWEEN/IN), YAML explain adoption, expected exception type changes, conditional skips. Review new flows relying on SAFE_CAST and YAML comparators.
Integration Tests (Expected YAML)
integ-test/src/test/resources/expectedOutput/calcite/*
Wide updates from JSON→YAML, plan reshaping (SAFE_CAST, EXPR_TIMESTAMP/DATE/TIME, reordered nodes, map literal formatting). Validate against new validator/coercion semantics.
Docs
docs/user/ppl/*
Example updates/temporarily ignored blocks; adjusted outputs reflecting new casting and formatting. Confirm accuracy with current behavior.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant QueryService
  participant RelBuilder as PPL RelNode
  participant RelToSql as PplRelToSqlNodeConverter
  participant Validator as PplValidator/TypeCoercion
  participant SqlToRel as PplSqlToRelConverter
  participant Optimizer as Planner/Rules
  participant OpenSearch as DataStore

  Client->>QueryService: executeWithCalcite/ explainWithCalcite(PPL)
  QueryService->>RelBuilder: build initial RelNode
  QueryService->>RelBuilder: apply PplRelToSqlRelShuttle (normalize literals)
  QueryService->>RelToSql: convert RelNode -> SqlNode
  QueryService->>Validator: validate SqlNode (implicit casts, UDT mapping)
  alt Validation tolerant failure
    Validator-->>QueryService: throws known issue
    QueryService-->>QueryService: fallback to original RelNode
  else Success
    Validator-->>QueryService: validated SqlNode
    QueryService->>SqlToRel: SqlNode -> RelNode (Spark SEMI/ANTI fixups)
  end
  QueryService->>Optimizer: optimize RelNode
  Optimizer->>OpenSearch: pushdown (PROJECT/FILTER/AGG/SORT)
  OpenSearch-->>QueryService: results/plan
  QueryService-->>Client: response

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • opensearch-project/sql#4924 — Also modifies QueryService’s Calcite execution path; overlaps with new validation/optimization flow.
  • opensearch-project/sql#4754 — Touches CalciteRelNodeVisitor; related to window/relational visitor changes present here.
  • opensearch-project/sql#4841 — Adjusts explain/filters with Calcite validation; overlaps expected plan updates (SAFE_CAST, YAML).

Suggested labels

backport 2.19-dev

Suggested reviewers

  • penghuo
  • LantaoJin
  • anirudha
  • ps48
  • ykmr1224
  • dai-chen
  • kavithacm
  • derek-ho

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.81% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Out of Scope Changes check ❓ Inconclusive Most changes focus on the Calcite migration (validators, coercion rules, converters, type utilities, operand metadata, and test updates), but several large public-API removals and registry/signature refactors are present. Confirm intent for breaking API removals/signature changes (e.g., PPLTypeChecker, CoercionUtils, CalciteFuncSignature, widespread SqlOperator→SqlFunction and registry shape changes); if intentional, document as breaking changes or split into a compatibility/major-version PR and add migration notes.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed Title concisely and accurately describes the primary change: switching to Calcite's validation system for type checking and coercion.
Description check ✅ Passed Description is detailed and directly related to the changeset, listing goals, approach, work items, and known limitations.
Linked Issues check ✅ Passed Implementation aligns with linked issues: reintroduces SqlNode validation, PplValidator, PplTypeCoercion/Rule, converters, and UDT handling to leverage Calcite coercion.
✨ Finishing touches
  • [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Dec 03 '25 07:12 coderabbitai[bot]

@coderabbitai generate docstrings

yuancu avatar Dec 23 '25 07:12 yuancu

✅ Actions performed

Initiated docstring generation; will generate only if new commits exist.

coderabbitai[bot] avatar Dec 23 '25 07:12 coderabbitai[bot]

[!NOTE] Docstrings generation - SUCCESS Generated docstrings for this pull request at https://github.com/opensearch-project/sql/pull/4989

coderabbitai[bot] avatar Dec 23 '25 07:12 coderabbitai[bot]