Add library user guide for extending SQL syntax
Which issue does this PR close?
- Closes #19087.
Rationale for this change
As noted in the issue, the RelationPlanner API (added in #17843) along with ExprPlanner and TypePlanner allows DataFusion users to extend SQL in powerful ways. However, these extension points are not well documented and may be difficult for users to discover.
What changes are included in this PR?
Adds a new Library User Guide page (extending-sql.md) that documents how to extend DataFusion's SQL syntax:
- Introduction explaining why SQL extensibility matters (different SQL dialects, custom operators)
- Architecture overview showing how SQL flows through the planner extension points
- Extension points with examples for each:
  - ExprPlanner: Custom expressions and operators (e.g., -> for JSON)
  - TypePlanner: Custom SQL data types (e.g., DATETIME with precision)
  - RelationPlanner: Custom FROM clause elements (TABLESAMPLE, PIVOT/UNPIVOT)
- Implementation strategies for RelationPlanner:
  - Rewrite to standard SQL (PIVOT/UNPIVOT example)
  - Custom logical and physical nodes (TABLESAMPLE example)
- Links to complete examples in datafusion-examples/examples/relation_planner/
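To give a feel for the "rewrite to standard SQL" strategy listed above, here is a minimal, self-contained sketch of the idea behind a PIVOT rewrite: each pivot value becomes one conditional aggregate in an ordinary GROUP BY query. This is only a string-level illustration. The function `rewrite_pivot` and all of its parameters are hypothetical and are not part of DataFusion's API; a real RelationPlanner would build the equivalent plan rather than SQL text, as in the linked examples.

```rust
/// Hypothetical helper (NOT a DataFusion API): rewrite
///   <table> PIVOT (<agg>(<value_col>) FOR <pivot_col> IN (<pivot_values>...))
/// into a GROUP BY query with one conditional aggregate per pivot value.
fn rewrite_pivot(
    table: &str,
    group_col: &str,
    agg: &str,
    value_col: &str,
    pivot_col: &str,
    pivot_values: &[&str],
) -> String {
    let aggregates: Vec<String> = pivot_values
        .iter()
        .map(|v| {
            // Escape single quotes in the string literal and
            // double quotes in the quoted output column name.
            let literal = v.replace('\'', "''");
            let alias = v.replace('"', "\"\"");
            format!(
                "{agg}(CASE WHEN {pivot_col} = '{literal}' THEN {value_col} END) AS \"{alias}\""
            )
        })
        .collect();

    format!(
        "SELECT {group_col}, {} FROM {table} GROUP BY {group_col}",
        aggregates.join(", ")
    )
}

fn main() {
    // Assumed example schema: sales(region, quarter, amount).
    let sql = rewrite_pivot("sales", "region", "SUM", "amount", "quarter", &["Q1", "Q2"]);
    println!("{sql}");
}
```

Because the output is plain SQL, a rewrite of this kind needs no custom logical or physical nodes, which is what distinguishes it from the TABLESAMPLE-style strategy.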
Also adds a cross-reference from the existing ExprPlanner section in adding-udfs.md to the new comprehensive guide.
Are these changes tested?
This is documentation only. The code examples reference existing tested examples in datafusion-examples/.
Are there any user-facing changes?
Yes, new documentation is added to help users extend DataFusion's SQL syntax.