
CASSANDRA-19341: Refactor Relation and Restriction hierarchy

Open blerer opened this issue 1 year ago • 0 comments

The goal of the patch is to simplify the code surrounding the WHERE clause predicates (called restrictions internally) and to make that code easier to extend. For example, adding new operators or new types of column expressions currently requires far more work than it should and is error-prone, because the code needs to be modified in several places (Relation, Operator, the Restriction classes, StatementRestrictions, ...). The proposed patch should limit those changes to Operator for the addition of new operators and to ColumnsExpression if new expressions (UDT fields, for example) need to be implemented.

The idea behind the patch is that operators can be classified into 3 categories:

  1. Operators that select one or multiple specific values (= and IN)
  2. Operators that select ranges of values (currently >, >=, < and <=; in the future, != and NOT IN)
  3. Operators that need to be applied to the value for us to know whether the value matches the predicate (CONTAINS, LIKE, ...)

Combining operators of type 2 always produces a set of ranges, and combining operators of types 1 and 2 always selects specific values or none. Operators of type 3 always need an index or filtering.

With that idea in mind, it is possible to merge restrictions in a generic way and to avoid hard-coding that logic for every operator.
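To make the classification concrete, here is a minimal sketch of how an operator category could drive generic merging. All names (`OperatorSketch`, `Kind`, `Op`, `accepts`, `restrict`) are hypothetical and chosen for illustration; they are not the API introduced by the patch, and plain `int` values stand in for serialized column values.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: illustrative names only, not Cassandra's actual classes.
final class OperatorSketch {
    /** The three operator categories described in the ticket. */
    enum Kind { VALUES, RANGE, PER_VALUE_CHECK }

    enum Op {
        EQ(Kind.VALUES), IN(Kind.VALUES),
        GT(Kind.RANGE), GTE(Kind.RANGE), LT(Kind.RANGE), LTE(Kind.RANGE),
        CONTAINS(Kind.PER_VALUE_CHECK), LIKE(Kind.PER_VALUE_CHECK);

        final Kind kind;

        Op(Kind kind) { this.kind = kind; }

        /** Tests a value against a bound; only defined for RANGE operators in this sketch. */
        boolean accepts(int value, int bound) {
            switch (this) {
                case GT:  return value > bound;
                case GTE: return value >= bound;
                case LT:  return value < bound;
                case LTE: return value <= bound;
                default:  throw new UnsupportedOperationException(name());
            }
        }
    }

    /**
     * Merges a type 1 restriction (a list of selected values) with a type 2
     * restriction (a range bound): the result is again specific values, or none.
     */
    static List<Integer> restrict(List<Integer> selected, Op rangeOp, int bound) {
        if (rangeOp.kind != Kind.RANGE)
            throw new IllegalArgumentException(rangeOp + " is not a range operator");
        return selected.stream()
                       .filter(v -> rangeOp.accepts(v, bound))
                       .collect(Collectors.toList());
    }
}
```

In this sketch, merging `IN (1, 5, 9)` with `> 3` goes through the single `restrict` method whichever range operator is involved, which is the kind of per-operator hard-coding the ticket wants to avoid.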

The other idea of the patch is to encapsulate the different types of column expressions (single column, multi-column, token and map element) in a single separate class rather than having a sub-class for each type of expression.

The patch makes the following changes:

  • Removes SingleColumnRestriction, MultiColumnRestriction and TokenRestriction, which are replaced by 3 new classes:
  1. ColumnsExpression, which represents the 4 types of expression supported so far (single column, multi-column, map element and partition key token)
  2. SimpleRestriction, which represents a single predicate composed of a column expression, an operator and one or multiple values (for example s CONTAINS 10 or token(pk) > ?)
  3. MergedRestriction, which represents multiple predicates on the same column expression (for example s CONTAINS 10 AND s CONTAINS 11 or token(pk) > ? AND token(pk) <= ?)
    Those new classes push some of the logic to the Operator class, which has a new API that defines how operators are handled during query preparation and query execution.
  • Removes SingleColumnRelation, MultiColumnRelation and TokenRelation. Those classes are all replaced by the Relation class, which stops being abstract, and their validation logic is pushed into ColumnsExpression and Operator.
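A rough sketch of how the three new classes could relate to each other is shown below. Only the class names ColumnsExpression, SimpleRestriction and MergedRestriction come from the ticket; the fields, the `Restriction` interface shape, `ExpressionKind`, `sameAs`, `mergeWith` and the use of a plain `String` for the operator are assumptions made for illustration and will differ from the real patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: member names and signatures are assumptions, not the patch's API.
final class RestrictionSketch {
    /** The 4 kinds of expression listed in the ticket. */
    enum ExpressionKind { SINGLE_COLUMN, MULTI_COLUMN, MAP_ELEMENT, TOKEN }

    /** One class for every kind of column expression, instead of one sub-class per kind. */
    static final class ColumnsExpression {
        final ExpressionKind kind;
        final List<String> columns;

        ColumnsExpression(ExpressionKind kind, String... columns) {
            this.kind = kind;
            this.columns = Arrays.asList(columns);
        }

        boolean sameAs(ColumnsExpression other) {
            return kind == other.kind && columns.equals(other.columns);
        }
    }

    interface Restriction {
        ColumnsExpression expression();

        List<SimpleRestriction> predicates();

        /** Merging is generic: it only requires both sides to target the same expression. */
        default Restriction mergeWith(SimpleRestriction other) {
            if (!expression().sameAs(other.expression()))
                throw new IllegalArgumentException("Restrictions target different column expressions");
            List<SimpleRestriction> all = new ArrayList<>(predicates());
            all.add(other);
            return new MergedRestriction(expression(), all);
        }
    }

    /** A single predicate: column expression, operator and one or more values. */
    static final class SimpleRestriction implements Restriction {
        final ColumnsExpression expression;
        final String operator; // a String stands in for the Operator class here
        final List<Object> values;

        SimpleRestriction(ColumnsExpression expression, String operator, Object... values) {
            this.expression = expression;
            this.operator = operator;
            this.values = Arrays.asList(values);
        }

        public ColumnsExpression expression() { return expression; }

        public List<SimpleRestriction> predicates() { return List.of(this); }
    }

    /** Several predicates on the same column expression. */
    static final class MergedRestriction implements Restriction {
        final ColumnsExpression expression;
        final List<SimpleRestriction> predicates;

        MergedRestriction(ColumnsExpression expression, List<SimpleRestriction> predicates) {
            this.expression = expression;
            this.predicates = predicates;
        }

        public ColumnsExpression expression() { return expression; }

        public List<SimpleRestriction> predicates() { return predicates; }
    }
}
```

Under these assumptions, `s CONTAINS 10 AND s CONTAINS 11` would be built by calling `mergeWith` on the first SimpleRestriction with the second, yielding a MergedRestriction that holds both predicates.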

blerer · Feb 09 '24 09:02