
Type check and pre-compile fixed regex strings

Open jpbetz opened this issue 3 years ago • 1 comment

Change

Currently regex patterns are compiled at runtime each time a matches() invocation is evaluated by the interpreter.

Instead, if the regex pattern is a string constant (e.g. x.matches('[0-9]+')):

  • Check the regex at compile time and return a type check error if the regex string is malformed (similar in spirit to https://github.com/google/cel-go/issues/359)
  • Pre-compile regex at CEL compile time (at least for programs where OptOptimize is enabled) and reuse it at runtime instead of recompiling it for each invocation
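To make the difference concrete, here is a minimal sketch that uses Go's standard regexp package as a stand-in for the interpreter. The names matchEachTime, constMatcher and newConstMatcher are hypothetical and are not part of cel-go; the sketch only contrasts compiling on every call with compiling once up front and surfacing a malformed pattern early.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchEachTime mirrors the current behavior described above: the pattern is
// compiled on every call, so a malformed pattern only surfaces at runtime.
func matchEachTime(pattern, input string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(input), nil
}

// constMatcher is a hypothetical holder for a pattern that was compiled once,
// ahead of evaluation.
type constMatcher struct {
	re *regexp.Regexp
}

// newConstMatcher plays the role of the compile-time step: it validates the
// constant pattern and keeps the compiled form for later reuse.
func newConstMatcher(pattern string) (*constMatcher, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid regex constant %q: %w", pattern, err)
	}
	return &constMatcher{re: re}, nil
}

// matches reuses the pre-compiled regex; no compilation happens here.
func (m *constMatcher) matches(input string) bool {
	return m.re.MatchString(input)
}

func main() {
	// A malformed constant is rejected before any input is evaluated.
	if _, err := newConstMatcher("[0-9"); err != nil {
		fmt.Println("rejected before evaluation:", err)
	}

	// A well-formed constant is compiled once and reused for every input.
	m, err := newConstMatcher("[0-9]+")
	if err != nil {
		panic(err)
	}
	for _, in := range []string{"abc123", "abc"} {
		fmt.Println(in, m.matches(in))
	}

	// The per-call path still works, but pays the compile cost each time.
	ok, err := matchEachTime("[0-9]+", "abc123")
	fmt.Println(ok, err)
}
```

In this sketch the cost of regexp.Compile moves from every matches() call to a single construction step, which is the saving the proposal is after for constant patterns.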

Example

x.matches('[0-9]+') // regex pattern is fixed, so type check it and pre-compile it at CEL compile time

x.matches(foo.pattern) // regex pattern is from a variable input so continue to compile it at runtime for each matches() invocation

Alternatives considered

I'm open to ideas here.

Based on what little I know of CEL, the approach that came to mind is:

  • Annotate Function Overloads to identify arguments that are regex pattern strings
  • Modify the type checker to look for regex pattern arguments where the arg expression is a const, compile the regex and report any errors
  • Modify the planner to drop in a new type of Interpretable for regex pattern argument exprs that holds the compiled regex. Also add a new ref.Val type that the new Interpretable can return, holding a reference to the compiled regex in addition to the string constant the ref.Val represents. matches() can then inspect the ref.Val to see whether it carries a pre-compiled regex and use it if it does (otherwise it can compile as usual)
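A rough sketch of the last bullet, written against the Go standard library only. The value interface is a simplified, hypothetical stand-in for ref.Val, and stringValue, regexStringValue and planConstPattern are illustrative names, not cel-go APIs. It shows a matches() that uses a pre-compiled regex when the pattern value carries one and otherwise compiles as usual.

```go
package main

import (
	"fmt"
	"regexp"
)

// value is a hypothetical, heavily simplified stand-in for CEL's ref.Val.
type value interface {
	Text() string
}

// stringValue models an ordinary string, e.g. one read from a variable.
type stringValue string

func (s stringValue) Text() string { return string(s) }

// regexStringValue models the proposed new value type: it still represents
// the string constant, but also carries the regex compiled from it.
type regexStringValue struct {
	text string
	re   *regexp.Regexp
}

func (r regexStringValue) Text() string { return r.text }

// planConstPattern models what the planner could do for a constant pattern
// argument: compile once and wrap the result. An error here corresponds to
// reporting a malformed pattern before evaluation.
func planConstPattern(lit string) (value, error) {
	re, err := regexp.Compile(lit)
	if err != nil {
		return nil, fmt.Errorf("invalid regex constant %q: %w", lit, err)
	}
	return regexStringValue{text: lit, re: re}, nil
}

// matches uses the pre-compiled regex when present and falls back to
// compiling the pattern text for plain string values.
func matches(input string, pattern value) (bool, error) {
	if p, ok := pattern.(regexStringValue); ok {
		return p.re.MatchString(input), nil
	}
	re, err := regexp.Compile(pattern.Text())
	if err != nil {
		return false, err
	}
	return re.MatchString(input), nil
}

func main() {
	constPattern, err := planConstPattern("[0-9]+")
	if err != nil {
		panic(err)
	}
	fast, _ := matches("abc123", constPattern)          // reuses the compiled regex
	slow, _ := matches("abc123", stringValue("[0-9]+")) // compiles at call time
	fmt.Println(fast, slow)
}
```

Both paths return the same result for the same pattern text, so in this sketch the new value type is purely an optimization and callers that pass a plain string are unaffected.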

jpbetz · Feb 25 '22 04:02

xref: https://github.com/kubernetes/kubernetes/pull/108312#discussion_r814461567

jpbetz · Feb 25 '22 16:02