Do we need to escape search string as it's used in regexp? Wondering what's the result of `contains("abcdefg", ".*")`
Originally posted by @waynexia in https://github.com/apache/datafusion/pull/10879#discussion_r1635767599
cc @Lordworms
Sorry for the late review; I was busy this week. In the beginning, I was just trying to keep the same format as other ScalarUDF which utilize arrow-rs methods to implement functionality so I just chose arrow::reglike. I can fix it to use str.contains
> I was just trying to keep the same format as other ScalarUDF which utilize arrow-rs methods to implement functionality so I just chose arrow::reglike.
Makes sense. I think in this case, however, the function shouldn't actually have regexp support, so it would be better to use str.contains
> I can fix it to use str.contains
Thank you!
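To make the question at the top of the thread concrete, here is a minimal sketch of literal substring matching using only Rust's standard `str::contains`. The helper name `contains_literal` is made up for illustration and is not the DataFusion implementation; the point is only that the search string `.*` is treated as two ordinary characters rather than a "match anything" regexp.

```rust
/// Hypothetical helper (not DataFusion's code): literal substring check.
/// Every character of `needle` is matched verbatim, so no escaping is needed.
fn contains_literal(haystack: &str, needle: &str) -> bool {
    haystack.contains(needle)
}

fn main() {
    // ".*" is two literal characters here, so it is not found in "abcdefg".
    println!("{}", contains_literal("abcdefg", ".*"));
    // An ordinary substring is found as expected.
    println!("{}", contains_literal("abcdefg", "cde"));
    // The literal text ".*" is found only when it really appears in the input.
    println!("{}", contains_literal("a.*b", ".*"));
}
```

With a regexp-based implementation the first call would instead report a match, because `.*` as a regexp matches any string.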
Thanks @Lordworms. I'm thinking about @alamb's idea https://github.com/apache/datafusion/pull/10879#discussion_r1636789004, which implements contains only as a placeholder for translating/planning. It would eventually be rewritten into something else, such as LIKE, after the optimization phase. Expr::Wildcard is a similar case: it reflects * from SQL but doesn't have a corresponding physical expr.
One benefit of using LIKE is that it already has a highly optimized arrow implementation (as in, it will actually use a substring check if the pattern looks like substr% etc).
I'll refactor it to use LIKE
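If contains is rewritten to LIKE, the escaping question from the start of the thread comes back in a different form: `%` and `_` in the search string would have to be escaped so they stay literal. A rough sketch of that translation follows. The function name `contains_to_like_pattern` is hypothetical, and it assumes backslash is the LIKE escape character; it is not taken from the DataFusion code.

```rust
/// Hypothetical helper (not DataFusion's code): build a LIKE pattern that
/// matches any string containing `needle` literally.
/// Assumption: backslash is the LIKE escape character.
fn contains_to_like_pattern(needle: &str) -> String {
    let mut pattern = String::with_capacity(needle.len() + 2);
    pattern.push('%'); // leading wildcard: anything may precede the needle
    for ch in needle.chars() {
        // Escape LIKE metacharacters and the escape character itself.
        if matches!(ch, '%' | '_' | '\\') {
            pattern.push('\\');
        }
        pattern.push(ch);
    }
    pattern.push('%'); // trailing wildcard: anything may follow the needle
    pattern
}

fn main() {
    // Regexp metacharacters need no escaping in LIKE.
    println!("{}", contains_to_like_pattern(".*"));
    // LIKE metacharacters are escaped so they match literally.
    println!("{}", contains_to_like_pattern("50%_off"));
}
```

Regexp metacharacters such as `.` and `*` pass through unchanged, since they have no special meaning in LIKE.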