Make it possible to use RE2J instead of java.util.regex
As I see it, it works if we only change the imports, so we need to create a factory for Pattern/Matcher, plus adapters.
can you provide more information about the purpose of this feature request?
are you concerned about the runtime performance of matching user-agent strings? or is it about reducing startup time? or does RE2J support regex patterns that java.util.regex doesn't support?
I'm a little hesitant to make uap-java have a hard dependency on re2j since it would require all users to pull in another lib (which may have its own transitive dependencies... although honestly I haven't looked that deep to see if re2j depends on anything).
how would you envision this working? would it be like a java service provider/implementation... whereby the user adds re2j to the classpath and the regexp engine is specified by name at runtime? that might get a little complicated because it would likely require a wrapper around re2j that follows the java service provider spec so it could be plugged in.
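A minimal sketch of the service-provider idea described above, assuming a hypothetical `RegexEngineProvider` interface (none of these names exist in uap-java). An RE2J wrapper would ship as a separate jar that registers its implementation under `META-INF/services`; when no provider with the requested name is on the classpath, the lookup falls back to java.util.regex.

```java
import java.util.ServiceLoader;
import java.util.function.Predicate;
import java.util.regex.Pattern;

/** Hypothetical service interface; each regex engine would supply one implementation. */
interface RegexEngineProvider {
    /** Name used to select the engine at runtime, e.g. "jdk" or "re2j". */
    String name();

    /** Compiles the regex into a predicate that is true when the input contains a match. */
    Predicate<String> compile(String regex);
}

/** Built-in provider backed by java.util.regex, so no extra dependency is required. */
final class JdkProvider implements RegexEngineProvider {
    @Override
    public String name() {
        return "jdk";
    }

    @Override
    public Predicate<String> compile(String regex) {
        // asPredicate() uses find() semantics (a match anywhere in the input).
        return Pattern.compile(regex).asPredicate();
    }
}

/** Hypothetical lookup helper: picks a provider by name, else falls back to the JDK engine. */
final class RegexEngines {
    private RegexEngines() {}

    static RegexEngineProvider byName(String name) {
        // ServiceLoader discovers implementations registered under META-INF/services.
        for (RegexEngineProvider provider : ServiceLoader.load(RegexEngineProvider.class)) {
            if (provider.name().equals(name)) {
                return provider;
            }
        }
        return new JdkProvider();
    }
}
```

With this shape, `RegexEngines.byName("re2j")` would return the RE2J-backed provider only if the user added such a wrapper jar; otherwise callers silently get the JDK engine, which keeps re2j an optional dependency.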
Hi, I'm concerned about the runtime performance of matching user-agent strings. The regular expression syntax accepted by RE2 is a subset of that accepted by PCRE, and I believe your regexes don't use any features that RE2 lacks. Unlike PCRE, it has O(n) validation/search time (i.e. each input symbol is examined only once). I think creating an interface facade over the PCRE-style engine and RE2 will be enough.
The page https://swtch.com/~rsc/regexp/regexp3.html#caveats lists the features that are not supported: lookahead and lookbehind assertions, backreferences, atomic grouping operators (?>...), and possessive quantifiers such as ++.
The main goal in developing it is that RE2 provides stronger guarantees on execution time than ad hoc implementations do, and enables high-level analyses that would be difficult or impossible with them.
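The facade idea could look roughly like the sketch below. The interface names `CompiledRegex` and `RegexEngine` are hypothetical, not existing uap-java types. Only the java.util.regex adapter is shown so the example has no external dependency; an RE2J adapter would presumably have the same body with `com.google.re2j.Pattern` and `com.google.re2j.Matcher` swapped in, since RE2J mirrors the JDK class names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical engine-neutral view of a compiled pattern. */
interface CompiledRegex {
    /**
     * Returns the groups of the first match (index 0 is the whole match),
     * or null when the input contains no match.
     */
    String[] find(String input);
}

/** Hypothetical factory; one implementation per regex engine. */
interface RegexEngine {
    CompiledRegex compile(String regex);
}

/** Adapter over java.util.regex. */
final class JdkRegexEngine implements RegexEngine {
    @Override
    public CompiledRegex compile(String regex) {
        final Pattern pattern = Pattern.compile(regex);
        return input -> {
            Matcher matcher = pattern.matcher(input);
            if (!matcher.find()) {
                return null;
            }
            // Copy the groups out so callers never touch an engine-specific Matcher.
            String[] groups = new String[matcher.groupCount() + 1];
            for (int i = 0; i <= matcher.groupCount(); i++) {
                groups[i] = matcher.group(i);
            }
            return groups;
        };
    }
}
```

A parser would then hold a `RegexEngine` and call `engine.compile(...)` wherever it currently calls `Pattern.compile(...)`, so switching engines means passing a different `RegexEngine` rather than changing imports.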
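To check whether the existing patterns stay within that subset, a rough heuristic scan like the one below might be enough as a first pass. The class `Re2Compat` is a hypothetical helper, and the checks are plain text searches rather than a real regex parser, so escaped sequences such as `\++` (one or more literal plus signs) can produce false positives.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper: flags regex constructs that RE2 does not support. */
final class Re2Compat {
    // Feature name -> textual pattern that suggests the feature is used (heuristic only).
    private static final Map<String, Pattern> CHECKS = new LinkedHashMap<>();

    static {
        CHECKS.put("lookahead", Pattern.compile("\\(\\?[=!]"));
        CHECKS.put("lookbehind", Pattern.compile("\\(\\?<[=!]"));
        CHECKS.put("atomic group", Pattern.compile("\\(\\?>"));
        CHECKS.put("possessive quantifier", Pattern.compile("[*+?}]\\+"));
        CHECKS.put("backreference", Pattern.compile("\\\\[1-9]"));
    }

    private Re2Compat() {}

    /** Returns the names of the unsupported features that appear to be used in the regex. */
    static List<String> unsupportedFeatures(String regex) {
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, Pattern> check : CHECKS.entrySet()) {
            if (check.getValue().matcher(regex).find()) {
                found.add(check.getKey());
            }
        }
        return found;
    }
}
```

Running `Re2Compat.unsupportedFeatures(...)` over every regex in the patterns file and reporting any non-empty result would give a quick list of entries that need a closer look before switching engines.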
Hm I see
Object.keys(regexes).forEach(function (parser) {
  suite(`no reverse lookup in ${parser}`, function () {
    regexes[parser].forEach(function (item) {
      test(item.regex, function () {
        if (/\(\?<[!=]/.test(item.regex)) {
          assert.ok(false, 'go parser does not support regex lookbehind. See https://github.com/google/re2/wiki/Syntax')
        }
        if (/\(\?[!=]/.test(item.regex)) {
          assert.ok(false, 'go parser does not support regex lookahead. See https://github.com/google/re2/wiki/Syntax')
        }
      })
    })
  })
})
Does it mean that RE2 is already implemented?
the code you referenced is a JavaScript unit test in the other repo, uap-core. I don’t know why they’re checking for entries that are unsupported by the Go runtime.
A few years ago I experimented with different regex libraries, using JMH to check performance. It might be interesting to compare them against the regexes found in the patterns database. See https://github.com/fbacchella/RegexPerf for the code.