std.regex: incorrect values of look-around captures
The following example, checked with dmd 2.109.1, combines lookbehind and lookahead assertions:
import std.stdio;
import std.regex;

void main()
{
    auto re = regex(r"(?<=(..)(?=(..)))..cde");
    auto captures = std.regex.matchFirst("12345abcde", re);
    writeln(captures[0]); // "abcde" as expected
    writeln(captures[1]); // "45" as expected
    writeln(captures[2]); // nothing, but "ab" is expected
}
The value of captures[2] should be "ab", but it is null, so nothing is printed. (Other prominent engines, in various languages, give the correct result.)
According to the documentation, std.regex should support "arbitrary length and complexity lookbehind, including lookahead in lookbehind and vice-versa".
Modified patterns that place the lookahead next to the lookbehind instead of inside it, such as (?<=(..))(?=(..))..cde, seem to work correctly.
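To make the comparison easier to reproduce, here is a minimal sketch that runs both forms of the pattern against the same input and prints every capture group. The helper name `groups` is hypothetical, introduced only for this sketch; `regex`, `matchFirst`, and `Captures` indexing are the regular std.regex API. No expected output is shown for the nested form, since that is exactly what is in question.

```d
import std.regex;
import std.stdio;

// Hypothetical helper for this sketch: returns all capture groups of the
// first match as strings, or an empty array if there is no match.
string[] groups(string input, string pattern)
{
    auto c = matchFirst(input, regex(pattern));
    string[] result;
    if (c.empty)
        return result;
    foreach (i; 0 .. c.length)
        result ~= c[i]; // a group with no recorded value yields an empty string
    return result;
}

void main()
{
    // Nested form from the report: the lookahead sits inside the lookbehind.
    writeln(groups("12345abcde", r"(?<=(..)(?=(..)))..cde"));
    // Modified form from the report: the two assertions are siblings.
    writeln(groups("12345abcde", r"(?<=(..))(?=(..))..cde"));
}
```

If the two printed arrays differ only in the last element, that matches the behaviour described above: the capture inside the nested lookahead is lost, while the sibling form keeps it.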
Verified with regex101; we're apparently the outlier here.