pat codeblocks: add size to struct reg_code_blocks and fix leak
The first commit reworks some of the detail of how arrays are interpolated arrays into patterns which have code blocks, and fixes GH #16627. The second follow up commit fixes a small leak I spotted in that same area of code.
On Wed, May 08, 2024 at 04:24:44PM -0700, Hugo van der Sanden wrote:
In the first commit message, I don't understand the line
Overall, $pat has the same effect as qr/A1B2C3/.
$pat, after interpolation, when being compiled, expands to
qr/A (??{$x}) B (??{$x}) C (??{$y})/x
except that the $x's are different variables coming from different closures. When the pattern is executed, the various (??{$foo})'s are executed, and the various variable values are returned, interpreted as a sub-pattern and recompiled. The three variable values retuned are 1,2,3, so the run-time effect of the pattern is the same as is had been written qr/A1B2C3/.
The point of that line in the commit message was to emphasise that the two $'s have different values. Which part wasn't clear and needs expanding? Or should I just include the entirety of the paragraph above?
A minor point: all these tests involve regexps that match, having some that correctly fail to match would be good.
The main point of those particular tests was to ensure that each time a code block is called, the right code block is called (the bug meant that sometimes the wrong code block was stored at particular slot). A successful match implies that every code block which was called had to have been the right code block (because each is arranged to be a closure that returns a specific unique value). There are many, many ways for the match to fail, and only one way for the match to succeed. So I'm not sure how having a non-matching pattern helps, nor can I think of a suitable set of such failing test patterns. I'm open to suggestions though.
-- This email is confidential, and now that you have read it you are legally obliged to shoot yourself. Or shoot a lawyer, if you prefer. If you have received this email in error, place it in its original wrapping and return for a full refund. By opening this email, you accept that Elvis lives.
The three variable values returned are 1,2,3
I somehow missed that "A1B2C3" was the literal pattern: for some reason it didn't occur to me to backtrack beyond the current paragraph to find the relevant context. I think this is probably just me overlooking something that should have been obvious.
The main point of those particular tests was to ensure that each time a code block is called, the right code block is called
Well, all we know is that a match succeeded, not that the right code block was called - in principle we could end up with the pattern being changed to /.*/ and all the tests would pass.
I don't think it needs a lot, but just showing that a single example like qr/^(??{$A})$r$/ ("code in array 10") does not match some nearby strings like "AE", "Aee", "A=ee" would be something. We should also verify that lack of use re qw{eval} causes the right error in this case, if we don't already have that.
I've just pushed a forced update.
On Mon, May 13, 2024 at 03:11:46PM -0700, Hugo van der Sanden wrote:
I somehow missed that "A1B2C3" was the literal pattern:
I've expanded the commit message a bit to help.
Well, all we know is that a match succeeded, not that the right code block was called - in principle we could end up with the pattern being changed to
/.*/and all the tests would pass.
Some negative tested added.
We should also verify that lack of
use re qw{eval}causes the right error in this case, if we don't already have that.
A couple of such tests added.
-- "Emacs isn't a bad OS once you get used to it. It just lacks a decent editor."