scanner: consistently +2 to realloc size for EOB
We expect to have room for 2 EOB chars, but in this particular realloc path this only works because yy_n_chars tends to be large enough for 1.5x its value to have room for our 2 EOB chars.
That said, clang-6's static analyzer does produce a warning about this ("Use of zero-allocated memory", meaning realloc with size=0) because it goes by the assumption that yy_n_chars and number_to_move could be 0.
While this is somewhat bogus, it does make sense to just put the "+ 2" here and be done with it. AFAICS, this is the correct thing to do here after all.
Signed-off-by: David Lamparter [email protected]
What is EOB? And what is your use case for needing that thing?
@Explorer09 EOB = END_OF_BUFFER, please refer to the code 10 lines below the patched location:
YY_CURRENT_BUFFER_LVALUE->yy_ch_buf[YY_G(yy_n_chars)] = YY_END_OF_BUFFER_CHAR;
YY_CURRENT_BUFFER_LVALUE->yy_ch_buf[YY_G(yy_n_chars) + 1] = YY_END_OF_BUFFER_CHAR;
https://github.com/westes/flex/pull/380/files#diff-d7ef57bcc5501f4ad873f97c94d4f759R1710
Anyone around? @Explorer09? @westes?
Did you have a specific question?
@westes yes, is there anything for me to work on regarding this PR? It's not clear to me whether I've sufficiently answered @Explorer09's questions and/or I need to do something more/else.
@eqvinox I did not see any problem when reading your patch, but I'm not sure if this thing can be tested, to check if the bug is really fixed, and to prevent regression in the future. The current skeleton code is not neat to read (someone would have to clean it up eventually), and so I can't confirm the bug without a demonstration or test case (either that it should be obvious to read the bug in the code logic, or there should be a test case to demonstrate that the bug exists). That was why I remained silent after that.
I now believe this is a potential security issue in flex-generated lexers. If yy_n_chars is 0 (initial state) and number_to_move is 1 (short input), there will be both:
- an underflow on
yy_buf_sizeto -1 - a write of 1 byte past the end of the allocated buffer in
YY_CURRENT_BUFFER_LVALUE->yy_ch_buf[YY_G(yy_n_chars) + 1] = YY_END_OF_BUFFER_CHAR;
Also, for additional context, this was mis-fixed in 09697fb269501dc0f3467e77bd4297d785c72aa8 (the subtraction is correct but new_size should've been increased at the same time) and also refer to this line:
https://github.com/westes/flex/blob/master/src/flex.skl#L1651
which correctly allocates the buffer with 2 extra bytes to hold YY_END_OF_BUFFER_CHAR.
Whether this is exploitable I don't know as I lack the relevant skills.
@westes I would strongly advise to push out a point release with this fix as soon as possible.
Please rebase and resolve conflicts.
Please rebase and resolve conflicts.
sure, here you go @westes