Crash and out-of-bounds write with some test case
With some input and html2text built with "-fsanitize=undefined", I get:
Area.cpp:529:17: runtime error: index 5 out of bounds for type 'char [5]'
Area.cpp:529:19: runtime error: store to address 0x7fff43794c70 with insufficient space for an object of type 'char'
0x7fff43794c70: note: pointer points here
ff ff ff 00 39 37 bc 4e 4d ee fa bd 30 bc fd 01 00 00 00 00 40 4d 79 43 ff 7f 00 00 69 2d 38 eb
At the moment this occurs only rarely, in a cron job environment, so I have no test case yet.
However, compiling html2text with "-fanalyzer -fno-lto" produces about 10000 lines of output with several scary-looking issues. I can attach this output to the report if desired.
Yeah, this is an unbounded loop, which is a bit silly now that I think of it.
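For illustration, here is a minimal sketch of what a bounded version of such a loop could look like. This is not the code at Area.cpp:529: the function `encode_utf8` and its use of UTF-8 encoding are assumptions, and the 5-byte buffer is chosen only to match the 'char [5]' in the sanitizer message. The number of bytes is fixed before the loop starts, so the index cannot reach 5 and the shift count cannot go negative.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, NOT html2text's actual code: encode one code point
// as UTF-8 using a fixed 5-byte buffer (4 data bytes + room for a NUL).
// Surrogate code points are not filtered out in this sketch.
std::string encode_utf8(std::uint32_t cp)
{
    if (cp > 0x10FFFF)
        return "?";  // would need more than 4 bytes, so refuse up front

    char buf[5] = {};
    std::size_t n = 0;

    if (cp < 0x80) {
        buf[n++] = static_cast<char>(cp);
    } else {
        // Number of continuation bytes, decided before any byte is written.
        const int extra = cp < 0x800 ? 1 : (cp < 0x10000 ? 2 : 3);
        static const unsigned char lead[] = {0x00, 0xC0, 0xE0, 0xF0};
        buf[n++] = static_cast<char>(lead[extra] | (cp >> (6 * extra)));

        // The shift counts down from 6 * (extra - 1) to 0 and stops there;
        // the index check is a second guard against overrunning buf.
        for (int shift = 6 * (extra - 1);
             shift >= 0 && n < sizeof buf - 1;
             shift -= 6) {
            buf[n++] = static_cast<char>(0x80 | ((cp >> shift) & 0x3F));
        }
    }
    return std::string(buf, n);
}
```

Calling `encode_utf8(0x20AC)` in this sketch yields the three bytes E2 82 AC. An out-of-range value such as 0x110000 is rejected before the buffer is touched.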
Not sure this is exactly the same issue, but it comes from the same data source as the above crash. The following test case produces different content each time inside the brackets after the word "Organization":
wget -nv --no-cache "https://tgftp.nws.noaa.gov/tgstatus/" -O gaga.html
cat gaga.html | html2text -nobs > a.txt
cat gaga.html | html2text -nobs > b.txt
diff a.txt b.txt
20c20
< Site_Map News Organization [`?? ] [Search]
---
> Site_Map News Organization [`/I ] [Search]
Oh, I just noticed that I had cut off part of the error message in the original report. The full message is:
Area.cpp:529:17: runtime error: index 5 out of bounds for type 'char [5]'
Area.cpp:529:19: runtime error: store to address 0x7fffca942ee0 with insufficient space for an object of type 'char'
0x7fffca942ee0: note: pointer points here
ff ff ff 00 39 37 bc 4e 4d ee fa bd 30 fc 41 02 00 00 00 00 b0 2f 94 ca ff 7f 00 00 69 2d 38 eb
^
Area.cpp:527:15: runtime error: shift exponent -1 is negative
Do you have a test file for this, perhaps? I just fixed a bunch of use-after-free issues; maybe those help to resolve this.
This seems fixed on my end.