text wrapped at 78 characters ignoring body_width setting and partially ignoring -b option
I am using html2text version 2020.1.16 with Python 3.8.1. html2text wraps the text at 78 characters regardless of the setting of BODY_WIDTH in config.py or body_width in the parser. I set BODY_WIDTH = 0 in config.py, then:
html = 'This line is longer than seventy-eight characters. It seems to be getting wrapped with backslash n line breaks at seventy-eight characters regardless of the body_width setting in config.py or in the parser.'
len(html)
228
plainmd = html2text.html2text(html)
plainmd
'This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy-eight characters regardless\nof the body_width setting in config.py or in the parser.\n\n'
len('This line is longer than seventy-eight characters. It seems to be getting')
77
parser = html2text.HTML2Text()
parser.body_width = 0
parser.body_width
0
plainmd = html2text.html2text(html)
plainmd
'This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy-eight characters regardless\nof the body_width setting in config.py or in the parser.\n\n'
parser.body_width = 22
parser.body_width
22
plainmd = html2text.html2text(html)
plainmd
'This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy-eight characters regardless\nof the body_width setting in config.py or in the parser.\n\n'
parser.body_width = 99
parser.body_width
99
plainmd
'This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy-eight characters regardless\nof the body_width setting in config.py or in the parser.\n\n'
However, at the command line, it wraps on the screen at the value specified with -b, but still puts a newline character in after 78 characters:
$ cat /tmp/test.html
This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy-eight characters regardless\nof the body_width setting in config.py or in the parser.
$
$ wc -l /tmp/test.html; wc -c /tmp/test.html
1 /tmp/test.html
217 /tmp/test.html
$
$ python3.8 -m html2text -b 0 /tmp/test.html
This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy-eight characters regardless\nof the body_width setting in config.py or in the parser.
$
$ echo 'This line is longer than seventy-eight characters. It seems to be getting' | wc -c
78
$
$ python3.8 -m html2text -b 22 /tmp/test.html
This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy- eight characters regardless\nof the body_width setting in config.py or in the parser.
$ python3.8 -m html2text -b 99 /tmp/test.html
This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy-eight characters regardless\nof the body_width setting in config.py or in the parser.
$
You never use the parser on which you set the body width. You call html2text.html2text(html) directly each time, so you ran the same function three times with the same default settings. Call parser.handle(html) on the configured parser instead:
In [1]: import html2text
In [2]: html2text.__version__
Out[2]: (2020, 1, 16)
In [3]: html = 'This line is longer than seventy-eight characters. It seems to be getting wrapped with backslash n line breaks at seventy-eight characters regardless of the body_width setting in config.py or in the parser.'
In [4]: parser = html2text.HTML2Text()
In [5]: parser.body_width = 0
In [6]: parser.handle(html)
Out[6]: 'This line is longer than seventy-eight characters. It seems to be getting wrapped with backslash n line breaks at seventy-eight characters regardless of the body_width setting in config.py or in the parser.\n'
In [7]: parser = html2text.HTML2Text(bodywidth=0)
In [8]: parser.handle(html)
Out[8]: 'This line is longer than seventy-eight characters. It seems to be getting wrapped with backslash n line breaks at seventy-eight characters regardless of the body_width setting in config.py or in the parser.\n'
# The module-level function uses the default body width, so this output is wrapped as expected
In [9]: html2text.html2text(html)
Out[9]: 'This line is longer than seventy-eight characters. It seems to be getting\nwrapped with backslash n line breaks at seventy-eight characters regardless of\nthe body_width setting in config.py or in the parser.\n\n'
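
To make the difference between the two call styles concrete, here is a minimal sketch of the pattern using only the standard library. ToyConverter, convert and DEFAULT_BODY_WIDTH are hypothetical names invented for this illustration; this is not html2text's actual source, only an assumption about the general shape of a module-level shortcut that builds its own object with default settings.

```python
import textwrap

# Stand-in for the default width discussed in this thread (an assumption
# for the sketch, not read from the library).
DEFAULT_BODY_WIDTH = 78


class ToyConverter:
    """Hypothetical stand-in for a parser object with a per-instance width."""

    def __init__(self, body_width=DEFAULT_BODY_WIDTH):
        self.body_width = body_width

    def handle(self, text):
        # A width of 0 means "do not wrap".
        if not self.body_width:
            return text + "\n"
        wrapped = textwrap.fill(text, self.body_width, break_long_words=False)
        return wrapped + "\n"


def convert(text):
    """Module-level shortcut: builds a fresh converter with default settings."""
    return ToyConverter().handle(text)


text = ("This line is longer than seventy-eight characters. " * 3).strip()

parser = ToyConverter()
parser.body_width = 0

unwrapped = parser.handle(text)  # uses the configured instance
wrapped = convert(text)          # never sees `parser`, so the default applies
```

Setting an attribute on `parser` only affects calls made through `parser`. As far as I recall, html2text.html2text() also accepts a bodywidth argument, so html2text.html2text(html, bodywidth=0) may be another option, but check the signature in your installed version.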