Not all images are serialized

Open MikeHibbert opened this issue 6 years ago • 0 comments

On the page https://www.bbc.co.uk/news/world-africa-51063149

I ran a test and the "NEWS" logo at the top left of the page is not converted into a data URL, although the picture on the page is.

I did the following:

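import htmlark

# Fetch the page and inline its resources; ignore_errors=True skips resources that cannot be loaded.
packed_html = htmlark.convert_page('https://www.bbc.co.uk/news/world-africa-51063149', ignore_errors=True)

# Write the packed HTML to disk; the context manager closes the file.
with open('htmlark_test.html', 'w', encoding='utf-8') as f:
    f.write(packed_html)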

I looked at the documentation and it's not totally clear how I can try different parsers such as html5lib. Is the problem that I'm not using the correct parser here, or is it something else?
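
To make it easier to see which images were left untouched, a check along these lines could be run on the packed output. It is only a rough sketch using the standard library's html.parser, not part of htmlark; the names ImgSrcCollector and find_unpacked_images are made up for illustration. It only looks at the src attribute of img tags, so images referenced through srcset, CSS, or inline SVG would not show up.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are also routed here by default.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_unpacked_images(html_text):
    """Return the <img> src values that are not data URLs."""
    collector = ImgSrcCollector()
    collector.feed(html_text)
    collector.close()
    return [
        src for src in collector.sources
        if not src.strip().lower().startswith("data:")
    ]


if __name__ == "__main__":
    # Tiny made-up sample: one inlined image and one remote image.
    sample = (
        '<img src="data:image/png;base64,AAAA">'
        '<img src="https://example.com/logo.png"/>'
    )
    for src in find_unpacked_images(sample):
        print("not inlined:", src)
```

Running find_unpacked_images on the contents of htmlark_test.html would list the img sources that still point at remote files, which should show whether the logo is an img tag at all.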

Jan 10 '20 19:01 MikeHibbert