GetInnerText behaves differently from HTML innerText for tables
Prerequisites
- [X] Can you reproduce the problem in a MWE?
- [X] Are you running the latest version of AngleSharp.Css?
- [X] Did you check the FAQs to see if that helps you?
- [X] Are you reporting to the correct repository? (there are multiple AngleSharp libraries, e.g., AngleSharp.Xml for Xml support)
- [X] Did you perform a search in the issues?
Description
When using GetInnerText, the returned result is missing the line breaks between table rows.
If I use HTML's "innerText" in a browser, the line breaks after each table row are correct.
I also tried adding a line break element after an element inside the table, but it is ignored. Everything between the table elements seems to be ignored.
Thanks a lot for the awesome project!
Steps to Reproduce
Set up a simple AngleSharp example with a configuration like the following:
IConfiguration config = Configuration
    .Default
    .WithCss(new CssParserOptions
    {
        IsToleratingInvalidSelectors = true,
        IsIncludingUnknownDeclarations = true,
        IsIncludingUnknownRules = true,
    })
    .WithRenderDevice(new DefaultRenderDevice
    {
        DeviceHeight = 768,
        DeviceWidth = 1024,
    });
Then parse the following HTML:
<html>
  <head>
  </head>
  <body>
    <h2>Test</h2>
    <table>
      <tbody>
        <tr>
        </tr>
        <tr>
          <td>Titel: </td>
          <td>Herr</td>
        </tr>
        <tr>
          <td>Vorname: </td>
          <td>Horst</td>
        </tr>
        <tr>
          <td>Nachname: </td>
          <td>Hammer</td>
        </tr>
      </tbody>
    </table>
  </body>
</html>
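The configuration and HTML above can be combined into one small helper for reproducing the problem. This is a minimal sketch, not a verified test case: the class and method names are hypothetical, and it assumes the AngleSharp and AngleSharp.Css packages are referenced and that the GetInnerText extension method is available through the AngleSharp.Dom namespace.

```csharp
using System.Threading.Tasks;
using AngleSharp;
using AngleSharp.Css;
using AngleSharp.Css.Parser;
using AngleSharp.Dom;

public static class InnerTextRepro
{
    // Hypothetical helper: parses the given HTML with the configuration
    // from this report and returns GetInnerText() of the <body> element.
    public static async Task<string> GetBodyInnerTextAsync(string html)
    {
        IConfiguration config = Configuration
            .Default
            .WithCss(new CssParserOptions
            {
                IsToleratingInvalidSelectors = true,
                IsIncludingUnknownDeclarations = true,
                IsIncludingUnknownRules = true,
            })
            .WithRenderDevice(new DefaultRenderDevice
            {
                DeviceHeight = 768,
                DeviceWidth = 1024,
            });

        var context = BrowsingContext.New(config);

        // Parse the HTML from a string instead of loading it from a URL.
        using var document = await context.OpenAsync(req => req.Content(html));

        // Assumption: GetInnerText is the AngleSharp.Css extension on IElement.
        return document.Body.GetInnerText();
    }
}
```

Calling `await InnerTextRepro.GetBodyInnerTextAsync(html)` with the HTML from this report and printing the result should make it easy to compare against the browser output.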
Expected Behavior
The result of document.body.innerText in the Chrome DevTools console:
Test
Titel: Herr
Vorname: Horst
Nachname: Hammer
Actual Behavior
The result from AngleSharp's GetInnerText:
Test
Titel: Herr Vorname: Horst Nachname: Hammer
Possible Solution / Known Workarounds
No response
An output that preserves the table rows would definitely be nice - I don't think we (at the moment) respect a display value of table.
This could certainly be improved (but I am not sure whether this is / should be classified as a bug - IIRC we pretty much follow the spec).
Oh OK. Since browsers handle tables differently, I thought this was out of spec.
But of course, feel free to change this to an improvement or feature request or something similar.
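Until the table display is taken into account, one possible workaround is to build the text row by row instead of calling GetInnerText on the whole table. The following is only a sketch under assumptions: TableText and RowsToText are hypothetical names that are not part of AngleSharp, cells are joined with a single space, and empty rows are skipped. It relies on the Rows and Cells collections of the AngleSharp table interfaces.

```csharp
using System.Linq;
using AngleSharp.Html.Dom;

public static class TableText
{
    // Hypothetical helper (not an AngleSharp API): emits one line per
    // non-empty table row, with the trimmed cell texts joined by a space.
    public static string RowsToText(IHtmlTableElement table)
    {
        var lines = table.Rows
            .Select(row => string.Join(" ",
                row.Cells.Select(cell => cell.TextContent.Trim())))
            .Where(line => line.Length > 0);

        return string.Join("\n", lines);
    }
}
```

For the HTML in this report, the table could be obtained with `document.QuerySelector("table") as IHtmlTableElement` and passed to `TableText.RowsToText`. This does not attempt to reproduce the full innerText algorithm; it only restores a line break per row.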