html2openxml

Unordered list transformed into ordered list

Open • ericadcg opened this issue 5 years ago • 8 comments

Hello. I have an unordered list (<ul>) in my HTML, and it is turned into an ordered list when the Word document is generated.

I checked the code, and both <ul> and <ol> go through ProcessNumberingList and BeginList. I tried writing a separate BeginList for <ul> instead of <ol>, but it did not work properly.

Any ideas?

ericadcg, Oct 02 '20 11:10

Hi,

I'm having exactly the same problem...

@onizet any ideas?

Thanks!

nascimento3, Oct 06 '20 08:10

Do you use a fresh new document or insert into an existing one? I ran the Demo project and do not see any issues regarding <ul>. Could you post more details for troubleshooting?

onizet, Oct 06 '20 21:10

Thank you for your reply. The problem happens when there is a list inside a list, for example:

<table>
    <tbody>
		<tr>
			<th>
			Header
			</th>
		</tr>
		<tr>
			<td> 
				<ul>
					<li>This is a test item - First List
						<ul>
							<li>Second List</li>
						</ul>
					</li>
				</ul>
			</td>
		</tr>	
	</tbody>
</table>

Without the HTML shown above, everything is fine. With it, that list itself renders correctly, but all subsequent unordered lists are rendered as ordered. Example:

<table>
    <tbody>
		<tr>
			<th>
			Header
			</th>
		</tr>
		<tr>
			<td> 
				<ul>
					<li>This is a test item - First List
						<ul>
							<li>Second List</li>
						</ul>
					</li>
				</ul>
			</td>
		</tr>	
	</tbody>
</table>


<table>
  <tbody>
	<tr>
		<td>Header 2</td>
	</tr>
    <tr>
		<td>
			<ul>
				<li>Test</li>
			</ul>
			<table>
				<tr>
					<td style='padding-left: 30px;'>Test 2 </td>
				</tr>
			</table>
		</td>
	</tr>
  </tbody>
</table>

Here is an image of the Word output for the HTML above:

[Image: WordOutputExample]

ericadcg, Oct 07 '20 13:10

Here is a simpler example (with fewer tables involved):

<html>
    <body>
<p>Paragraph<p>
<ul>
	<li> This is a test item </li>
	<ul>
		<li> Second List</li>
	</ul>	
</ul>


<p>Paragraph 2</p>
<ul>
	<li>New ul list - item 1</li>
	<li>item 2</li>
</ul>
	
</body>
</html>

And this is the Word output for the HTML above: [Image: WordOutputExample2]

Also, I'm using a fresh new document.

Thank you again!

ericadcg, Oct 07 '20 13:10

A possible fix, in NumberingListStyleCollection:

public void EndList(bool popInstances = true)
{
    levelDepth--;

    if (levelDepth > 0 && popInstances)
        numInstances.Pop();  // decrement for nested list

    firstItem = true;
}

dhavalgajera, Sep 08 '21 12:09

I ran into this issue as well while throwing together a couple of different lists and nested lists to test out the HtmlConverter. For some reason, all lists end up styled as ordered lists. This may be a different bug, but it sounds like the same issue. Here is the example HTML and an example resulting paragraph. Thanks for the work on this library.

<ul class="alternate" type="square">\n\t
  <li>dfdf\n\t
    <ul class="alternate" type="square">\n\t\t
      <li>dfdfa</li>\n\t
    </ul>\n\t
  </li>\n\t
  <li>dfd</li>\n
</ul>
<br/>\n
<ul>\n\t
  <li>foof\n\t
    <ul>\n\t\t
      <li>fkfjf</li>\n\t
    </ul>\n\t
  </li>\n\t
  <li>adsf</li>\n
</ul>
<br/>\n
<ol>\n\t
  <li>num 1</li>\n\t
  <li>num 2</li>\n
</ol>\n

Example resulting paragraph:

<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:pPr>
	<w:pStyle w:val="ListParagraph"/>
	<w:numPr>
		<w:ilvl w:val="0"/>
		<w:numId w:val="2"/>
	</w:numPr>
  </w:pPr>
  <w:r>
	<w:t xml:space="preserve">dfdf</w:t>
  </w:r>
</w:p>

sbowler, Sep 15 '21 23:09

So I now think this has to do with how I use the converter: I convert the HTML and then inject the result into another document. I think that target document is then missing the numbering style ID that the parser tries to add, so the list defaults to a numbered list or something.
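One way to check this hypothesis is to look up which number format the w:numId and w:ilvl from the paragraph XML above actually resolve to in the target document's numbering part. The following is only a minimal sketch and is not part of html2openxml: NumberingInspector and ResolveNumberFormat are hypothetical names, and the lookup assumes the usual chain (w:num, then w:abstractNumId, then w:abstractNum, then w:lvl, then w:numFmt). It ignores w:lvlOverride and style-linked numbering.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of html2openxml or the Open XML SDK.
public static class NumberingInspector
{
    // WordprocessingML main namespace, the same one declared on the <w:p> element.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the w:numFmt value (for example "bullet" or "decimal") that a
    // w:numId / w:ilvl pair resolves to, or null if any link in the chain is missing.
    public static string ResolveNumberFormat(XDocument numbering, int numId, int ilvl)
    {
        // Step 1: w:num maps the numId to an abstract definition.
        int? abstractId = numbering.Descendants(W + "num")
            .Where(n => (int?)n.Attribute(W + "numId") == numId)
            .Select(n => (int?)n.Element(W + "abstractNumId")?.Attribute(W + "val"))
            .FirstOrDefault();
        if (abstractId == null)
            return null; // the numId is not defined in this numbering part

        // Step 2: the abstract definition holds one w:lvl per indent level.
        return numbering.Descendants(W + "abstractNum")
            .Where(a => (int?)a.Attribute(W + "abstractNumId") == abstractId)
            .Elements(W + "lvl")
            .Where(l => (int?)l.Attribute(W + "ilvl") == ilvl)
            .Select(l => (string)l.Element(W + "numFmt")?.Attribute(W + "val"))
            .FirstOrDefault();
    }
}
```

With the Open XML SDK, the XDocument could presumably be loaded from the stream of mainPart.NumberingDefinitionsPart. A null result for the numId seen in the paragraph would support the "missing definition" theory, whereas a result of "decimal" would point at the converter assigning the wrong definition.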

Update: I refactored the code and now use the final document's main part for the conversion, so the converter should be able to add the matching styles and other items to that document. However, the document and list formatting still look the same, with the wrong numbering format.

sbowler, Sep 16 '21 15:09

I finally got this in a state that seems to be working properly. Here is the gist of the code if it helps anyone having similar issues. In my case I was looking for certain things and then wanting to replace that paragraph with the HTML converted stuff. I've simplified it a bit for more generic code.

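using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using HtmlToOpenXml;

using (WordprocessingDocument package = WordprocessingDocument.Open(outputFilePath, true))
{
  MainDocumentPart mainPart = package.MainDocumentPart;
  if (mainPart == null)
  {
    // You may just want to create an empty document here for your case
    throw new Exception("Document appears to be empty");
  }

  // Build the converter on the main part of the document that receives the content.
  HtmlConverter converter = new HtmlConverter(mainPart);

  // ToList() takes a snapshot first: the loop inserts and removes nodes,
  // which is unsafe while Descendants() is still being enumerated lazily.
  foreach (var paragraph in mainPart.Document.Body.Descendants<DocumentFormat.OpenXml.Wordprocessing.Paragraph>().ToList())
  {
    if (paragraph.InnerText.Contains("FOO"))
    {
      // Insert in reverse so the converted nodes end up in their original order.
      foreach (var node in converter.Parse(foo.HtmlStep).Reverse())
      {
        paragraph.InsertAfterSelf(new Paragraph(node.OuterXml));
      }

      // Drop the placeholder paragraph once its replacement is in place.
      paragraph.Remove();
    }
  }

  mainPart.Document.Save();
}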

sbowler, Sep 17 '21 18:09