
How to get list item numbers in docx

Open mashagua opened this issue 1 year ago • 1 comments

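When I run

from docx2python import docx2python
doc_result = docx2python(file_path)
doc_result.body[2][0]

the result looks like this:

[['14)\t\t\tOther', '', '\t1)\t\t\tTaxes and fees. Unless the parties expressly agree otherwise, all taxes, administrative levies, charges, fees, and the like of any kind imposed by any authority or individual in connection with the use of the shop shall be borne by Party A, unless applicable law provides otherwise.', '']]

but the real content of the document is:

14. Other

14.1 Taxes and fees. Unless the parties expressly agree otherwise, all taxes, administrative levies, charges, fees, and the like of any kind imposed by any authority or individual in connection with the use of the shop shall be borne by Party A, unless applicable law provides otherwise.

How can I fix this and get the real content, with "14.1" as the number, instead of '\t1)\t\t\tTaxes and fees. Unless the parties expressly agree otherwise, all taxes, administrative levies, charges, fees, and the like of any kind imposed by any authority or individual in connection with the use of the shop shall be borne by Party A, unless applicable law provides otherwise.'?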

mashagua, Jul 15 '24 07:07

As far as I can tell, the content is the same. The difference is how docx2python converts Word's numbering and indentation formats into plain text.

I will expose numId (index to a multi-level list), ilvl (indentation depth of a numbered-list item), and number (numbering value of a numbered-list item) in docx2python v3. This functionality should be available in two or three weeks and will allow you (with a little scripting) to keep track of lists and sublists more easily.
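In the meantime, here is a rough sketch of the kind of scripting that could rebuild "14.1"-style numbers from the plain-text output shown in the question. It does not use any docx2python API. It assumes that the number of leading tabs equals the nesting depth and that every list item looks like `N)` followed by whitespace. The helper name `dotted_numbers` is made up for this example.

```python
import re

# Assumed item shape: leading tabs (one per nesting level), a number, ")", then the text.
_ITEM = re.compile(r"^(\t*)(\d+)\)\s*(.*)$", re.DOTALL)


def dotted_numbers(paragraphs):
    """Relabel tab-indented 'N)' list items with dotted numbers (hypothetical helper)."""
    counters = []  # counters[d] holds the current item number at depth d
    out = []
    for par in paragraphs:
        match = _ITEM.match(par)
        if match is None:
            out.append(par)  # not a list item: keep unchanged
            continue
        depth = len(match.group(1))
        number = int(match.group(2))
        del counters[depth:]  # forget this level and anything deeper
        counters.extend([1] * (depth - len(counters)))  # pad if a level was skipped
        counters.append(number)
        label = ".".join(str(n) for n in counters)
        if depth == 0:
            label += "."  # top-level items read "14.", nested ones "14.1"
        out.append(f"{label} {match.group(3)}")
    return out


sample = ["14)\t\t\tOther", "", "\t1)\t\t\tTaxes and fees.", ""]
print(dotted_numbers(sample))
# ['14. Other', '', '14.1 Taxes and fees.', '']
```

This only works when the list uses plain decimal numbering. Lists numbered with letters or Roman numerals would need a different pattern, and the numId and ilvl values mentioned above should be a more reliable basis once they are available.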

ShayHill, Jul 15 '24 19:07

This is now fully implemented in Docx2Python 3.0.0. See README.

ShayHill, Jul 27 '24 22:07