pydantic-xml

How to set the DOCTYPE on an exported model?

Open dromer opened this issue 10 months ago • 3 comments

It is very common, and often mandatory, to set the DOCTYPE in the prolog of an XML file: https://en.wikipedia.org/wiki/Document_type_declaration

I couldn't find anything in the documentation on how to set this. Is it already possible, or is this now a feature request?

I assume this would be specific to the to_xml() output.

dromer avatar Apr 09 '25 09:04 dromer

For now I solved it by loading the resulting XML into an lxml etree and setting .docinfo.system_url, but it would be great if we could do this directly on the model or via an argument to to_xml().
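A minimal sketch of that kind of workaround, assuming lxml is installed. The helper name `add_doctype` and the `model.dtd` URL are made up for illustration and are not part of pydantic-xml:

```python
from typing import Optional

from lxml import etree


def add_doctype(
    xml_bytes: bytes,
    system_url: str,
    public_id: Optional[str] = None,
) -> bytes:
    """Re-parse already serialized XML and attach a DOCTYPE via lxml's docinfo."""
    root = etree.fromstring(xml_bytes)
    tree = root.getroottree()
    # lxml uses the root element's tag as the DOCTYPE name.
    tree.docinfo.system_url = system_url
    if public_id is not None:
        tree.docinfo.public_id = public_id
    # Serialize the tree (not the root element) so the DOCTYPE is included.
    return etree.tostring(tree)


raw = b'<Model attr="attr value"><element>element value</element></Model>'
result = add_doctype(raw, system_url="model.dtd")
print(result.decode())
```

The cost of this approach is a second parse of the output, which is why a hook on the model itself would be nicer.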

dromer avatar Apr 09 '25 11:04 dromer

@dromer Hi,

The library is supposed to support both the lxml and the standard-library xml.etree backends. Unfortunately, xml.etree doesn't support docinfo, so I can't see any universal way to implement this feature.

If you only use lxml, the simplest way to support it is to override the to_xml method:

from typing import Any, Optional, Union

import pydantic_xml as pxml
from lxml import etree


class CustomXmlModel(pxml.BaseXmlModel):
    def to_xml(
            self,
            *,
            skip_empty: bool = False,
            exclude_none: bool = False,
            exclude_unset: bool = False,
            doc_public_id: Optional[str] = None,
            doc_system_url: Optional[str] = None,
            **kwargs: Any,
    ) -> Union[str, bytes]:
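        # Wrap the root element in an ElementTree so that docinfo is available (lxml only).
        tree = etree.ElementTree(
            self.to_xml_tree(skip_empty=skip_empty, exclude_none=exclude_none, exclude_unset=exclude_unset),
        )
        # Assigning to docinfo creates a DOCTYPE declaration even when the value is None,
        # so only set the identifiers that were actually passed.
        if doc_public_id is not None:
            tree.docinfo.public_id = doc_public_id
        if doc_system_url is not None:
            tree.docinfo.system_url = doc_system_url

        return etree.tostring(tree, **kwargs)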


class Model(CustomXmlModel):
    attr: str = pxml.attr()
    element: str = pxml.element()



model = Model(attr="attr value", element="element value")
xml = model.to_xml(
    doc_public_id='-//W3C//DTD XHTML 1.1//EN',
    doc_system_url='http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd',
    pretty_print=True,
)

print(xml.decode())

Output:

<!DOCTYPE Model PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<Model attr="attr value">
  <element>element value</element>
</Model>
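For the standard-library xml.etree backend, which has no docinfo, one possible fallback is to build the DOCTYPE line by hand and prepend it to the serialized output. This is only a sketch: `build_doctype` and `serialize_with_doctype` are hypothetical helpers, not pydantic-xml API, and it assumes a root tag without a namespace and identifiers that contain no double quotes:

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_doctype(
    root_name: str,
    public_id: Optional[str] = None,
    system_url: Optional[str] = None,
) -> str:
    """Format a DOCTYPE declaration for the given root element name."""
    if public_id is not None:
        # The XML grammar requires a system literal after a PUBLIC identifier.
        if system_url is None:
            raise ValueError("a PUBLIC identifier requires a system URL")
        return f'<!DOCTYPE {root_name} PUBLIC "{public_id}" "{system_url}">'
    if system_url is not None:
        return f'<!DOCTYPE {root_name} SYSTEM "{system_url}">'
    return f"<!DOCTYPE {root_name}>"


def serialize_with_doctype(
    root: ET.Element,
    public_id: Optional[str] = None,
    system_url: Optional[str] = None,
) -> str:
    """Serialize an element and put a DOCTYPE line in front of it."""
    body = ET.tostring(root, encoding="unicode")
    return f"{build_doctype(root.tag, public_id, system_url)}\n{body}"


root = ET.Element("Model", attr="attr value")
ET.SubElement(root, "element").text = "element value"
text = serialize_with_doctype(root, system_url="model.dtd")
print(text)
```

With a pydantic-xml model, the element returned by to_xml_tree() could presumably be passed in place of the hand-built root.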

dapper91 avatar Apr 19 '25 16:04 dapper91

@dapper91 thank you for the suggestion!

I'll give this a try next week :)

dromer avatar Apr 19 '25 16:04 dromer