Allow attributes to be collections
Currently collection types are not allowed for attributes. This is logical for the default serializers, but a user might override the default behavior with a custom validator that returns a collection. For example:
from typing import Annotated, Any
import pydantic_xml as pxml
from pydantic import BeforeValidator
from pydantic.functional_serializers import PlainSerializer
def validate_space_separated_attr(value: str) -> list[str]:
    return value.split(" ")

def serialize_space_separated_attr(values: list[Any]) -> str:
    return " ".join(str(x) for x in values)
SpaceSeparatedValueListAttr = Annotated[list[str], BeforeValidator(validate_space_separated_attr), PlainSerializer(serialize_space_separated_attr)]
class Person(pxml.BaseXmlModel):
    children: SpaceSeparatedValueListAttr = pxml.attr()
    name: str = pxml.element()
doc = """
<Person children="Bob Eve">
<name>Alice</name>
</Person>
"""
alice = Person.from_xml(doc)
print(alice.children) # prints ['Bob', 'Eve']
print(alice.to_xml())
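The validator and serializer pair can also be checked on its own, without pydantic-xml, using pydantic's `TypeAdapter`. This is a minimal sketch assuming pydantic v2; the names `adapter`, `parsed`, and `dumped` are illustrative and not part of this PR.

```python
from typing import Annotated, Any

from pydantic import BeforeValidator, TypeAdapter
from pydantic.functional_serializers import PlainSerializer


def validate_space_separated_attr(value: str) -> list[str]:
    # Runs before pydantic's own list validation, so it receives the raw string.
    return value.split(" ")


def serialize_space_separated_attr(values: list[Any]) -> str:
    # Join the items back into a single attribute string.
    return " ".join(str(x) for x in values)


SpaceSeparatedValueListAttr = Annotated[
    list[str],
    BeforeValidator(validate_space_separated_attr),
    PlainSerializer(serialize_space_separated_attr),
]

# Hypothetical stand-alone check of the annotated type.
adapter = TypeAdapter(SpaceSeparatedValueListAttr)
parsed = adapter.validate_python("Bob Eve")
dumped = adapter.dump_python(parsed)

print(parsed)
print(dumped)
```

If the round trip works here, any remaining failure in the model is on the pydantic-xml side rather than in the annotated type itself.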
Instead of disallowing collection types outright, this change parses such attributes as a string and leaves the rest to the user. This might not be a good final solution: it would be nicer to check whether custom validator logic is present and raise an error when it is not. However, I do not see an easy way to add such a check.
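To illustrate the hand-off described above, here is a sketch that uses only the standard library. It is not pydantic-xml's internal code; it only shows that an XML attribute arrives as one plain string and that splitting and joining it is left to user code. The names `raw`, `children`, and `out` are illustrative.

```python
import xml.etree.ElementTree as ET

doc = '<Person children="Bob Eve"><name>Alice</name></Person>'
root = ET.fromstring(doc)

# The attribute value is always a single string at the XML level.
raw = root.get("children")

# User-supplied logic turns the string into a collection...
children = raw.split(" ")

# ...and back into a string when the document is written out again.
root.set("children", " ".join(children))
out = ET.tostring(root)

print(children)
print(out)
```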
What do you think? I at least got stuck parsing XML documents that contain space-separated lists in attributes.
@jorants Hi,
For now you can define a custom schema:
from typing import Annotated
import pydantic_xml as pxml
from pydantic_core import core_schema as cs
class SpaceSeparatedValueListSchema:
    @classmethod
    def __get_pydantic_core_schema__(cls, source_type, handler):
        schema = cs.no_info_after_validator_function(lambda val: val.split(' '), cs.str_schema())
        serialization = cs.plain_serializer_function_ser_schema(lambda lst: ' '.join(lst))
        return cs.json_or_python_schema(json_schema=schema, python_schema=schema, serialization=serialization)

class Person(pxml.BaseXmlModel):
    children: Annotated[list[str], SpaceSeparatedValueListSchema] = pxml.attr()
    name: str = pxml.element()
doc = """
<Person children="Bob Eve">
<name>Alice</name>
</Person>
"""
alice = Person.from_xml(doc)
print(alice.children) # prints ['Bob', 'Eve']
print(alice.to_xml()) # prints b'<Person children="Bob Eve"><name>Alice</name></Person>'
I suppose this collection attribute implementation is not very intuitive, since it only works if both a validator and a serializer are provided. In my opinion, code like this should work too:
class Person(pxml.BaseXmlModel):
    children: list[str] = pxml.attr()
    name: str = pxml.element()
but it fails with a misleading error.