Support Pydantic Serialization for CloudPath
CloudPaths are supported as fields in Pydantic BaseModels, but when I try to serialize them I get a PydanticSerializationError: Unable to serialize unknown type.
I think it would be an easy fix to add. Note that Path serialization is supported.
Reproducible code snippet:
from pydantic import BaseModel
from cloudpathlib import CloudPath

class SomeModel(BaseModel):
    field_a: CloudPath

SomeModel(field_a="s3://bucket/key").model_dump_json()
Fix required: in the function linked below, add a serialization argument to the no_info_after_validator_function call:
https://github.com/drivendataorg/cloudpathlib/blob/28f1d943078517262d045ae50136edf27a5298a1/cloudpathlib/cloudpath.py#L1587-L1598
i.e.
return core_schema.no_info_after_validator_function(
    cls.validate,
    core_schema.any_schema(),
    serialization=core_schema.plain_serializer_function_ser_schema(
        lambda x: str(x),
        return_schema=core_schema.str_schema(),
    ),
)
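To show the effect of that change without needing cloud credentials, here is a minimal sketch using a stand-in class. StandInPath is hypothetical and is not part of cloudpathlib; it only mirrors the shape of the proposed schema, assuming Pydantic v2.

```python
from typing import Any

from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema


class StandInPath:
    """Hypothetical stand-in for CloudPath; not part of cloudpathlib."""

    def __init__(self, uri: str) -> None:
        self.uri = uri

    def __str__(self) -> str:
        return self.uri

    @classmethod
    def validate(cls, value: Any) -> "StandInPath":
        # Accept an existing instance, or build one from anything string-like.
        return value if isinstance(value, cls) else cls(str(value))

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        return core_schema.no_info_after_validator_function(
            cls.validate,
            core_schema.any_schema(),
            # The proposed addition: serialize the value via str().
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda x: str(x),
                return_schema=core_schema.str_schema(),
            ),
        )


class SomeModel(BaseModel):
    field_a: StandInPath


dumped = SomeModel(field_a="s3://bucket/key").model_dump_json()
print(dumped)  # {"field_a":"s3://bucket/key"}
```

Without the serialization argument, the same model_dump_json() call fails with the PydanticSerializationError described above.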
An easy workaround for now is to annotate the field with a PlainSerializer:
from typing import Annotated

from pydantic import PlainSerializer, BaseModel
from cloudpathlib import CloudPath

class SomeModel(BaseModel):
    field_a: Annotated[CloudPath, PlainSerializer(lambda x: str(x))]

SomeModel(field_a="s3://bucket/key").model_dump_json()
Let me know if you agree with adding this. I only just started using the library, but I think it's a pretty easy fix, so I can probably help contribute.
That seems reasonable, thanks @mattijsdp.
One question: will library users still be able to override the str serialization if they want? For example, they may prefer getting the http url or a presigned url at serialization time rather than the string representation. If so, let's add an example to the docs with that as well.
@pjbull yes, you can customise serialization. I added an example to the docs. Let me know what you think.
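For readers of this thread, a rough sketch of what such an override can look like (this is not the example from the docs). It assumes Pydantic v2 and uses a hypothetical StandInPath class with an assumed as_url() method in place of a real CloudPath, so it runs without cloud access. An Annotated PlainSerializer on the field replaces the default str serialization.

```python
from typing import Annotated, Any

from pydantic import BaseModel, GetCoreSchemaHandler, PlainSerializer
from pydantic_core import core_schema


class StandInPath:
    """Hypothetical stand-in for an S3 CloudPath; not part of cloudpathlib."""

    def __init__(self, uri: str) -> None:
        self.uri = uri

    def __str__(self) -> str:
        return self.uri

    def as_url(self) -> str:
        # Assumed mapping from s3://bucket/key to a plain HTTPS URL.
        bucket, _, key = self.uri[len("s3://"):].partition("/")
        return f"https://{bucket}.s3.amazonaws.com/{key}"

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        # Default behaviour: validate from a string, serialize via str().
        return core_schema.no_info_after_validator_function(
            lambda v: v if isinstance(v, cls) else cls(str(v)),
            core_schema.any_schema(),
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda x: str(x),
                return_schema=core_schema.str_schema(),
            ),
        )


class DefaultModel(BaseModel):
    field_a: StandInPath


class UrlModel(BaseModel):
    # The annotation overrides the type's built-in str serialization.
    field_a: Annotated[
        StandInPath, PlainSerializer(lambda p: p.as_url(), return_type=str)
    ]


default_json = DefaultModel(field_a="s3://bucket/key").model_dump_json()
url_json = UrlModel(field_a="s3://bucket/key").model_dump_json()
print(default_json)  # {"field_a":"s3://bucket/key"}
print(url_json)      # {"field_a":"https://bucket.s3.amazonaws.com/key"}
```

The override is per field, so other models using the same type keep the default string output.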
Thanks @mattijsdp! This will be in the next release.
Thank you!