Migrate to Pydantic V2
Pydantic V2 was recently released.
The new version was not compatible with Haystack:
here you can see the errors that led to pinning pydantic<2 and releasing Haystack 1.18.1.
We should migrate to V2 for several reasons, including:
- Performance (Pydantic V2 is 5-50x faster than Pydantic V1)
- Safety & maintainability
Let's see how painful it is :smiley: and what changes the migration requires...
The migration turned out to be harder and to take longer than I expected. After talking with @ZanSara, we decided to migrate to Pydantic V2 in Haystack v2.
Learnings
- :exclamation: In V2, Pydantic dataclasses generate their own constructor and don't support a custom `__init__` method (https://github.com/pydantic/pydantic/issues/5298). Currently, we use Pydantic dataclasses with a custom `__init__` in `Document` and `Label`.
- However, migrating the data model (schema.py) was feasible.
- The JSON schema generation and validation proved hard to migrate, as we are using several Pydantic abstractions that have been removed or renamed, along with several Python internals.
- The official bump-pydantic tool had almost no effect on our codebase.
When is Haystack V2 scheduled for?
Hey @mjspeck, thanks for your interest...
We do not yet have an official date for Haystack 2.0, but we are sharing all the progress in this discussion and also on Discord. Feel free to participate!
I appreciate that, but since 2.0 seems far away, I'm hoping that migration to pydantic 2.0 could happen before it. I know you said that bump-pydantic didn't help, but did you try updating pydantic and using the v1 submodule in your exploration? You should be able to stay with the v1 API by just adding v1 to all your imports (e.g. from pydantic.v1 import BaseModel). That would allow users to add haystack to environments that require v2 (which is the case for a project I'm working on), while requiring minimal code changes on your end.
When I tried, importing from pydantic.v1 was apparently not available.
It would be nice to test it, but I don't know when that will be feasible for us.
You are welcome to give it a try and open a PR if it works. In that case, the best option is probably a conditional import, so as not to break things for those still using Pydantic 1.
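A minimal sketch of what such a conditional import could look like, assuming Pydantic 2.x exposes the V1 API under `pydantic.v1` while on Pydantic 1.x the top-level `pydantic` package is the V1 API. The helper names `import_first_available` and `load_pydantic_v1_api` are hypothetical and not part of Haystack or Pydantic:

```python
import importlib
from types import ModuleType


def import_first_available(*module_names: str) -> ModuleType:
    """Return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Module (or submodule) is missing: try the next candidate.
            continue
    raise ImportError(f"None of these modules could be imported: {module_names}")


def load_pydantic_v1_api() -> ModuleType:
    """Hypothetical helper: return a module exposing the Pydantic V1 API."""
    # With Pydantic 2.x installed, "pydantic.v1" is found first.
    # With Pydantic 1.x installed, it may be missing, so we fall back to "pydantic".
    return import_first_available("pydantic.v1", "pydantic")
```

With something like this in place, code could call `pydantic_v1 = load_pydantic_v1_api()` once and then use `pydantic_v1.BaseModel` everywhere, instead of importing from `pydantic` directly. Whether the V1 compatibility layer covers everything Haystack relies on would still need to be tested.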
Related discussion: https://github.com/tiangolo/fastapi/discussions/9966
@silvanocerza please share your opinion on this...
I believe this is not a top priority as of now. We're focusing on releasing Haystack 2.x by the end of the year and we don't have enough resources to update 1.x to Pydantic v2.
It's not an easy task, as it will require updating tons of libraries, and that can come with a new set of bugs and problems that we'd have to fix.
I would very much prefer that we stick with Pydantic v1 to avoid unnecessary trouble for now.