Migrate to Pydantic V2

Open anakin87 opened this issue 2 years ago • 1 comments

Pydantic V2 was recently released.

The new version was not compatible with Haystack: here you can see the errors that led to pinning pydantic<2 and releasing Haystack 1.18.1.

We should migrate to V2 for several reasons, including:

  • Performance (Pydantic V2 is 5-50x faster than Pydantic V1)
  • Safety & maintainability

Let's see how painful it is :smiley: and what changes the migration requires...

anakin87 avatar Jul 10 '23 09:07 anakin87

The migration turned out to be more difficult and longer than I thought. Talking with @ZanSara, we decided that we can migrate to Pydantic V2 in Haystack v2.

Learnings

  • :exclamation: in V2, Pydantic dataclasses generate their own constructor and don't support a custom __init__ method. (https://github.com/pydantic/pydantic/issues/5298) Currently, we use Pydantic dataclasses with custom __init__ in Document and Label.
  • however, migrating the data model (schema.py) was feasible
  • the JSON schema generation and validation proved hard to migrate, as we are using several Pydantic abstractions that have been removed/renamed, along with several Python internals.
  • The bump-pydantic official tool did almost nothing (effective) on our codebase

anakin87 avatar Jul 11 '23 15:07 anakin87

When is Haystack V2 scheduled for?

mjspeck avatar Sep 28 '23 20:09 mjspeck

Hey @mjspeck, thanks for your interest...

We do not yet have an official date for Haystack 2.0, but we are sharing all the progress in this discussion and also on Discord. Feel free to participate!

anakin87 avatar Sep 28 '23 22:09 anakin87

I appreciate that, but since 2.0 seems far away, I'm hoping that migration to pydantic 2.0 could happen before it. I know you said that bump-pydantic didn't help, but did you try updating pydantic and using the v1 submodule in your exploration? You should be able to stay with the v1 API by just adding v1 to all your imports (e.g. from pydantic.v1 import BaseModel). That would allow users to add haystack to environments that require v2 (which is the case for a project I'm working on), while requiring minimal code changes on your end.

mjspeck avatar Sep 28 '23 23:09 mjspeck

When I tried, importing from pydantic.v1 was apparently not available.

It would be nice to test it, but I don't know when that will be feasible for us.

If you want to give it a try and open a PR if it works, you are welcome. If it works, the best option is probably to have a conditional import, so as not to break things for those still using Pydantic 1.
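A minimal sketch of what such a conditional import could look like, in case it helps whoever picks this up. The helper name import_first_available is hypothetical and not part of Haystack or Pydantic; the only assumptions are that the V1 API lives under pydantic.v1 when that namespace exists and under plain pydantic otherwise.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def import_first_available(candidates: Sequence[str]) -> Optional[ModuleType]:
    """Hypothetical helper: return the first module that imports, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError too: try the next candidate.
            continue
    return None


# Prefer the V1 compatibility namespace, then fall back to the
# top-level package for environments that only have Pydantic 1.
pydantic_v1 = import_first_available(["pydantic.v1", "pydantic"])

if pydantic_v1 is not None:
    # The rest of the code base would keep using the V1-style BaseModel.
    BaseModel = pydantic_v1.BaseModel
```

The same effect can be had with a plain try/except ImportError around the two import statements; the helper only keeps the fallback order in one place.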

Related discussion: https://github.com/tiangolo/fastapi/discussions/9966

anakin87 avatar Sep 28 '23 23:09 anakin87

@silvanocerza please share your opinion on this...

anakin87 avatar Sep 29 '23 16:09 anakin87

I believe this is not a top priority as of now. We're focusing on releasing Haystack 2.x by the end of the year and we don't have enough resources to update 1.x to Pydantic v2.

It's not an easy task, as it will require updating tons of libraries, and that can come with a new set of bugs and problems that we'd have to fix.

I would very much prefer that we stick with Pydantic v1 to avoid unnecessary trouble for now.

silvanocerza avatar Sep 29 '23 17:09 silvanocerza