`ImageToText` & `AnswerToImage`

Open ZanSara opened this issue 3 years ago • 0 comments

[Part of #2418]

What: A node that takes a list of paths to images and captions them. The captions are then stored as Documents, with the path to the image in their metadata. The captions will be processed as regular documents, so no radical changes are expected in the core of the framework (yet).
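A minimal sketch of the idea, not the actual node: it assumes a stand-in `Document` dataclass (not Haystack's own class), a hypothetical `image_path` metadata key, and a `captioner` callable that would wrap whatever captioning model ends up being used. The `dummy_captioner` below only exists to make the example runnable.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, List


@dataclass
class Document:
    """Stand-in for a Haystack Document; only the fields this sketch needs."""
    content: str
    meta: Dict[str, str] = field(default_factory=dict)


def image_to_text(image_paths: List[str], captioner: Callable[[str], str]) -> List[Document]:
    """Caption each image and keep the image path in the document metadata."""
    documents = []
    for path in image_paths:
        caption = captioner(path)  # hypothetical: a real node would call a captioning model
        documents.append(Document(content=caption, meta={"image_path": str(path)}))
    return documents


def dummy_captioner(path: str) -> str:
    """Placeholder captioner that only looks at the file name, not the pixels."""
    return f"An image named {Path(path).stem}"


docs = image_to_text(["photos/cat.jpg", "photos/dog.png"], dummy_captioner)
for doc in docs:
    print(doc.content, doc.meta)
```

Because the output is a list of ordinary text documents, it could be written to a document store and retrieved like any other text, which is the point made above about not touching the core of the framework.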

Why: ImageToText could be a good test of how Haystack could take images as input in indexing pipelines, and could help open the path for image support in general.

Expected results

  • A generalizable way to take images as input in Haystack and distinguish them from text documents
    • Probably FileTypeClassifier will suffice, but it could be improved by routing files not simply by extension but also by "type", i.e. whether a file is likely to contain text or is likely to be an image.
  • Special primitives for images (ImageDocument?)
  • A small node or utility function for the query side to retrieve the captioned image downstream (AnswerToImage?).
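The first and last points above could look roughly like the sketch below. It is only an illustration under stated assumptions: `classify_content` is a hypothetical helper (not part of FileTypeClassifier) that inspects the leading bytes of a file instead of its extension, and `answer_to_image` assumes answers are plain dicts carrying a hypothetical `image_path` key in their `meta`.

```python
from typing import Dict, List

# Well-known magic numbers for a few common image formats.
IMAGE_SIGNATURES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",       # JPEG
    b"GIF87a",             # GIF (1987 version)
    b"GIF89a",             # GIF (1989 version)
)


def classify_content(header: bytes) -> str:
    """Return "image", "text" or "unknown" based on the first bytes of a file."""
    if not header:
        return "unknown"
    if header.startswith(IMAGE_SIGNATURES):
        return "image"
    try:
        # Crude text check: a header cut in the middle of a multi-byte
        # character would be reported as "unknown".
        header.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "text"


def answer_to_image(answers: List[Dict]) -> List[str]:
    """Collect the image paths referenced by the answers, without duplicates."""
    paths: List[str] = []
    for answer in answers:
        path = answer.get("meta", {}).get("image_path")  # assumed metadata key
        if path and path not in paths:
            paths.append(path)
    return paths


answers = [
    {"answer": "a cat on a sofa", "meta": {"image_path": "photos/cat.jpg"}},
    {"answer": "plain text answer", "meta": {}},
    {"answer": "a sleeping cat", "meta": {"image_path": "photos/cat.jpg"}},
]
print(classify_content(b"\x89PNG\r\n\x1a\n\x00\x00"))
print(answer_to_image(answers))
```

In practice the header would come from reading the first few bytes of each file, and a dedicated file-type detection library would likely cover many more formats than this short signature table.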

These changes should be added as separate PRs.

ZanSara, Apr 21 '22 15:04