Feature: support for multimodal models

Open massi-ang opened this issue 1 year ago • 1 comments

Testing multimodal-capable models like Claude 3 and Idefics requires new input nodes that can feed images into the prompt nodes.

Apr 12 '24 18:04 massi-ang

Yeah, this is on the roadmap for sure. I don't have much time lately, so if anyone is reading this and wants to take a shot at it, please do!

Two challenges from a design standpoint:

How to add images as options to existing nodes without cluttering the interface.
How to handle images inside Tabular Data (e.g., can we load images inside spreadsheets? Is there a standard image database format? Can we autodetect local URLs and fetch the image? etc.)
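
To make the autodetection question concrete, here is a rough sketch of how a single Tabular Data cell could be classified as an image reference or plain text. It is only an illustration under stated assumptions: the names `classify_cell`, `to_data_uri`, and `IMAGE_EXTENSIONS` are hypothetical and not part of the existing codebase, and the heuristic relies on file extensions only, so it does not recognize Windows drive paths or extensionless URLs.

```python
import base64
import mimetypes
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical allow-list; a real implementation might sniff file contents instead.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def classify_cell(value: str) -> str:
    """Return 'data_uri', 'remote_url', 'local_path', or 'text' for one cell."""
    text = value.strip()
    if text.startswith("data:image/"):
        return "data_uri"
    # Treat anything containing whitespace as ordinary prompt text,
    # so "What is in cat.png" is not mistaken for a path.
    if any(ch.isspace() for ch in text):
        return "text"
    parsed = urlparse(text)
    suffix = Path(parsed.path).suffix.lower()
    if suffix not in IMAGE_EXTENSIONS:
        return "text"
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "remote_url"
    if parsed.scheme in ("", "file"):
        return "local_path"
    return "text"


def to_data_uri(path: str) -> str:
    """Read a local image and encode it as a base64 data URI."""
    mime, _ = mimetypes.guess_type(path)
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{payload}"
```

With something like this, a column could be treated as an image column when most of its cells classify as something other than `text`, which would avoid adding a separate control to the interface. Whether remote URLs should be fetched automatically is a separate question (privacy, latency) that this sketch does not address.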

One might argue we just create a new node, MultiModalFields or something, but this might just clutter things.

Apr 15 '24 17:04 ianarawjo