Programmatically start and load pipelines
Hey,
thank you so much for hayhooks. I really love it.
On the one hand, I like the idea of deploying pipelines dynamically, though mostly for experimenting and letting other devs play with it. On the other hand, I'm a developer and I would like to ship a container image whose pipelines are registered in code, with all regular Python allowed: relative imports, third-party imports, etc.
The current solution copies the pipeline_wrapper.py files into an internal runtime directory, so relative imports are not possible.
I've seen https://docs.haystack.deepset.ai/docs/hayhooks#running-programmatically, but I can't see anyone registering pipelines in code there. It seems to be the same workflow, where the .py files are copied.
I'd appreciate having something like this:
import hayhooks
from .pipelines import IndexPipeline, QueryPipeline
app = hayhooks()
app.register("index", IndexPipeline)
app.register("query", QueryPipeline)
app.run()
Is it possible?
Thanks a lot
Hi @DaAitch,
I think this is a very valuable feature request! Currently, the programmatic usage of Hayhooks is mostly meant for things like adding your own auth middleware or some custom routes on top. It will always look for pipelines_dir at startup and load pipelines from files.
Adding fully programmatic operations on pipeline wrappers (deploy/undeploy) could definitely be a nice addition.
BTW, for your use case, have you tried mounting a volume in your Docker container with the pipelines you need to ship (here's an example) and pointing Hayhooks at it with the HAYHOOKS_PIPELINES_DIR setting? That way they will be loaded at startup.
Hey @mpangrazzi,
yes I do have a running Docker image with hayhooks.
Because I need an additional custom component, GitlabFetcher, which is based on another package (python-gitlab), I need Python's normal import logic.
In my case, I could solve it by using a Python project and a Python base image that starts hayhooks, with the code I want to share installed as a dependency of that project. So I can simply do from mypackage import GitlabFetcher.
To wrap it up: as of now, pipeline wrappers can never do a relative import, so if you want to share code, you need to roll your own dependency and add it as a package.
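To illustrate the point above, here is a minimal, standard-library-only sketch of why a wrapper file loaded by path cannot use relative imports, while an absolute import of an importable package works. It does not use Hayhooks at all and only assumes that the wrapper is loaded as a standalone file; the file names wrapper_relative.py and wrapper_absolute.py and the helper load_by_path are made up for the demo.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_by_path(name: str, path: Path):
    """Load a single .py file as a top-level module, i.e. without a parent package."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo() -> dict:
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # Shared code shipped as a package (stands in for an installed dependency).
        pkg = root / "mypackage"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("class GitlabFetcher:\n    pass\n")

        # Two hypothetical wrapper files: one relative import, one absolute import.
        relative = root / "wrapper_relative.py"
        relative.write_text("from .helpers import something\n")
        absolute = root / "wrapper_absolute.py"
        absolute.write_text("from mypackage import GitlabFetcher\n")

        # A file loaded by path has no parent package, so the relative import fails.
        try:
            load_by_path("wrapper_relative", relative)
            results["relative"] = "ok"
        except ImportError as exc:
            results["relative"] = f"ImportError: {exc}"

        # The absolute import works as soon as the package is importable.
        sys.path.insert(0, str(root))
        importlib.invalidate_caches()
        try:
            module = load_by_path("wrapper_absolute", absolute)
            results["absolute"] = module.GitlabFetcher.__name__
        finally:
            sys.path.remove(str(root))
            sys.modules.pop("mypackage", None)
    return results


if __name__ == "__main__":
    for kind, outcome in demo().items():
        print(f"{kind}: {outcome}")
```

In a real image, the sys.path manipulation would be replaced by installing the shared package into the environment that runs hayhooks.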
A future Hayhooks version could provide an additional bootstrapping mechanism, like the one I described in my first post, or something more composable where both features can coexist:
Starting it like this
hayhooks --bootstrap "entry.py:bootstrap"
Bootstrap hayhooks
# entry.py
from .pipelines import MyPipeline
def bootstrap(hayhooks):
    hayhooks.register_pipeline(MyPipeline)
I'm not yet very pythonic, but maybe I'm not too wrong with the idea 😄
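For what it's worth, the loading side of such a bootstrap option could be sketched roughly like this. Nothing here is existing Hayhooks API: PipelineRegistry, resolve_entry_point and run_bootstrap are hypothetical names, and the sketch assumes the spec is given as "module:function" (a module path rather than a file path), so that the entry module is imported through Python's normal import system and relative imports inside a package keep working.

```python
import importlib
from typing import Callable, Dict, Optional


class PipelineRegistry:
    """Hypothetical stand-in for the object that would be passed to bootstrap()."""

    def __init__(self) -> None:
        self.pipelines: Dict[str, type] = {}

    def register_pipeline(self, pipeline_cls: type, name: Optional[str] = None) -> None:
        # Default to the class name when no explicit name is given.
        key = name or pipeline_cls.__name__
        if key in self.pipelines:
            raise ValueError(f"pipeline {key!r} is already registered")
        self.pipelines[key] = pipeline_cls


def resolve_entry_point(spec: str) -> Callable:
    """Resolve a 'package.module:function' string to the callable it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    # A regular import, so the module keeps its package context.
    module = importlib.import_module(module_name)
    func = getattr(module, attr)
    if not callable(func):
        raise TypeError(f"{spec!r} does not point to a callable")
    return func


def run_bootstrap(spec: str, registry: Optional[PipelineRegistry] = None) -> PipelineRegistry:
    """Import the entry point and let it register its pipelines."""
    if registry is None:
        registry = PipelineRegistry()
    resolve_entry_point(spec)(registry)
    return registry
```

With an importable package mypackage containing entry.py, a call such as run_bootstrap("mypackage.entry:bootstrap") would invoke bootstrap with the registry, and file-based loading from a pipelines directory could keep running alongside it.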