versatile-data-kit Deploy a RAG pipeline to a VDK cluster

What is the feature request? What problem does it solve?

A RAG pipeline will be built as part of https://github.com/vmware/versatile-data-kit/issues/3147.

The goal of this ticket is to get that pipeline rewritten as a VDK data job. Only a single data job. Use VDK for the Postgres URL etc.
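A minimal sketch of what "use VDK for the Postgres URL" could look like inside a Python step, assuming the URL is stored as a data job property. The property key `postgres_url` and the helper `resolve_postgres_url` are hypothetical names chosen for illustration; in a real job the `job_input` argument would be VDK's `IJobInput`, of which only `get_property` is used here.

```python
from typing import Any, Protocol


class PropertiesReader(Protocol):
    """The single method of the VDK job input that this sketch relies on."""

    def get_property(self, name: str, default_value: Any = None) -> Any: ...


def resolve_postgres_url(job_input: PropertiesReader) -> str:
    # "postgres_url" is a hypothetical property key, not one defined by VDK.
    url = job_input.get_property("postgres_url")
    if not url:
        raise ValueError("Data job property 'postgres_url' is not set")
    return str(url)


def run(job_input: PropertiesReader) -> None:
    # VDK calls run(job_input) for each Python step of a data job.
    postgres_url = resolve_postgres_url(job_input)
    # ... the RAG pipeline steps would connect using postgres_url here ...
```

Keeping the lookup in a small helper means the same step file works on any cluster, as long as the property is set for the job there.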

Feb 21 '24 13:02 murphp15

Which VDK cluster? We can use the SuperCollider. In the CICD cluster there's no UI.

Feb 21 '24 14:02 antoniivanov

@antoniivanov it would be brilliant to have a UI. But will we be able to build big images on their cluster?

Is there any way you can think of to get the UI working quickly on the CICD cluster?

Feb 21 '24 14:02 murphp15

It is fine to use SuperCollider for this.

We can install the sentence-transformers dependency dynamically (and remove it from requirements.txt).

https://github.com/vmware/versatile-data-kit/issues/3142#issuecomment-1956762676
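One way the dynamic install could be sketched, assuming the job is allowed to run `pip` at execution time. This is not necessarily the approach in the linked comment; `ensure_installed` and `pip_install` are hypothetical helper names.

```python
import importlib
import importlib.util
import subprocess
import sys
from typing import Callable


def pip_install(package: str) -> None:
    # Install into the same interpreter that is running the job.
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


def ensure_installed(
    module_name: str,
    package: str,
    installer: Callable[[str], None] = pip_install,
) -> bool:
    """Install `package` only if top-level `module_name` is not importable.

    Returns True if an installation was triggered, False otherwise.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    installer(package)
    # Make the freshly installed package visible to the import system.
    importlib.invalidate_caches()
    return True
```

A step would call `ensure_installed("sentence_transformers", "sentence-transformers")` before importing the library. The trade-off is a smaller image at the cost of a slower and network-dependent job start.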

Feb 22 '24 11:02 antoniivanov

But that will make VDK look so weak!

Feb 22 '24 11:02 murphp15