
ChunkEmbeddings still referenced in 0.19.0-0.19.2 resulting in failure to start embedding container

Open toliver38 opened this issue 1 year ago • 3 comments

Just configured a new setup for 0.19.1 and attempted to spin up the containers.

All containers seem to start up without issues except trustgraph-store-doc-embeddings-1.

It appears to hit an error with Milvus and use of ChunkEmbeddings.

Traceback (most recent call last):
  File "/usr/local/bin/de-write-milvus", line 3, in <module>
    from trustgraph.storage.doc_embeddings.milvus import run
  File "/usr/local/lib/python3.12/site-packages/trustgraph/storage/doc_embeddings/milvus/__init__.py", line 2, in <module>
    from . write import *
  File "/usr/local/lib/python3.12/site-packages/trustgraph/storage/doc_embeddings/milvus/write.py", line 6, in <module>
    from .... schema import ChunkEmbeddings
ImportError: cannot import name 'ChunkEmbeddings' from 'trustgraph.schema' (/usr/local/lib/python3.12/site-packages/trustgraph/schema/__init__.py). Did you mean: 'GraphEmbeddings'?

It appears the references to ChunkEmbeddings were replaced elsewhere, but maybe they didn't get replaced in the doc_embeddings code.

#235 appears to show some of the code changes impacting this going back to 0.19.0
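A quick way to confirm the diagnosis from inside the failing container is to ask Python directly whether the schema package still exports the name. This is only a minimal sketch: `exports_name` is a hypothetical helper, not part of trustgraph, and the module and class names are taken from the traceback above.

```python
import importlib


def exports_name(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`.

    Hypothetical helper for illustration; not part of trustgraph.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Names taken from the traceback; run this inside the container image.
    for name in ("ChunkEmbeddings", "GraphEmbeddings"):
        print(name, exports_name("trustgraph.schema", name))
```

If ChunkEmbeddings reports False while GraphEmbeddings reports True, the installed package matches the ImportError above.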

toliver38 avatar Jan 02 '25 14:01 toliver38

Same error with qdrant on 0.19.3

  File "/usr/local/bin/de-write-qdrant", line 3, in <module>
    from trustgraph.storage.doc_embeddings.qdrant import run
  File "/usr/local/lib/python3.12/site-packages/trustgraph/storage/doc_embeddings/qdrant/__init__.py", line 2, in <module>
    from . write import *
  File "/usr/local/lib/python3.12/site-packages/trustgraph/storage/doc_embeddings/qdrant/write.py", line 11, in <module>
    from .... schema import ChunkEmbeddings
ImportError: cannot import name 'ChunkEmbeddings' from 'trustgraph.schema' (/usr/local/lib/python3.12/site-packages/trustgraph/schema/__init__.py). Did you mean: 'GraphEmbeddings'?

vbaz avatar Jan 03 '25 07:01 vbaz

@vbaz I've been informed by @cybermaggedon that the refactoring taking place for 0.19.x included a move to GraphEmbeddings, so any of the doc embeddings components will fail.

The easiest workaround at this stage is:

  1. (Ignore this one, as you are using qdrant.) Don't use Milvus - this code hasn't been completely refactored as of 0.19.2.
  2. Remove any references to the store-doc-embeddings container. It will not run and will just consume resources until it has failed 100 restarts.

These two should allow you to get the systems up and running.
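For anyone scripting the second step, the removal can be sketched on an already-parsed Compose file. This assumes the file has been loaded into a plain dict (with whatever YAML loader you use, not shown here) and that the service is keyed by a name containing store-doc-embeddings; `drop_services` and the example data are hypothetical, not taken from the generated trustgraph config.

```python
import copy


def drop_services(compose: dict, needle: str) -> dict:
    """Return a copy of a parsed Compose mapping without any service
    whose name contains `needle`. Hypothetical helper for illustration."""
    result = copy.deepcopy(compose)
    services = result.get("services", {})
    removed = [name for name in services if needle in name]
    for name in removed:
        del services[name]
    # Drop depends_on entries that point at removed services.
    # Compose allows both a list form and a mapping form.
    for svc in services.values():
        deps = svc.get("depends_on")
        if isinstance(deps, list):
            svc["depends_on"] = [d for d in deps if d not in removed]
        elif isinstance(deps, dict):
            svc["depends_on"] = {k: v for k, v in deps.items() if k not in removed}
    return result


# Made-up example data; image names are placeholders.
example = {
    "services": {
        "store-doc-embeddings": {"image": "example/image"},
        "store-graph-embeddings": {
            "image": "example/image",
            "depends_on": ["store-doc-embeddings"],
        },
    }
}
cleaned = drop_services(example, "store-doc-embeddings")
print(sorted(cleaned["services"]))
```

The function works on a deep copy, so the original mapping stays available if the service needs to be restored once the fix lands.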

https://discord.com/channels/1251652173201149994/1324374806514241609 - Discord discussion if you need

toliver38 avatar Jan 03 '25 09:01 toliver38

https://github.com/trustgraph-ai/trustgraph/issues/245 has (hopefully) fixed this, but there's a ton more testing to do.

At time of writing, this is fixed for the Qdrant / Cassandra combination.

  • The de-write- and de-query- processes start and sit there idle.
  • There is no document RAG pipeline unless the document-rag component is deployed. There will be an option in the Config tool at some point soon.
  • With document-rag deployed, the de- processes do the right thing and offer a Document RAG pipeline.

cybermaggedon avatar Jan 05 '25 18:01 cybermaggedon