Tutorial on RAG: converting the fruit vector store fails - GGML_ASSERT: ...... Aborted
Following the new code on GitHub, compilation succeeds, thanks. Then, following the 'Tutorial on RAG', I used the code from GitHub to generate 'fruits.dat', and in the next step I tried the command:
' ./build/bin/main --embedding_model ./models/bge-m3/bge-m3-q4_1.bin --init_vs ./fruits.dat'
or
' ./build/bin/main --embedding_model ./models/bce-embedding/bce_emb_q8.bin --init_vs ./fruits.dat'
Then this error appears:
'ingesting... GGML_ASSERT: /mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp-master/chatllm.cpp-master/ggml/src/ggml.c:3645: view_src == NULL || data_size == 0 || data_size + view_offs <= ggml_nbytes(view_src) Aborted'
The model files were downloaded from https://modelscope.cn/models/judd2024/chatllm_quantized_models/files
How can I fix this so the tutorial runs?
Thanks
Fixed by 19f96184f38c5516f1d27be1fd2c81da8d96b04c.
OK, I applied 19f9618 and updated the code; the .vsdb is generated, but then there is a 'Segmentation fault' as follows:
cmd: ./build/bin/main --embedding_model ./models/bce-embedding/bce_emb_q8.bin --init_vs ./fruits.dat
log: ingesting... 2 / 3 done
Vector store saved to: ./fruits.dat.vsdb
Segmentation fault
Then I used this .vsdb to run the retrieval-only pipeline; it also hits a 'Segmentation fault', along with some 'nan' values:
cmd: ./build/bin/main --embedding_model ./models/bce-embedding/bce_emb_q8.bin --reranker_model ./models/bce-reranker-base_v1/q8_0.bin --vector_store ./fruits.dat.vsdb +rag_dump
log:
{"file": "2.txt"} the orange is green
Reference:
- {"file": "2.txt"}
timings: prompt eval time = 0.00 ms / 0 tokens ( -nan ms per token, -nan tokens per second)
timings: eval time = 0.00 ms / 0 tokens ( -nan ms per token, -nan tokens per second)
timings: total time = 0.00 ms / 0 tokens
Segmentation fault
This doesn't look normal; what should be done to fix it?
The nan values are normal, because nothing is generated by an LLM.
You forgot -i, which enables interactive mode.
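For clarity, here is the retrieval-only command from above with -i added (model and vector-store paths are the ones quoted earlier in this thread; the sketch only assembles and prints the command rather than running the binary):

```shell
# Retrieval-only invocation with -i (interactive mode) added.
# Paths are the ones used earlier in this thread; adjust to your setup.
cmd="./build/bin/main -i \
  --embedding_model ./models/bce-embedding/bce_emb_q8.bin \
  --reranker_model ./models/bce-reranker-base_v1/q8_0.bin \
  --vector_store ./fruits.dat.vsdb +rag_dump"
echo "$cmd"
```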
Following your suggestion, the error is solved, and the log is:
'
....
No LLM is loaded.
Augmented by BCE-Embedding (0.2B) and BCE-ReRanker (0.2B).
You > hello
A.I. > {"file": "2.txt"} the orange is green
Reference:
- {"file": "2.txt"}
You > what's fruit is green
A.I. > {"file": "2.txt"} the orange is green
Reference:
- {"file": "2.txt"} '
Thanks
By the way, if I want to compile a .so or .dll exposing the RAG chat functions, how should the CMake file for this project be changed? Will this part of the build be updated in the chatllm project?
Just build the .so / .dll as usual.
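In case it helps, a minimal, hypothetical sketch of what such a rule could look like in a CMakeLists.txt; the target name (chatllm) and source file (chat.cpp) below are placeholders, not this project's actual build configuration:

```cmake
# Hypothetical sketch -- adapt the names to the project's real CMakeLists.txt.
# Builds the chat code as a shared library (.so on Linux, .dll on Windows).
add_library(chatllm SHARED chat.cpp)
set_target_properties(chatllm PROPERTIES POSITION_INDEPENDENT_CODE ON)
```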
OK, I'm trying that now. Also, if I want to offload the ModelObject memory (e.g. embedding, reranker), what would need to be coded for that?
GPU offloading? Please stay tuned.