Tutorial on RAG: converting the fruit vector store fails - GGML_ASSERT: ...... Aborted
Following the new code on GitHub, compilation succeeds, thanks. Then, following the 'Tutorial on RAG', I used the code from GitHub to generate 'fruits.dat', and in the next step I tried the command:
' ./build/bin/main --embedding_model ./models/bge-m3/bge-m3-q4_1.bin --init_vs ./fruits.dat'
or
' ./build/bin/main --embedding_model ./models/bce-embedding/bce_emb_q8.bin --init_vs ./fruits.dat'
Then this error appears:
'ingesting... GGML_ASSERT: /mnt/d/Codes/RAG/new_pipline/cpp_project/chatllm.cpp/chatllm.cpp-master/chatllm.cpp-master/ggml/src/ggml.c:3645: view_src == NULL || data_size == 0 || data_size + view_offs <= ggml_nbytes(view_src) Aborted'
The model files were downloaded from https://modelscope.cn/models/judd2024/chatllm_quantized_models/files
How can I fix this so the tutorial runs?
Thanks
Fixed by 19f96184f38c5516f1d27be1fd2c81da8d96b04c.
OK, I applied 19f9618 and updated the code; the .vsdb is generated, but then there is a 'Segmentation fault' as follows:
cmd: ./build/bin/main --embedding_model ./models/bce-embedding/bce_emb_q8.bin --init_vs ./fruits.dat
log: ingesting... 2 / 3 done
Vector store saved to: ./fruits.dat.vsdb
Segmentation fault
Then I used this .vsdb to run the retrieval-only pipeline; it also hits a 'Segmentation fault', along with some 'nan' values:
cmd: ./build/bin/main --embedding_model ./models/bce-embedding/bce_emb_q8.bin --reranker_model ./models/bce-reranker-base_v1/q8_0.bin --vector_store ./fruits.dat.vsdb +rag_dump
log:
{"file": "2.txt"} the orange is green
Reference:
- {"file": "2.txt"}
timings: prompt eval time = 0.00 ms / 0 tokens ( -nan ms per token, -nan tokens per second)
timings: eval time = 0.00 ms / 0 tokens ( -nan ms per token, -nan tokens per second)
timings: total time = 0.00 ms / 0 tokens
Segmentation fault
This doesn't look normal; what should be done to fix it?
The nan values are normal, because nothing is generated by an LLM.
You forgot -i, which enables interactive mode.
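For clarity, here is the retrieval-only command from above with -i added (model and vector-store paths are the ones quoted earlier in this thread; the sketch only assembles and prints the command rather than running the binary):

```shell
# Retrieval-only invocation with -i (interactive mode) added.
# Paths are the ones used earlier in this thread; adjust to your setup.
cmd="./build/bin/main -i \
  --embedding_model ./models/bce-embedding/bce_emb_q8.bin \
  --reranker_model ./models/bce-reranker-base_v1/q8_0.bin \
  --vector_store ./fruits.dat.vsdb +rag_dump"
echo "$cmd"
```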
Following your suggestion, the error is solved, and the log is:
'
....
No LLM is loaded.
Augmented by BCE-Embedding (0.2B) and BCE-ReRanker (0.2B).
You > hello
A.I. > {"file": "2.txt"} the orange is green
Reference:
- {"file": "2.txt"}
You > what's fruit is green
A.I. > {"file": "2.txt"} the orange is green
Reference:
- {"file": "2.txt"} '
Thanks
By the way, if I want to compile a .so or .dll exposing the RAG chat functions, how should the CMake file for this project be changed? Will this part of the build be updated in the chatllm project?
Just build the .so / .dll as usual.
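In case it helps, a minimal, hypothetical sketch of what such a rule could look like in a CMakeLists.txt; the target name (chatllm) and source file (chat.cpp) below are placeholders, not this project's actual build configuration:

```cmake
# Hypothetical sketch -- adapt the names to the project's real CMakeLists.txt.
# Builds the chat code as a shared library (.so on Linux, .dll on Windows).
add_library(chatllm SHARED chat.cpp)
set_target_properties(chatllm PROPERTIES POSITION_INDEPENDENT_CODE ON)
```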
OK, I'm trying that now. Also, if I want to offload the ModelObject memory (e.g. embedding, reranker), what would need to be coded for that?
GPU offloading? Please stay tuned.