DAN™
@ggerganov Does this mean llama.cpp could support something like the new GritLM model which can handle both text representations and text generation? I tried the embedding sample with gritlm but...
@ngxson thanks, I used the proper template. I opened an [issue](https://github.com/ggerganov/llama.cpp/issues/5783#issue-2159966946) with a sample program.
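For anyone landing here with the same question: GritLM selects between embedding and generation purely by the prompt template. Below is a minimal sketch of the two formats as described in the GritLM documentation; the helper function names are hypothetical, not from llama.cpp:

```cpp
// Sketch of the GritLM prompt formats. The helper names are made up for
// illustration; the special token strings follow the GritLM docs.
#include <string>

// Embedding mode: the prompt ends with <|embed|> and the text to embed follows.
std::string gritlm_embed_prompt(const std::string & instruction, const std::string & text) {
    return "<|user|>\n" + instruction + "\n<|embed|>\n" + text;
}

// Generation mode: the prompt ends with <|assistant|> and the model completes it.
std::string gritlm_gen_prompt(const std::string & user_msg) {
    return "<|user|>\n" + user_msg + "\n<|assistant|>\n";
}
```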
I am also getting this error. Any ideas anyone?
Hi there, I recently worked on C# bindings and a basic .NET Core project. There are two sample projects included (CLI/Web + API). It could easily be expanded with...
Same here; this only happens when offloading layers to the GPU, while running on the CPU works fine. Also, I noticed that the more GPU layers you offload, the more gibberish you get.
I tried different models and model sizes, and they all produce gibberish using GPU layers but work fine using CPU. Also, I am compiling from the latest commit on the...
> In llama.cpp line 1158 there should be:
>
> ```
> vram_scratch = n_batch * MB;
> ```
>
> Someone that is experiencing the issue please try to...
> Alright. I currently don't have CUDA installed on my Windows partition but I'll go ahead and install it to see if I can reproduce the issue. Thanks, this is...
> Did you revert the change that increases the size of the VRAM scratch buffer? In any case, since the GPU changes that I did are most likely the problem...
> > It is working as intended on my machines which all run Linux. The first step for me to make a fix is to be able to reproduce the...