
Results 24 comments of DAN™

@ggerganov Does this mean llama.cpp could support something like the new GritLM model, which can handle both text representations and text generation? I tried the embedding sample with gritlm but...

@ngxson thanks, I used the proper template. I opened an [issue](https://github.com/ggerganov/llama.cpp/issues/5783#issue-2159966946) with a sample program.

I am also getting this error. Any ideas anyone?

Hi there, I recently worked on C# bindings and a basic .NET Core project. There are two sample projects included (CLI/Web + API). It could easily be expanded with...

Same here: this only happens when offloading layers to the GPU; running on the CPU works fine. Also, I noticed that the more GPU layers you offload, the more gibberish you get.

I tried different models and model sizes, and they all produce gibberish when using GPU layers but work fine on the CPU. Also, I am compiling from the latest commit on the...

> In llama.cpp line 1158 there should be:
>
> ```
> vram_scratch = n_batch * MB;
> ```
>
> Someone who is experiencing the issue please try to...

> Alright. I currently don't have CUDA installed on my Windows partition but I'll go ahead and install it to see if I can reproduce the issue. Thanks, this is...

> Did you revert the change that increases the size of the VRAM scratch buffer? In any case, since the GPU changes that I did are most likely the problem...

> > It is working as intended on my machines, which all run Linux. The first step for me to make a fix is to be able to reproduce the...