
When using StatelessExecutor, llama_new_context logs to console every time InferAsync is called

Open andymartin opened this issue 2 years ago • 6 comments

llama_new_context_with_model: n_ctx = 7168
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: offloading v cache to GPU
llama_kv_cache_init: offloading k cache to GPU
llama_kv_cache_init: VRAM kv self = 8680.00 MiB
llama_new_context_with_model: kv self size = 8680.00 MiB
llama_build_graph: non-view tensors processed: 1430/1430
llama_new_context_with_model: compute buffer total size = 78.56 MiB
llama_new_context_with_model: VRAM scratch buffer: 75.50 MiB
llama_new_context_with_model: total VRAM used: 22148.73 MiB (model: 13393.23 MiB, context: 8755.50 MiB)

With the other executors, these logs appear only when the model is loaded. With StatelessExecutor they are printed every time InferAsync is called. The executor also seems to ignore the ILogger passed into its constructor: passing NullLogger.Instance has no effect on this behavior.

andymartin avatar Dec 13 '23 23:12 andymartin

The reason seems to be that a new context is loaded every time InferAsync is called. Thank you for reporting this bug to us; we'll fix it soon.

SanftMonster avatar Dec 14 '23 10:12 SanftMonster

Newbie question: is there a way to prevent LLamaSharp/llama.cpp from logging these values to the console? I have read the docs, but I wasn't able to answer the question myself.

chatbuildcontact avatar Dec 21 '23 04:12 chatbuildcontact

I believe you can do something like this:

NativeApi.llama_log_set((level, message) =>
{
    // This will be called when llama.cpp wants to log a message. Do whatever you like!
    Console.WriteLine($"[{level}]: {message}");
});

martindevans avatar Dec 21 '23 14:12 martindevans

> I believe you can do something like this:
>
> NativeApi.llama_log_set((level, message) =>
> {
>     // This will be called when llama.cpp wants to log a message. Do whatever you like!
>     Console.WriteLine($"[{level}]: {message}");
> });

Thanks for that pointer!

chatbuildcontact avatar Dec 22 '23 14:12 chatbuildcontact

> we'll fix it soon.

What's the planned fix here? As far as I'm aware we want to create a new context every time, so that it's truly stateless. Or should we try to re-use the context, but clean up all the state between inferences?

martindevans avatar Jan 07 '24 17:01 martindevans

Hello, I just found this amazing library and am adapting it to my needs. I also have to create a new context every time, and I would like to suppress the output logs.

I would use something like this, as pointed out earlier in this thread:

NativeLogConfig.llama_log_set(NullLogger.Instance);
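
For reference, a minimal sketch of silencing the native logs this way (assuming LLamaSharp's NativeLogConfig API is available and the Microsoft.Extensions.Logging.Abstractions package is referenced):

```csharp
using LLama.Native;
using Microsoft.Extensions.Logging.Abstractions;

// Route all llama.cpp native log output to a no-op ILogger.
// Call this once at startup, before any model or context is created,
// so later llama_new_context_with_model messages never reach the console.
NativeLogConfig.llama_log_set(NullLogger.Instance);
```

The CLIP/llava messages below seem to come through a different code path than the main llama.cpp log hook, which may be why this call does not silence them.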

I use llava model (vision), and I can still see the output every time I create a new context:

encode_image_with_clip: 4 segments encoded in 700.78 ms
encode_image_with_clip: image embedding created: 2304 tokens
encode_image_with_clip: image encoded in 729.47 ms by CLIP (0.32 ms per image patch)
llava_eval_image_embed: failed to eval

I would love to switch it off somehow.

micoraweb avatar Aug 02 '24 15:08 micoraweb