matteo
I tried sending data via the SMI interface to the CaribouLite. Reading the registers on the AT86RF215 modem, I can see that the TX interface (from the RPi to the CaribouLite) is...
Now it supports cyclic transfers without packet loss.
Hello. This series of patches enables the TX feature: * FPGA firmware fix * software support * test app * examples. I had to reduce the FPGA FIFO...
This adds an environment variable for launching llama.cpp with unified memory on CUDA. This is useful when the model barely fits in VRAM and inference causes OOM errors. In that...
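A minimal sketch of such a launch, assuming the variable is named `GGML_CUDA_ENABLE_UNIFIED_MEMORY` and using placeholder model path and flags (check the PR for the exact name and semantics):

```shell
# Assumed variable name (verify against the PR): GGML_CUDA_ENABLE_UNIFIED_MEMORY.
# Unified memory lets the CUDA driver page weights between VRAM and system RAM
# instead of aborting with an OOM error when the model slightly exceeds VRAM.
GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 ./llama-cli -m model.gguf -ngl 99 -p "Hello"
```

The trade-off is that paged access is slower than resident VRAM, so this is a fallback for near-fit models rather than a general speedup.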
### What happened?

### Problem

Some models produce corrupted output when offloading to multiple CUDA GPUs. The problem disappears when offloading to a single GPU or using CPU only....
GLM4-0414 models were using the wrong legacy template, leading to a missing [gMASK] preamble. The old code returned `LLM_CHAT_TEMPLATE_GLMEDGE`. As a workaround, you needed to launch `llama-server` with `--chat-template...`
### What is the issue?

The GLM4 model uses the wrong template, causing performance degradation. Here is the relevant pull request in llama.cpp: https://github.com/ggml-org/llama.cpp/pull/13099 The reason is that the...
### Name and Version

version: 5327 (27ebfcac)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu

### Operating systems

Linux

### GGML backends

CUDA

### Hardware

NVIDIA

### Models

all

###...
This PR implements handling of additional Jinja parameters, used for example to set `enable_thinking` in Qwen3 models. The official template is still only partially compatible; I modified it to use only supported...
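If the PR exposes these parameters through the server API, a request might look like the sketch below. The `chat_template_kwargs` field name, the port, and the model name are assumptions based on the PR description, not confirmed API details:

```shell
# Hypothetical request: assumes llama-server is listening on localhost:8080 and
# that the PR adds a `chat_template_kwargs` object, forwarded to the Jinja
# template, on the OpenAI-compatible /v1/chat/completions endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen3",
        "messages": [{"role": "user", "content": "Hello"}],
        "chat_template_kwargs": {"enable_thinking": false}
      }'
```

With `enable_thinking` set to false, the template would skip emitting the model's reasoning block, matching the behavior the PR describes for Qwen3.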
### Name and Version

611aa914ef4231fab5d1ad04773c42e119ae2d2e

### Operating systems

Linux

### GGML backends

CUDA

### Hardware

NVIDIA

### Models

Qwen3

### Problem description & steps to reproduce

I don't know if...