Christian Konrath


**Here it is:** [ollama.log](https://github.com/user-attachments/files/18609359/ollama.log)

```
root@ki: ollama ps
NAME                               ID              SIZE     PROCESSOR    UNTIL
qwen2.5-coder:32b-instruct-q8_0    f37bbf27ec01    54 GB    100% GPU     8 seconds from now
root@ki: rocm-smi
========================= ROCm System Management Interface...
```

I removed the second and third GPUs from my system, and now it runs stably. With only one GPU, my qwen2.5-coder:32b-instruct-q8_0 produces about 4 tokens per second - even...

Hi all, just a quick update: I reconnected all **three AMD 7900XTX GPUs on my Ubuntu 24.04 minimal** setup (_just ran the Ollama install script, no additional...

> How many tokens can you get when running deepseek-R1:70b-llama-distill_q8_0 using three 7900xtx?

70 billion parameters are too many for 72 GB of VRAM, which makes it slow, with CPU/GPU utilization...
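As a rough sanity check (my own back-of-the-envelope numbers, not measurements), q8_0 stores roughly one byte per weight, so the weights of a 70B model alone nearly fill the 72 GB of combined VRAM on three 7900 XTX cards, before any KV cache or runtime overhead:

```python
# Back-of-the-envelope VRAM estimate.
# Assumption: q8_0 ~= 1 byte per weight (the real format is closer to
# 8.5 bits/weight plus per-block metadata, so this slightly underestimates).
params = 70e9                       # 70B parameters
weights_gb = params * 1.0 / 1e9     # ~70 GB for the weights alone
vram_gb = 3 * 24                    # three 7900 XTX at 24 GB each = 72 GB
headroom_gb = vram_gb - weights_gb  # ~2 GB left for KV cache and overhead
print(weights_gb, vram_gb, headroom_gb)
```

With only ~2 GB of headroom, part of the model and the KV cache spill to system RAM, which explains the CPU/GPU split and the low token rate.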

This is NOT a duplicate!

**vs #17453:**
- #17453: Terminal UI becomes slow/laggy during typing and execution
- **This issue:** Complete memory corruption; ALL context data shows 0 tokens...

Hi @KindEmily, thanks for the detailed debug logs and for confirming this is still happening in 2.1.5! Your observation about the "Summarizing all X messages (~0 tokens)" entries is really...

Update: v2.1.12 seems to have fixed this issue. I can confirm it's working stably now.