4onen comments

Results: 13 comments by 4onen

Parsing URLs with IPv6 host

Appears to be a bug. Line 167 of url/Url.elm indexes the first `:` in the host of the URL and, if it detects more than one `:`, assumes the URL...
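As a rough illustration of why counting `:` characters is not enough, here is a minimal Python sketch of splitting a host from a port when the host may be a bracketed IPv6 literal. This is not the code in url/Url.elm; the function name `split_host_port` and its error handling are hypothetical and chosen only for this example.

```python
from typing import Optional, Tuple


def split_host_port(authority: str) -> Tuple[str, Optional[int]]:
    """Split 'host[:port]' where host may be a bracketed IPv6 literal.

    Hypothetical helper for illustration only.
    """
    if authority.startswith("["):
        # IPv6 literal: every ':' before the closing ']' belongs to the host.
        end = authority.find("]")
        if end == -1:
            raise ValueError("unterminated IPv6 literal")
        host = authority[: end + 1]
        rest = authority[end + 1:]
        if rest == "":
            return host, None
        if not rest.startswith(":"):
            raise ValueError("unexpected text after IPv6 literal")
        port_text = rest[1:]
    else:
        # Non-bracketed host: at most one ':' is allowed, separating the port.
        host, sep, port_text = authority.partition(":")
        if not sep:
            return host, None
    # This sketch rejects empty or non-numeric ports.
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("port must be decimal digits")
    return host, int(port_text)
```

With this approach, `split_host_port("[::1]:8080")` keeps the colons inside the brackets as part of the host and treats only the colon after `]` as the port separator.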

llama-bench : Add `--override-tensors` arg

Sketchy performance comparison on my laptop to show why `--override-tensors` helps MoE models. I set longer context lengths than the standard `llama-bench` to emphasize why keeping the attention operations on...

llama-bench : Add `--override-tensors` arg

PR #12891 has resolved my issue running flash attention and override-tensors with DeepSeek-V2-Lite. Some performance numbers for that, same hardware as my last set: **CPU Only** (Used 0.8GB of VRAM...

llama-bench : Add `--override-tensors` arg

Ran another set of experiments on another device (RTX 3070 and an AMD Ryzen 7 5800X 8-Core with two sticks of 2133MHz DDR4) **CPU Only** (Used 836MB of VRAM during...

llama-bench : Add `--override-tensors` arg

Got it @slaren. As for splitting the test grid entries, would you prefer that I use semicolons instead of commas the same way that we do for tensor split? Or...

llama-bench : Add `--override-tensors` arg

I've implemented the behaviour the same way as tensor-split, for now. That is, `;` is now the internal separator for different overrides and `,` is now the separator between test...
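The two-level separator scheme described above can be sketched in Python as follows. This is only an illustration of the splitting order, not the actual `llama-bench` implementation (which is C++); the function name `parse_override_grid` and the placeholder override strings are assumptions made for the example.

```python
from typing import List


def parse_override_grid(value: str) -> List[List[str]]:
    """Split an argument value into test-grid entries and their overrides.

    Hypothetical helper: ',' separates test-grid entries, and ';' separates
    the individual overrides inside one entry. Each override is kept as an
    opaque string; its internal format is not interpreted here.
    """
    return [
        # Drop empty pieces so a bare entry means "no overrides for this run".
        [override for override in entry.split(";") if override]
        for entry in value.split(",")
    ]


# Placeholder override strings, used only to show the grouping.
grid = parse_override_grid("p1=B1;p2=B2,p3=B1")
print(grid)  # [['p1=B1', 'p2=B2'], ['p3=B1']]
```

Here the first test-grid entry carries two overrides and the second carries one, which mirrors how `;` and `,` divide the work in the comment above.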

llama-bench : Add `--override-tensors` arg

I understand now why all of the other functions in that file were marked `static`. I'll see if I can get my Linux desktop up and make sure I run...

llama-bench : Add `--override-tensors` arg

All I can say is the CPU CI ran to completion on my Ubuntu 22.04 machine with no errors I was aware of. I'll try to take a look at...

llama-bench : Add `--override-tensors` arg

Tried the Vulkan CI (because I can't run the CUDA CI on my desktop with my nvcc, apparently) and that failed on an unused parameter in a file my change...

llama-bench : Add `--override-tensors` arg

Adding `CMAKE_CUDA_ARCHITECTURES=86` (for the 3070 in my desktop) resulted in the same message. It's possible that my driver and NVCC CUDA versions are desynced, as `nvidia-smi` reports CUDA version 12.7....