Andreas Kieslinger
I created some logs and checked how often `cuda_graph_update_required` is set to `true`. Out of 200 checks, only 9 set the variable to `true`. However, even with `cuda_graph_update_required=false`, a lot...
Hi slaren, can you share your hardware setup and how you ran these tests? Additionally, how many runs have you done per configuration? This is the first time I've seen...
@Nexesenex @IMbackK @AbdullahMPrograms for a free speedup targeting the same area as this PR, use `LLAMA_SET_ROWS` as detailed in #14482. Be assured that reducing CPU overhead is still an...
The issue appears on Windows, too.