Sicheng Stephen Jia
Summary: ## Context By default, storage buffers in Vulkan must contain 32-bit data types; using 8-bit and 16-bit data types in buffers can be enabled optionally by...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #5844
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #5799 * #5831 * #5830 ## Context As title, this diff adds an implementation for a fused SDPA + KV-Cache update operator...
### Name and Version ```shell $ build-vk/bin/llama-cli --version ggml_vulkan: Found 1 Vulkan devices: ggml_vulkan: 0 = NVIDIA PG509-210 (NVIDIA) | uma: 0 | fp16: 1 | bf16: 0 | warp...
Original commit message: Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #15829 * #15796 * #15795 * #15794 * #15793 Title says it all! Adds `int32` and `uint8` shader variants...