Burkhard Ringlein
Results: 2 issues of Burkhard Ringlein
### Motivation

In our experiments and applications, the Triton autotuner is key to achieving competitive or best performance (e.g. for [flash attention in vLLM](https://github.com/vllm-project/vllm/issues/5083)). Also, we learned that for more...
## Purpose

Very experimental draft PR so far.

## Test Plan

```
VLLM_ATTENTION_BACKEND=EXPERIMENTAL_HELION_ATTN vllm serve meta-llama/Llama-3.1-8B-Instruct
```

## Test Result

TBA
rocm
kernel
v1