
[REQUEST] Implement modern attention schemes such as GQA or MLA

Open brunorigal opened this issue 4 months ago • 0 comments

Thanks for this very interesting library,

I did not see a specific implementation of Grouped-Query Attention (GQA) or Multi-head Latent Attention (MLA), both of which seem to be very popular these days.

brunorigal, Oct 02 '25 09:10