Daniel Bammer
I can confirm this behavior on gemma-3-4b-it for every attention implementation: sdpa, eager, and flash_attention_2. Llama.cpp only recently patched gemma-3 attention to use less VRAM.
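For reference, a minimal sketch of how the three backends might be compared with the transformers library. Only the model name and the attn_implementation values come from the comment; the auto class, prompt, dtype, and memory probe are illustrative assumptions (the 4B checkpoint is multimodal, so the exact loading class may differ):

```python
# Sketch: load gemma-3-4b-it once per attention backend and record peak VRAM.
# Assumes a CUDA device and recent transformers/accelerate; details are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-4b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)

for attn in ("sdpa", "eager", "flash_attention_2"):
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        attn_implementation=attn,  # backend under test
        device_map="auto",
    )
    inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
    model.generate(**inputs, max_new_tokens=32)
    print(attn, torch.cuda.max_memory_allocated() / 2**30, "GiB peak")
    # Free the model before the next run so measurements do not accumulate.
    del model
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
```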