Peter Kim
Results: 4 repositories owned by Peter Kim
flash-attention-minimal (511 stars, 42 forks)
Flash Attention in ~100 lines of CUDA (forward pass only)
mixed-precision-from-scratch (16 stars, 0 forks)
Mixed precision training from scratch with Tensors and CUDA
paged-attention-minimal (17 stars, 1 fork)
A minimal cache manager for PagedAttention, on top of llama3.