quantization topic
oreilly-pytorch-dl
Code for Deep Learning for Modern AI
flux-fp8-api
Flux diffusion model implementation that uses quantized fp8 matmul, with the remaining layers using faster half-precision accumulate; ~2x faster on consumer devices.
ao
PyTorch native quantization and sparsity for training and inference
Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
RaBitQ
[SIGMOD 2024] RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search
coursera-mlops-specialization
Coursera Machine Learning Engineering for Production Specialization Course
nunchaku
[ICLR 2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
ComfyUI-nunchaku
ComfyUI Plugin of Nunchaku
GERM
[ICML 2025] Fast and Low-Cost Genomic Foundation Models via Outlier Removal.
Production-Ready-Instruction-Finetuning-of-Meta-Llama-3.2-3B-Instruct-Project
Instruction Fine-Tuning of Meta Llama 3.2-3B Instruct on Kannada Conversations. Tailoring the model to follow specific instructions in Kannada, enhancing its ability to generate relevant, context-awar...