quantization topic
hicolor
🎨 Convert images to 15/16-bit RGB color with dithering
rwkv.cpp
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
gptq_for_langchain
A guide to using GPTQ models with LangChain
BabyGPT
Somewhere between Karpathy's minGPT model and video lectures, BabyGPT is an easy-to-use model on a much smaller scale (16 and 256 out channels, 5 heads, fine-tuned). To be made useful on l...
Lightweight-Low-Resource-NMT
Official code for "Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models" to appear in WMT 2022.
autoencoder_based_image_compression
Autoencoder based image compression: can the learning be quantization independent? https://arxiv.org/abs/1802.09371
image-optimizer
Optimize any image with chroma subsampling and optimized Huffman coding in Python. Basically, using the JPEG algorithm!
takeoff-community
TitanML Takeoff Server is an optimization, compression, and deployment platform that makes state-of-the-art machine learning models accessible to everyone.
PB-LLM
PB-LLM: Partially Binarized Large Language Models
awesome-approximate-dnn
Curated content for DNN approximation, acceleration ... with a focus on hardware accelerators and deployment