quantization topic
KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
TFMQ-DM
[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
KGySoft.Drawing.Tools
Debugger visualizers and image editor apps built on KGy SOFT Drawing Libraries
fast_yolov7_pytorch
Uses pruning and quantization algorithms to accelerate YOLOv7 inference.
pratical-llms
A collection of hands-on notebooks for LLM practitioners
BitNetMCU
Neural networks with low-bit weights on low-end 32-bit microcontrollers such as the CH32V003 RISC-V microcontroller and others
quantizr
Fast library for converting RGBA images to 8-bit palette images. Written in Rust; can be used in C programs
transformer_bcq
BCQ tutorial for transformers
picollm
On-device LLM Inference Powered by X-Bit Quantization
MI-optimize
mi-optimize is a versatile tool designed for the quantization and evaluation of large language models (LLMs). The library's seamless integration of various quantization methods and evaluation techniqu...