quantization topic
mmrazor
OpenMMLab Model Compression Toolbox and Benchmark.
Easy-Translate
Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible for beginners and as seamless and customizable as possible for a...
faster-whisper
Faster Whisper transcription with CTranslate2
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Qbot
[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ qbot-mini: https://github.com/Cha...
AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
minigpt4.cpp
Port of MiniGPT4 in C++ (4-bit, 5-bit, 6-bit, 8-bit, 16-bit CPU inference with GGML)
TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
gpu_poor
Calculate tokens/s and GPU memory requirements for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization