quantization topic

List quantization repositories

brocolli

330 Stars, 62 Forks

Everything in Torch FX

optimum-intel

338 Stars, 93 Forks

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

qonnx

100 Stars, 32 Forks

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX

cdma-simulation

28 Stars, 5 Forks

CDMA communication system implemented in MATLAB.

llama2gptq

30 Stars, 0 Forks

Chat with LLaMa 2 and get responses that cite reference documents retrieved from a vector database. Runs a locally available model using GPTQ 4-bit quantization.

mixtral-offloading

2.3k Stars, 228 Forks

Run Mixtral-8x7B models in Colab or consumer desktops

llama.onnx

344 Stars, 31 Forks

LLaMa/RWKV ONNX models, quantization, and test cases

tidy

124 Stars, 13 Forks

Offline semantic text-to-image and image-to-image search on Android, powered by a quantized, state-of-the-art pretrained vision-language CLIP model and the ONNX Runtime inference engine

quantized_neural_networks

20 Stars, 5 Forks

Python code (packaged in Docker container) to run the experiments in "A Greedy Algorithm for Quantizing Neural Networks" by Eric Lybrand and Rayan Saab (2020).