ppq issues

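Is mixed-precision quantization supported? Does it take actual on-board inference speed into account? Could you recommend some papers or methods that really work and are easy to use?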

12

Samples Update & Bug fix

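* Added many new usage documentation files
* Removed PPLCUDA_INT4_Quantizer, PPLCUDAMixPrecisionQuantizer, and related content
* Added the require_export attribute to tqc to control export logic later on
* Fixed a subgraph-splitting bug that could cause incorrect splits when a block connects directly to the graph input
* Adjusted some program logic so that the new samples can run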

ZhangZhiPku

New dispatchers

1. Add new dispatchers that quantize all ops in quantable ops; these are used for the NPU backend
2. Update prelu

xiguadong

WIP: ncnn ViT int8

1

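## Purpose

Support ncnn ViT int8. WIP.

## Approach

* Add quantization support for Concat/Add/LayerNorm/mha/Gelu
* Use a channel-wise power-of-2 method for LayerNorm
* ncnn needs to support inference for the corresponding operators
* mha int8 is done: https://github.com/Tencent/ncnn/pull/3940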
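For readers unfamiliar with the "channel-wise power-of-2" idea mentioned for LayerNorm, here is a minimal sketch in plain Python. It is not PPQ's or ncnn's implementation. The function names (`pow2_scale`, `quantize_channelwise`, `dequantize_channelwise`), the symmetric int8 range, and the choice to round the scale up to the next power of two are all assumptions made for illustration.

```python
import math


def pow2_scale(max_abs, qmax=127):
    """Return the smallest power-of-2 scale with max_abs / scale <= qmax.

    Hypothetical helper: rounding the exponent up avoids clipping the
    largest value in the channel.
    """
    if max_abs == 0:
        return 1.0
    return 2.0 ** math.ceil(math.log2(max_abs / qmax))


def quantize_channelwise(rows, qmin=-128, qmax=127):
    """Quantize each row (treated as one channel) with its own scale."""
    scales, quantized = [], []
    for row in rows:
        scale = pow2_scale(max(abs(v) for v in row), qmax)
        scales.append(scale)
        # Round to the nearest integer, then clamp to the int8 range.
        quantized.append([min(qmax, max(qmin, round(v / scale))) for v in row])
    return quantized, scales


def dequantize_channelwise(quantized, scales):
    """Map integer codes back to floats using the per-channel scales."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


# Two channels with very different value ranges.
rows = [[0.5, -1.0, 0.25], [10.0, -20.0, 5.0]]
quantized, scales = quantize_channelwise(rows)
restored = dequantize_channelwise(quantized, scales)
```

Because each scale is a power of two, rescaling can be done with a bit shift on integer hardware, and giving every channel its own scale keeps small-range channels from being crushed by large-range ones.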

tpoisonooo

The time of model inference increases after doing int8 quantization

5

My device is an i7-8750H.
Start Benchmark with openvino (Batchsize = 1)
Time span (FP32 MODE): 68.0568 sec
Time span (INT8 MODE): 85.6443 sec
I don't know what happened, how...

tuochi

Bug Report

Checking out `08dc0f8b10ecc8f41e52d7a0d4e7b5dc89a92f66` raises an error.

```bash
2022-06-05 18:02:30,982 - mmdeploy - ERROR - name 'NCNNRequantizePass' is not defined
2022-06-05 18:02:30,982 - mmdeploy - ERROR - onnx2ncnn_quant_table failed.
```

Checking out `54c0e3f6f7f469a1a184f54c8c565d93777c6e74` works fine. Please add more CI...

tpoisonooo

Upgrade core to 0.6.6

1

ZhangZhiPku

Update base.py

Print the information for each optim_pass; this seems to make debugging easier.

Lenan22

Conv quantization error too large?

2

I have a model that shows a fairly large error during quantization analysis. How can I optimize this? I selected TargetPlatform.PPL_CUDA_INT8 with the default settings. ![image](https://user-images.githubusercontent.com/30458099/193197735-54f4a20a-b77c-4180-8212-fbc0e2343490.png)

hufangjian

How do I set the engine max batch size?

4

Hello, when running the yolo sample code, I used 02_Quantization.py to generate an int8-quantized TensorRT engine with a batch size of 32 (based on the yolov6s model). When I evaluate it with 04_Benchmark.py, I get the following error. [executionContext.cpp::enqueue::282] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::enqueue::282, condition: batchSize > 0 && batchSize

18242360613
