Chang Xu
### PR types
New features

### PR changes
APIs

### Describe
Add QuantRowParallelLinear and QuantColumnParallelLinear in Quantization; support model parallel QAT.
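To illustrate the idea behind quantized column-parallel and row-parallel linear layers, here is a minimal pure-Python sketch. It is not the PR's implementation: `fake_quant`, `column_parallel_linear`, `row_parallel_linear`, the per-shard abs-max scale, and the single-process simulation of ranks are all assumptions made for illustration, and the feature dimensions are assumed to divide evenly by `world_size`.

```python
from typing import Callable, List

Matrix = List[List[float]]


def matmul(x: Matrix, w: Matrix) -> Matrix:
    """Plain matrix product of x [batch, in] and w [in, out]."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def fake_quant(w: Matrix, bits: int = 8) -> Matrix:
    """Hypothetical symmetric abs-max fake quantization: quantize, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for row in w for v in row)
    if max_abs == 0:
        return [row[:] for row in w]
    scale = max_abs / qmax
    return [[round(v / scale) * scale for v in row] for row in w]


def column_parallel_linear(x: Matrix, w: Matrix, world_size: int,
                           quant: Callable[[Matrix], Matrix] = fake_quant) -> Matrix:
    """Split w along the output dimension; each simulated rank handles one slice."""
    step = len(w[0]) // world_size  # assumes out_features % world_size == 0
    outputs = []
    for rank in range(world_size):
        shard = [row[rank * step:(rank + 1) * step] for row in w]
        outputs.append(matmul(x, quant(shard)))
    # Concatenate per-rank outputs along the feature dimension (stands in for all-gather).
    return [sum((out[i] for out in outputs), []) for i in range(len(x))]


def row_parallel_linear(x: Matrix, w: Matrix, world_size: int,
                        quant: Callable[[Matrix], Matrix] = fake_quant) -> Matrix:
    """Split w along the input dimension; partial outputs are summed across ranks."""
    step = len(w) // world_size  # assumes in_features % world_size == 0
    total = None
    for rank in range(world_size):
        w_shard = w[rank * step:(rank + 1) * step]
        x_shard = [row[rank * step:(rank + 1) * step] for row in x]
        partial = matmul(x_shard, quant(w_shard))
        # Element-wise sum of partial results (stands in for all-reduce).
        total = partial if total is None else [
            [a + b for a, b in zip(r1, r2)] for r1, r2 in zip(total, partial)
        ]
    return total
```

In this sketch each shard computes its own quantization scale, so the quantized result can differ slightly from quantizing the full weight at once; whether scales are per shard or shared across ranks is a design choice that the PR description does not specify.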
Quant Analyzer provides:
1. Computing the sensitivity of the model by quantizing it layer by layer (sensitivity is currently expressed as model accuracy).
2. Ranking the sensitivity, which can show...
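The layer-by-layer sensitivity analysis described above can be sketched as follows. This is a minimal illustration rather than the Quant Analyzer's actual API: `rank_layer_sensitivity` and the `evaluate` callback are hypothetical names, and the accuracy values are toy numbers chosen only to make the example runnable.

```python
from typing import Callable, List, Tuple


def rank_layer_sensitivity(
    layer_names: List[str],
    evaluate: Callable[[List[str]], float],
) -> List[Tuple[str, float]]:
    """Quantize one layer at a time and rank layers by the resulting accuracy.

    `evaluate` is a hypothetical callback that quantizes only the listed
    layers, runs evaluation, and returns the model accuracy.
    """
    results = []
    for name in layer_names:
        accuracy = evaluate([name])  # only this layer is quantized
        results.append((name, accuracy))
    # Lowest accuracy first, i.e. the most sensitive layer first.
    return sorted(results, key=lambda item: item[1])


# Toy stand-in for a real evaluation loop (made-up accuracies).
toy_accuracy = {"conv1": 0.75, "conv2": 0.79, "fc": 0.70}


def toy_evaluate(quantized_layers: List[str]) -> float:
    return min(toy_accuracy[name] for name in quantized_layers)


ranking = rank_layer_sensitivity(list(toy_accuracy), toy_evaluate)
for name, acc in ranking:
    print(f"{name}: accuracy with only this layer quantized = {acc}")
```

Layers at the top of the ranking lose the most accuracy when quantized alone, so they are natural candidates to skip or to quantize at higher precision.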
Add picodet-s-npu full quantization demo.
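1. Manually set the search space (allowing block-wise and other search modes).
2. Automatically tokenize sub-networks.
3. Support running a specific sub-network according to the input token.
4. Allow BN and BN2D in the same network structure and convert each to its corresponding super op (in Det, the backbone usually uses BN, while the neck and head usually use BN2D).
5. Support searching different search spaces for different layers/blocks.
6. Support group pruning when exporting the model.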