Yuwen Zhou
## Type of Change
example
## Description
Update the ONNXRT example for the new API.
JIRA ticket: [ILITV-2468](https://jira.devtools.intel.com/browse/ILITV-2468)
## How has this PR been tested?
Extension test on ONNX models.
## Dependency...
Signed-off-by: yuwenzho
## Type of Change
docstring
## Description
- model/base_model.py
- model/model.py
- model/nets_factory.py
- model/onnx_model.py
- model/torch_model.py
- model/__init__.py
## Dependency Change?
no
# What does this PR do?
1. Support FP8 KV cache.
2. Support KV cache reuse.
## Type of Change
example
API changed or not: no
## Description
Update 3.x torch example and enhance 3.x common logger information. Smooth quant uses quantize(), others use prepare() +...
## Type of Change
bug fix
API changed or not: no
## Description
Update the lm-eval evaluation in the ORT LLM example.
## How has this PR been tested?
Extension test
##...
## Type of Change
feature
API changed or not: no
## Description
Use a different WeightOnlyLinear module depending on the device.
- Abstract WeightOnlyLinear class, inherited by INCWeightOnlyLinear and HPUWeighOnlyLinear
- Load...
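
The device-dependent selection described above can be illustrated with a minimal sketch. It is not the PR's implementation: the class names echo those in the description (with the HPU class spelled `HPUWeightOnlyLinear` here), while the `backend()` method, the `_REGISTRY` mapping, the `build_weight_only_linear` helper, and the fallback to the INC variant for unknown devices are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class WeightOnlyLinear(ABC):
    """Abstract base holding the settings every device variant shares."""

    def __init__(self, in_features: int, out_features: int, bits: int = 4):
        self.in_features = in_features
        self.out_features = out_features
        self.bits = bits

    @abstractmethod
    def backend(self) -> str:
        """Return the device family this variant targets (hypothetical hook)."""


class INCWeightOnlyLinear(WeightOnlyLinear):
    """Stand-in for the default variant."""

    def backend(self) -> str:
        return "cpu"


class HPUWeightOnlyLinear(WeightOnlyLinear):
    """Stand-in for the HPU-specific variant."""

    def backend(self) -> str:
        return "hpu"


# Hypothetical registry mapping a device string to a concrete class.
_REGISTRY = {"cpu": INCWeightOnlyLinear, "hpu": HPUWeightOnlyLinear}


def build_weight_only_linear(
    device: str, in_features: int, out_features: int, bits: int = 4
) -> WeightOnlyLinear:
    """Pick the class registered for `device`; unknown devices fall back
    to the default variant (an assumption of this sketch)."""
    cls = _REGISTRY.get(device, INCWeightOnlyLinear)
    return cls(in_features, out_features, bits)
```

Calling `build_weight_only_linear("hpu", 8, 16)` returns an `HPUWeightOnlyLinear` instance, and the abstract base itself cannot be instantiated, so callers always go through a device-specific subclass.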