npu_plugin
Bump onnxruntime from 1.21.1 to 1.22.0 in /.github/actions/compile-models
Bumps onnxruntime from 1.21.1 to 1.22.0.
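One way to confirm the bump took effect in the compile-models environment is to compare the installed package version against the target. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `bump_applied` are hypothetical and are not part of this repository or of onnxruntime.

```python
from importlib import metadata
from typing import Optional

# Target version of this PR (taken from the PR title).
EXPECTED_VERSION = "1.22.0"


def installed_version(package: str = "onnxruntime") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def bump_applied(found: Optional[str], expected: str = EXPECTED_VERSION) -> bool:
    """True only when the installed version string equals the expected one."""
    return found == expected


if __name__ == "__main__":
    found = installed_version()
    print(f"onnxruntime installed: {found}, bump applied: {bump_applied(found)}")
```

An exact string comparison is used here because the action pins a single version; a range check would need a real version parser.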
Release notes
Sourced from onnxruntime's releases.
ONNX Runtime v1.22
Announcements
- This release introduces new APIs for Model Editor, Auto EP infrastructure, and AOT Compile.
- ONNX Runtime GPU packages require CUDA 12.x; packages built for CUDA 11.x are no longer published.
- The minimum supported Windows version is now 10.0.19041.
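The new Windows minimum can be checked with a plain component-wise version comparison. The sketch below assumes the version is available as a dotted string such as "10.0.19041"; the helper names `parse_version` and `meets_windows_minimum` are illustrative only and do not come from ONNX Runtime.

```python
from typing import Tuple

# Minimum Windows version stated in the 1.22 announcements.
MIN_WINDOWS_VERSION = "10.0.19041"


def parse_version(text: str) -> Tuple[int, ...]:
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def meets_windows_minimum(version: str, minimum: str = MIN_WINDOWS_VERSION) -> bool:
    """Compare versions component by component (tuple ordering)."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    for candidate in ("10.0.18363", "10.0.19041", "10.0.22631"):
        print(candidate, meets_windows_minimum(candidate))
```

On a Windows host, `platform.version()` typically returns a string in this dotted form, so it could be passed to `meets_windows_minimum` directly; that wiring is left out here to keep the sketch platform-independent.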
GenAI & Advanced Model Features
- Constrained Decoding: Introduced new capabilities for constrained decoding, offering more control over generative AI model outputs.
Execution & Core Optimizations
Core
- Auto EP Selection Infrastructure: Added foundational infrastructure to enable automatic selection of Execution Providers via selection policies, aiming to simplify configuration and optimize performance. (Pull Request #24430)
- Compile API: Introduced new APIs to support explicit compilation of ONNX models.
- See: OrtCompileApi Struct Reference (Assuming a similar link structure for future documentation)
- See: EP Context Design (Assuming a similar link structure for future documentation)
- Model Editor API: Introduced APIs for creating or editing ONNX models.
- See: OrtModelEditorApi
Execution Provider (EP) Updates
CPU EP/MLAS
- KleidiAI Integration: Integrated KleidiAI into ONNX Runtime/MLAS for enhanced performance on Arm architectures.
- MatMulNBits Support: Added support for MatMulNBits, enabling matrix multiplication with weights quantized to 8 bits.
- GroupQueryAttention optimizations and enhancements
OpenVINO EP
- Added support up to OpenVINO 2025.1
- Introduced Intel compiler level optimizations for QDQ models.
- Added support to select Intel devices based on LUID
- Load_config feature improvement to support AUTO, HETERO and MULTI plugin.
- Miscellaneous bug fixes and optimizations
- For detailed updates, refer to Pull Request #24394: ONNXRuntime OpenVINO - Release 1.22
QNN EP
- SDK Update: Added support for QNN SDK 2.33.2.
- Operator updates and support for Sum, Softmax, Upsample, Expand, ScatterND, Einsum
- QNN EP can be built as a shared or static library.
- Enabled the QnnGpu backend
- For detailed updates, refer to recent QNN-tagged PRs
TensorRT EP
- TensorRT Version: Added support for TensorRT 10.9.
- Note for onnx-tensorrt open-source parser users: Please check here for specific requirements (Referencing 1.21 link as a placeholder, this should be updated for 1.22).
- New Features:
- EP option to enable TRT Preview Feature
- Support to load TensorRT V3 plugin
- Bug Fixes:
... (truncated)
Commits
- f217402 Cherry pick fix for NuGet DML Release package Issue (#24696)
- 6c8097a Qnn nuget package update for arm64x (#24690) (#24694)
- 6b0f7c9 Revert "Publish debug symbols for windows (#24643) (#24651)" (#24668)
- 8fbc5d7 Publish debug symbols for windows (#24643) (#24651)
- d08403c Add support for selection policy delegate (based on PR #24653) (#24638)
- 93f85fb Cherry pick #24629 (QNN prefer npu) into rel-1.22.0 (#24630)
- ab9141e Cherry pick #24625 into rel-1.22.0 (#24626)
- d66cff1 Cherry-picks into rel-1.22.0 (#24624)
- cf92d98 Cherry-picks into rel-1.22.0 (#24611)
- ef546e9 Cherry-picks into rel-1.22.0 (#24580)
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- @dependabot rebase will rebase this PR
- @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
- @dependabot merge will merge this PR after your CI passes on it
- @dependabot squash and merge will squash and merge this PR after your CI passes on it
- @dependabot cancel merge will cancel a previously requested merge and block automerging
- @dependabot reopen will reopen this PR if it is closed
- @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
- @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)