deepsparse
Extractor DFS performance
Description
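This change modifies the depth-first search (DFS) used in model extraction to track its state with sets rather than generic iterables, which makes membership checks during the traversal cheaper.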
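To illustrate the idea, here is a minimal sketch of a set-based reachability search. It is not the code from this PR or from `onnx.utils.Extractor`: the function name `reachable_nodes` and the `(inputs, outputs)` tuple representation of a node are assumptions made so the example runs without ONNX installed.

```python
def reachable_nodes(nodes, output_name, graph_inputs):
    """Return indices of nodes needed to compute `output_name`.

    `nodes` is a list of (input_names, output_names) pairs, a simplified
    stand-in for ONNX nodes (hypothetical representation for this sketch).
    """
    # Map each tensor name to the index of the node that produces it.
    producer = {}
    for idx, (_, outputs) in enumerate(nodes):
        for name in outputs:
            producer[name] = idx

    stop = set(graph_inputs)  # set: constant-time membership on average
    visited = set()           # node indices already reached
    stack = [output_name]     # explicit stack instead of recursion

    while stack:
        name = stack.pop()
        # Stop at the requested inputs and at tensors with no producer
        # (for example, initializers or true graph inputs).
        if name in stop or name not in producer:
            continue
        idx = producer[name]
        if idx in visited:
            continue
        visited.add(idx)
        stack.extend(nodes[idx][0])  # continue through this node's inputs

    # Sorting the indices keeps the original node order.
    return sorted(visited)


example = [
    (["x"], ["a"]),       # node 0
    (["a"], ["b"]),       # node 1
    (["a"], ["c"]),       # node 2, not needed to compute "y"
    (["b", "w"], ["y"]),  # node 3
]
needed = reachable_nodes(example, "y", ["x"])
```

With a list in place of `visited`, each `idx in visited` check would scan the whole list, so the cost of the traversal would grow with the square of the number of reachable nodes in the worst case. A set keeps each check at roughly constant cost.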
Motivation and Context
While shape inference remains the most significant bottleneck, these changes are a step toward supporting model extraction for very large graphs.
Test Script
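```python
import onnx
from onnx.utils import Extractor

# Load the full model and build an extractor over its graph.
model = onnx.load("obertquant.onnx")
extractor = Extractor(model)

# Extract the subgraph between the named input and output tensors.
extracted_model = extractor.extract_model(
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["2058"],
)
onnx.save(extracted_model, "truncated.onnx")
```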
Benchmarks were produced by profiling the test script with pyinstrument and measuring the time spent in the Extractor.extract_model function.
| Model Name | Number of Nodes | Previous Time | New Time |
|---|---|---|---|
| obertquant.onnx | 1271 | 0.158s | 0.110s |
| ai-town-3B.onnx | 3515 | 8.002s | 3.725s |
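As a rough way to read the table, the sketch below computes the speedup and the relative time reduction for each row. The timings are copied from the table above; the helper names `speedup` and `reduction_percent` are introduced here for illustration only.

```python
# Timings in seconds, copied from the benchmark table: (previous, new).
timings = {
    "obertquant.onnx": (0.158, 0.110),
    "ai-town-3B.onnx": (8.002, 3.725),
}


def speedup(previous, new):
    """How many times faster the new run is than the previous one."""
    return previous / new


def reduction_percent(previous, new):
    """Percentage of the previous run time that was saved."""
    return 100.0 * (previous - new) / previous


for name, (previous, new) in timings.items():
    print(
        f"{name}: {speedup(previous, new):.2f}x faster, "
        f"{reduction_percent(previous, new):.1f}% less time"
    )
```

This works out to about 1.4x for the smaller model and about 2.1x for the larger one, which is consistent with the gain growing as the graph gets bigger, although two data points are not enough to establish a trend.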
Related: https://github.com/onnx/onnx/pull/6213
Per the main README announcement, DeepSparse is being deprecated by June 2, 2025. Closing the PR as work has been suspended; thank you for the input and support!