feat(compression): update tooling to use DECODE operators
Summary
Update the compression tooling to produce models that use DECODE operators instead of the metadata-based compression scheme.
The interpreter already supports DECODE operators for decompression. The previous tooling stored compression metadata in the flatbuffer, requiring the interpreter to detect compressed tensors and requiring each kernel to implement decompression internally. The updated tooling inserts explicit DECODE custom operators into the graph, handling decompression before kernels execute and eliminating per-kernel decompression logic.
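To illustrate the graph rewrite described above, here is a minimal sketch of DECODE insertion over a toy op list. The names (`Op`, `insert_decode_ops`, the `_decoded` suffix) are illustrative, not the actual tooling API:

```python
# Hypothetical sketch: for each compressed tensor, insert a DECODE custom
# op that produces a decompressed copy, then rewire consumers to read it.
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    inputs: list
    outputs: list


def insert_decode_ops(ops, compressed_tensors):
    """Rewrite the op list so every compressed tensor is decoded first."""
    new_ops = []
    decoded = {}  # compressed tensor name -> name of its decoded copy
    for t in compressed_tensors:
        decoded[t] = t + "_decoded"
        # DECODE consumes the compressed data and emits the decoded tensor.
        new_ops.append(Op("DECODE", inputs=[t], outputs=[decoded[t]]))
    for op in ops:
        # Kernels now read the decompressed tensor, so they need no
        # decompression logic of their own.
        op.inputs = [decoded.get(i, i) for i in op.inputs]
        new_ops.append(op)
    return new_ops


graph = [Op("FULLY_CONNECTED", inputs=["weights", "activations"], outputs=["out"])]
rewritten = insert_decode_ops(graph, ["weights"])
```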
Additionally, refactor the tooling to use a plugin architecture, enabling multiple compression methods (LUT, Huffman, Pruning) to be implemented independently.
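A minimal sketch of what the plugin seam could look like, assuming a Python `Protocol`; the names `Compressor` and `compress` here are illustrative, not necessarily the real interface:

```python
# Hypothetical plugin interface: each compression method (LUT, Huffman,
# Pruning) implements the same protocol and plugs into shared tooling.
from typing import Protocol

import numpy as np


class Compressor(Protocol):
    """One implementation per compression method."""

    def compress(self, tensor: np.ndarray) -> bytes:
        """Return the compressed bytes for a tensor's data."""
        ...


def compress_tensor(tensor: np.ndarray, plugin: Compressor) -> bytes:
    # The tooling depends only on the protocol, not on any one method,
    # so new methods can be added independently.
    return plugin.compress(tensor)
```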
Changes
- Replace `model_facade` with `model_editor` for TFLite model manipulation
- Add DECODE operator insertion logic for compressed tensors
- Add `Compressor` protocol for compression plugins
- Implement LUT compression as a plugin (Huffman and Pruning are stubs)
- Add integration tests verifying compressed models produce correct inference results through the TFLM interpreter
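As a toy illustration of the LUT (lookup-table) idea, store each tensor's unique values once plus a small index per element; decompression is a table lookup, which is what a DECODE kernel performs at inference time. This is not the tooling's actual encoding, just the concept:

```python
# Toy LUT round trip: values table + per-element indices.
import numpy as np


def lut_compress(tensor):
    # np.unique gives the sorted unique values and, for each element,
    # the index of its value in that table.
    values, indices = np.unique(tensor, return_inverse=True)
    return values, indices.astype(np.uint8)


def lut_decompress(values, indices, shape):
    # Table lookup followed by a reshape back to the original tensor shape.
    return values[indices].reshape(shape)


w = np.array([[0.5, -0.5], [0.5, 0.5]], dtype=np.float32)
values, indices = lut_compress(w)
assert np.array_equal(lut_decompress(values, indices, w.shape), w)
```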
Testing
All changes include unit tests. Integration tests run with `--//:with_compression` and verify:
- Compressed model outputs match uncompressed
- DECODE operators are inserted
- Compressed models are smaller than originals
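The three checks above have roughly this shape; `run_model` and the file paths are hypothetical stand-ins, since the real tests execute through the TFLM interpreter:

```python
# Illustrative shape of the integration assertions, not the actual tests.
import os

import numpy as np


def verify(compressed_path, original_path, run_model, op_names):
    out_c = run_model(compressed_path)
    out_u = run_model(original_path)
    # 1. Compressed model outputs match the uncompressed model's outputs.
    assert np.allclose(out_c, out_u, atol=1e-5)
    # 2. The rewritten graph actually contains DECODE operators.
    assert "DECODE" in op_names
    # 3. Compression reduced the model file size.
    assert os.path.getsize(compressed_path) < os.path.getsize(original_path)
```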