Tung D. Le issues

Results: 23 issues of Tung D. Le

Support different data types rather than float32 in RunONNXModel.py

The `RunONNXModel.py` script can generate random input data for a model. However, it uses float32 as the data type by default. Generating random data should check the input types and...
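A minimal sketch of type-aware random input generation, assuming each model input is described by a type string and a shape. The `TYPE_MAP` table, its type strings, and the `random_input` helper are illustrative assumptions, not code taken from `RunONNXModel.py`.

```python
import numpy as np

# Hypothetical mapping from type strings to NumPy dtypes (an assumption).
TYPE_MAP = {
    "f32": np.float32,
    "f64": np.float64,
    "i32": np.int32,
    "i64": np.int64,
    "i1": np.bool_,
}


def random_input(type_str, shape, seed=None):
    """Return a random array whose dtype matches the declared input type."""
    rng = np.random.default_rng(seed)
    dtype = np.dtype(TYPE_MAP[type_str])
    if dtype.kind == "f":
        # Floats: uniform samples in [0, 1), cast to the requested precision.
        return rng.random(shape).astype(dtype)
    if dtype.kind == "b":
        # Booleans: random 0/1 values converted to bool.
        return rng.integers(0, 2, size=shape).astype(dtype)
    # Integers: small non-negative values that fit every integer width above.
    return rng.integers(0, 100, size=shape, dtype=dtype)
```

With this shape, `random_input("i64", (2, 3))` would yield an int64 array instead of silently falling back to float32.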

Use OGE instead of OLT for ONNXReluOp

This patch uses `greater than or equal to zero` instead of `less than zero` in the lowering of ONNXReluOp. Mathematically, they are the same, but in practice it has caused...
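The two forms agree on every ordinary number, but they can differ on NaN, because every ordered comparison with NaN is false. The sketch below shows that difference in plain Python. The preview does not say what the patch actually fixed, so this illustrates one possible divergence and is not a statement of the issue's cause; `relu_olt` and `relu_oge` are hypothetical names.

```python
import math


def relu_olt(x):
    # "less than zero" form: select 0 when x < 0, otherwise pass x through.
    return 0.0 if x < 0.0 else x


def relu_oge(x):
    # "greater than or equal to zero" form: pass x through when x >= 0, otherwise 0.
    return x if x >= 0.0 else 0.0


nan = float("nan")
for v in (-2.0, 0.0, 3.5):
    assert relu_olt(v) == relu_oge(v)

# Every comparison with NaN is false, so each form takes its "else" branch.
print(math.isnan(relu_olt(nan)))  # True: NaN passes through
print(relu_oge(nan))              # 0.0: NaN is replaced by zero
```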

Handle a big value for the maximum trip count in ONNX.loop

There is a segfault when running a model whose ONNX Loop’s trip count is a big value. Below are snippets of generated code when compiling the model: At ONNX level (`onnx-mlir...

Lower onnx.transpose to memref.transpose for better performance

Given that the current implementation of onnx.transpose, which actually shuffles data, is expensive, it is better to use `memref.transpose`, which only changes metadata. This is the description of [memref.transpose](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td#L1738): ```...
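To illustrate what a metadata-only transpose means, here is a small sketch in plain Python: a flat buffer with shape and stride metadata, where transposing permutes the metadata and leaves the buffer untouched. `StridedView` and `row_major_strides` are hypothetical helpers for illustration; this is not MLIR code and not the onnx-mlir lowering.

```python
from itertools import product


def row_major_strides(shape):
    """Element strides for a contiguous row-major buffer."""
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


class StridedView:
    """A flat buffer plus shape/stride metadata (hypothetical helper)."""

    def __init__(self, data, shape, strides=None):
        self.data = data
        self.shape = tuple(shape)
        self.strides = tuple(strides) if strides is not None else row_major_strides(shape)

    def transpose(self, perm):
        # Only the metadata is permuted; the buffer is shared, not copied.
        return StridedView(
            self.data,
            [self.shape[p] for p in perm],
            [self.strides[p] for p in perm],
        )

    def __getitem__(self, index):
        return self.data[sum(i * s for i, s in zip(index, self.strides))]


buf = list(range(6))               # logical 2x3 matrix: [[0, 1, 2], [3, 4, 5]]
a = StridedView(buf, (2, 3))
t = a.transpose((1, 0))            # logical 3x2 view over the same buffer
print(t.shape, t.strides)          # (3, 2) (1, 3)
print([t[i, j] for i, j in product(range(3), range(2))])  # [0, 3, 1, 4, 2, 5]
```

The trade-off is that later consumers read through non-contiguous strides, so the saving applies only when they can handle a strided layout.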

Passing int32 parameters to external function calls at the LLVM level causes wrong generated code on zLinux

Note: This error does not happen with the current master branch of onnx-mlir, which currently passes int64 instead of int32 parameters to external function calls. This issue seems to...

[RFC] Constant propagation and folding using DenseResourceElementsAttr

In onnx-mlir, we have the `--constprop-onnx` pass, which does constant propagation, e.g. when all inputs of an operation are constants, the pass will replace the operation by a...
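As a rough illustration of the general idea of constant folding, the sketch below folds a toy expression tree bottom-up: an operation whose operands are all constants is evaluated once and replaced. The `Node` class, the `FOLDERS` table, and `fold` are hypothetical and written in plain Python; they do not reflect the `--constprop-onnx` implementation or `DenseResourceElementsAttr`.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tiny expression node: a constant, a named input, or an operation."""
    op: str                      # "const", "input", "add", or "mul"
    value: object = None         # payload for "const", name for "input"
    args: list = field(default_factory=list)


FOLDERS = {"add": operator.add, "mul": operator.mul}


def fold(node):
    """Fold bottom-up: an op whose operands are all constants becomes a constant."""
    if node.op in ("const", "input"):
        return node
    args = [fold(a) for a in node.args]
    if all(a.op == "const" for a in args):
        return Node("const", FOLDERS[node.op](*(a.value for a in args)))
    return Node(node.op, args=args)


# (2 * 3) + x folds to 6 + x; the add stays because x is not a constant.
expr = Node("add", args=[
    Node("mul", args=[Node("const", 2), Node("const", 3)]),
    Node("input", "x"),
])
folded = fold(expr)
print(folded.op, folded.args[0].op, folded.args[0].value)  # add const 6
```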

Test cases for the ONNX importer

I have sometimes encountered errors related to importing ONNX operators into the ONNX dialect, e.g. #217. This happened when working with a new ONNX model. It is better to have test...

Segfault when running inference with the IBM granite.20B model (the one with KV cache) at the number of input tokens of at least 1265

Segfault when running inference with the IBM granite.20B model (the one with KV cache) at the number of input tokens of at least 1265. Segfault does not happen if the...

Get onnx-mlir version that was used to compile a .so file by using python

Currently we can get the onnx-mlir version directly from a .so file by running `readelf -p .comment`: ``` $ readelf -p .comment lstm.so String dump of section '.comment': [ 0] GCC:...
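One way to reach the same information from Python is to wrap the `readelf` call and parse its string dump. The sketch below assumes `readelf` is on the `PATH`; `parse_string_dump` and `comment_strings` are hypothetical helper names. Since the preview does not show which `.comment` entry carries the onnx-mlir version, the helper returns every string and leaves the filtering to the caller.

```python
import re
import subprocess


def parse_string_dump(text):
    """Extract the strings from `readelf -p` output lines such as '  [     0]  text'."""
    pattern = re.compile(r"^\s*\[\s*[0-9a-fA-F]+\]\s+(.*)$")
    return [m.group(1) for line in text.splitlines() if (m := pattern.match(line))]


def comment_strings(so_path):
    """Run `readelf -p .comment` on a shared library and return its strings."""
    result = subprocess.run(
        ["readelf", "-p", ".comment", so_path],
        capture_output=True, text=True, check=True,
    )
    return parse_string_dump(result.stdout)
```

A caller could then loop over `comment_strings("lstm.so")` and pick out the entry of interest.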

enhancement

Follow-up items for PR2321 (--store-constants-to-file)

Below are the follow-up items to support `--store-constants-to-file` in PR #2321. - Support memory-mapping on Windows - Add metadata in the constant file to indicate endianness - Improve the...
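As a sketch of what endianness metadata could look like, the example below prefixes a constant payload with a single marker byte and uses it when reading the data back. The one-byte layout, the `MARKERS` table, and both function names are assumptions made for illustration; the actual file format of `--store-constants-to-file` is not described in the preview.

```python
import struct
import sys

# Hypothetical layout (an assumption): one marker byte, then raw float32 values.
MARKERS = {"little": b"L", "big": b"B"}


def pack_constants(values, byteorder=sys.byteorder):
    """Serialize floats with a leading byte recording the byte order used."""
    fmt = ("<" if byteorder == "little" else ">") + f"{len(values)}f"
    return MARKERS[byteorder] + struct.pack(fmt, *values)


def unpack_constants(blob):
    """Read the marker first, then decode the payload with that byte order."""
    prefix = {b"L": "<", b"B": ">"}[blob[:1]]
    count = (len(blob) - 1) // 4
    return list(struct.unpack(f"{prefix}{count}f", blob[1:]))


blob = pack_constants([1.0, 2.5], byteorder="big")
print(unpack_constants(blob))  # [1.0, 2.5]
```

Because the reader trusts the marker rather than the host byte order, a file written on a big-endian machine decodes to the same values on a little-endian one.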