Michael Royzen

Results: 38 comments of Michael Royzen

@jcwchen the model is [ahotrod/albert_xxlargev1_squad2_512](https://huggingface.co/ahotrod/albert_xxlargev1_squad2_512). The odd thing is it really isn't that big! Weights are ~900MB.
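For context, a back-of-the-envelope check (my own arithmetic, not from the thread) of why a ~900MB checkpoint can still trip ONNX's serialization limit: a single serialized protobuf is capped at 2 GiB, so any export or optimization pass that duplicates weights can push the model past the cap even though the raw weights fit. ALBERT in particular shares one transformer layer's weights across all layers, so materializing each layer separately multiplies the shared block (the 200 MB figure below is a hypothetical illustration, not a measured number):

```python
# Back-of-the-envelope check (assumption: ~900 MB fp32 weights, as stated above).
PROTOBUF_LIMIT = 2**31 - 1            # hard cap for one serialized ONNX protobuf (~2 GiB)
raw_weights = 900 * 1024**2           # ~900 MB checkpoint

print(raw_weights < PROTOBUF_LIMIT)   # True: the raw weights fit comfortably

# ALBERT shares one layer's weights across all layers; if export/optimization
# materializes each of the remaining 11 layers as its own copy of a
# (hypothetical) 200 MB shared block, the total blows past the cap:
shared_block = 200 * 1024**2
duplicated = raw_weights + 11 * shared_block
print(duplicated > PROTOBUF_LIMIT)    # True: now over 2 GiB
```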

@jcwchen I'm running the conversion using this script: https://github.com/huggingface/optimum/blob/main/examples/onnxruntime/optimization/question-answering/run_qa.py with the following args:

```shell
python run_qa.py \
    --model_name_or_path ahotrod/albert_xxlargev1_squad2_512 \
    --dataset_name squad_v2 \
    --optimization_level 99 \
    --do_eval \
    --output_dir /home/ubuntu/albert_xxlargev1_squad2_512_onnx_optimized...
```

Did it really work for you @jcwchen? I re-ran the script without `--optimize_for_gpu` and with `CPUExecutionProvider` and it gave the same ValueError as above. I tried setting `all_tensors_to_one_file=False`, but it...

@jcwchen Setting `all_tensors_to_one_file=False` still leads to:

```
Traceback (most recent call last):
  File "examples/onnxruntime/optimization/question-answering/run_qa.py", line 524, in <module>
    main()
  File "examples/onnxruntime/optimization/question-answering/run_qa.py", line 311, in main
    optimizer.export(
  File "/opt/conda/lib/python3.8/site-packages/optimum/onnxruntime/optimization.py", line 149, in...
```

`convert_attribute=True` did not work:

```
Traceback (most recent call last):
  File "examples/onnxruntime/optimization/question-answering/run_qa.py", line 524, in <module>
    main()
  File "examples/onnxruntime/optimization/question-answering/run_qa.py", line 311, in main
    optimizer.export(
  File "/opt/conda/lib/python3.8/site-packages/optimum/onnxruntime/optimization.py", line...
```

@HSQ79815 could you link to a pull request please? I just tried with the [latest onnx weekly build](https://test.pypi.org/project/onnx-weekly/) and got the same error when trying to export a large...

I temporarily solved the issue by pinning transformer-deploy to commit `8bfe4f58a4cbdf84348a37838ba61c980bc6c101` together with PyTorch 1.11, `onnx==1.12.0`, and `onnxruntime==1.12.0`.
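In case it helps anyone reproducing this, the pins above translate to roughly the following (the `git+` URL assumes the upstream ELS-RD/transformer-deploy repository; adjust if you install from a fork or a local checkout):

```shell
# Pin the working combination described above (versions from the comment;
# the repository URL is an assumption, not confirmed in the thread).
pip install torch==1.11.0 onnx==1.12.0 onnxruntime==1.12.0
pip install "git+https://github.com/ELS-RD/transformer-deploy@8bfe4f58a4cbdf84348a37838ba61c980bc6c101"
```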

Any updates @nvpohanh? KV-cache for this implementation would be a game changer for us.

Checking in again @nvpohanh. Do you have any ETA for KV-cache support?

Yes, adding `use_external_data_format=True` to run_qa didn't work unfortunately @JingyaHuang -- it seems the issue is with ONNX. This is what I tried in run_qa.py:

```python
optimizer.export(
    onnx_model_path=model_path,
    onnx_optimized_model_output_path=optimized_model_path,
    optimization_config=optimization_config,
    use_external_data_format=True,
)
```