Gilhyeon Lee
I'm trying to convert my Llama 2 7B model following the README. In STEP 1, I set my command as below, `python3 /home/ghlee/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir /home/ghlee/llama/llama-2-7b --model_size 7B --output_dir /home/ghlee/llama.onnx/onnx_converted`, and this kind of...
Hello, Could you please advise me on how to disable the KV cache? I would also appreciate any guidance on how to implement this change in code. Thank you for...
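For reference, a minimal sketch of the usual way to disable the KV cache in Hugging Face `transformers` (this is general knowledge, not an answer from this thread; the GPT-2 config is used only as an illustration):

```python
# Hedged sketch: disabling the KV cache in transformers via the
# `use_cache` flag. Any causal-LM config exposes it; GPT-2 is just
# a convenient example here.
from transformers import GPT2Config

# Option 1: turn the cache off at the config level, so every forward
# pass skips building past key/value tensors.
config = GPT2Config(use_cache=False)
print(config.use_cache)  # False

# Option 2 (requires model weights, shown as comments only):
# model = AutoModelForCausalLM.from_pretrained("gpt2")
# model.config.use_cache = False          # persistent
# model.generate(**inputs, use_cache=False)  # or per-call override
```

Disabling the cache trades speed for a simpler graph, which is often what you want when exporting to ONNX.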
### System Info
```shell
python 3.10.14
torch 2.4.0+cu121
optimum 1.21.4
onnx 1.16.2
onnxruntime 1.19.0
transformers 4.43.4
```

```shell
optimum-cli export onnx --model distilbert/distilbert-base-uncased-distilled-squad distilbert_base_uncased_squad_onnx/
```

When I try to run the simple example above,...