Kelei Jiang
```
python build.py --model_dir /workspace/qllama-7b-chat \
    --dtype float16 \
    --remove_input_padding \
    --use_gpt_attention_plugin float16 \
    --enable_context_fmha \
    --use_gemm_plugin float16 \
    --output_dir "/tmp/new_lora_7b/trt_engines/fp16/2-gpu/" \
    --max_batch_size 1 \
    --max_input_len 512 \
    --max_output_len 50...
```
Thank you, but when is loading multiple LoRA weights expected to be supported?
Do you have plans to support Ascend 910B in the future?
Thank you for supporting domestically produced hardware!
https://github.com/om-ai-lab/OmDet/blob/main/omdet/omdet_v2_turbo/infer_model.py#L66

Thank you very much for your issue.
1. The exported ONNX model does not include image preprocessing or NMS post-processing.
2. And because ONNX requires the input to be...
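Since the reply above says the exported ONNX graph does not include NMS post-processing, the caller has to apply it to the raw detections. Below is a minimal NumPy sketch of standard hard-NMS; the `(N, 4)` xyxy box layout and score vector are assumptions for illustration, not the exact output format of the OmDet ONNX model:

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy hard-NMS.

    boxes:  (N, 4) array in (x1, y1, x2, y2) format -- assumed layout
    scores: (N,) confidence per box
    Returns indices of the boxes to keep, highest score first.
    """
    order = scores.argsort()[::-1]  # descending by confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        # Intersection of the top box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Drop boxes that overlap the kept box too much
        order = order[1:][iou <= iou_threshold]
    return keep
```

For example, two heavily overlapping boxes collapse to the higher-scoring one, while a distant box survives: `nms(np.array([[0,0,10,10],[1,1,10,10],[50,50,60,60]], float), np.array([0.9,0.8,0.7]))` returns `[0, 2]`.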