Alexander Rogachev
After the "default" model was converted to ONNX format, its inference speed on GPU decreased nearly 4x. Are there any solutions or updates related to ONNX inference?
### 🐛 Describe the bug
Trying to convert https://github.com/GXYM/TextBPN-Plus-Plus into ONNX format. I take the initial model from https://github.com/GXYM/TextBPN-Plus-Plus/blob/main/network/textnet.py

```
torch.onnx.export(model, dummy_input, "TextBPM_dyn.onnx",
                  export_params=True, opset_version=16,
                  do_constant_folding=True,
                  input_names=['modelInput'],
                  output_names=...
```
The original paper claimed to use large batches. With the current implementation, I face the problem that if I increase the batch size even to 1024, it fails on the **second** iteration....
As llama3-llava-next-8b and LLaVA-NeXT-Video-7B-DPO seem to share the same interface, is it possible to make llama3-llava-next-8b process multiple frames of one video in a single forward pass? Basically, I don't get the...