david-vectorflow
Does this mean that the faster tokenizer should be working? If I run the code below, it confirms that it's the fast one, but then it throws a warning that...
In case anyone else finds this, here is a sample of working batch inference code based on the link above:

```python
prompt_temp = "system\nAnswer the questions.user\n\n{}assistant\n"
prompts = []
images = []
...
```
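To make the snippet a bit more concrete, here is a minimal sketch of how the batch lists might be filled before handing them to the engine's batch API. The questions and image paths are hypothetical placeholders, and the template string is taken as-is from the snippet above:

```python
prompt_temp = "system\nAnswer the questions.user\n\n{}assistant\n"

# Hypothetical inputs for illustration only
questions = ["What is in this image?", "Describe the scene."]
image_paths = ["cat.jpg", "street.jpg"]

# One formatted prompt per question; prompts[i] pairs with images[i]
prompts = [prompt_temp.format(q) for q in questions]
images = list(image_paths)
```

Each entry in `prompts` carries one question in the template's `{}` slot, so the two lists can be zipped together when building the batch.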
@asadnhasan are you still planning on working on this?
Any update?
Attempting to downgrade to vllm 0.3.3 causes the installation of the most recent version of sglang to hang indefinitely.
Thanks! Looks like it still isn't solved though?
Does it maintain the aspect ratio when it converts?