Jordy Van Landeghem
Any traction on this?
Eagerly awaiting this! Great work @neuralmagic team ;)
Would this mean that you would do "patching" on the embedding space, rather than pixels? Is there currently some hyperparameter that restricts the chunking to a single page?
I tested this PR with a trained QLoRA adapter and I am getting this error: `KeyError: 'lm_head.qweight'` Might this be due to only checking for certain adapter weights? EDIT: no...
@junzhang-zj lol I have exactly the same use case ;p
> I tested this PR with a trained QLoRA adapter and I am getting this error: `KeyError: 'lm_head.qweight'`
>
> Might this be due to only checking for certain...
@arianyambao We also suspect issues with Llama-3.1 in vllm, as its scores are no better than Llama-3's. After finetuning it performs even worse...
Can this be given higher priority? It is an absolute blocker for this set of models...