Jason (Siyu) Zhu
Earlier, there was an awesome PR https://github.com/vllm-project/vllm/pull/916 on supporting the GPTQ Exllama kernel in a 4-bit quantization setup. This PR introduces additional kernels for use cases with different quantization bits,...
# Summary This PR introduces a new model handler, [openfunctions_handler.py](https://github.com/ShishirPatil/gorilla/compare/main...JasonZhu1313:gorilla:jaszhu/add_openfunctions_handler?expand=1#diff-3af430d47eb913aec657f3bad6dcbae4e39ee152dcb8b1699e65614fdd87e10d), to run inference on the open-source model gorilla-llm/gorilla-openfunctions-v2 and reproduce the results on the leaderboard. Issue: https://github.com/ShishirPatil/gorilla/issues/352 # Changes * Merge the...
**Describe the bug** Great work on Gorilla! I have used the open-source model checkpoint https://huggingface.co/gorilla-llm/gorilla-openfunctions-v2 with vLLM to try reproducing...
Hey, great observations and work on disentangling format following from reasoning! Could we share details on the evaluation dataset we used and how we can reproduce the result in the...
# What does this PR do? Integrate the Liger (LinkedIn GPU Efficient Runtime) Kernel into the HF Trainer behind an optional flag. Fixes [#32861](https://github.com/huggingface/transformers/issues/32861) ## Before submitting - [x] This PR...
**Is the feature request related to a problem?** No. OpenAI recently released a new feature to support structured output with constrained decoding; would love to see the standing on the leaderboard...
### Feature request Agentic RL Support in GPT-OSS ### Motivation Hey Community, @HJSang and I are from the LinkedIn Core AI team. Over the past few weeks, we’ve been working...