Ying Sheng

Results 16 issues of Ying Sheng

The current frontend using OpenAI will invoke multiple calls for the example below:

```
@sgl.function
def example(s):
    s += "Construct a character."
    s += "Name: " + gen("name") + "...
```

enhancement
high priority

## Function Calling

- Frontend
  - Add `tools` argument in `sgl.gen`. See also guidance [tools](https://github.com/guidance-ai/guidance/blob/d1bbe1c698cbb201f89556d71193993e78c0686b/README.md?plain=1#L102)
- Backend
  - OpenAI: Translate to their function calling API (https://platform.openai.com/docs/guides/function-calling).
  - Local Models (SGLang)...
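A minimal sketch of what the OpenAI translation step could look like, assuming tools are passed as plain Python functions. `to_openai_tool`, `_JSON_TYPES`, and `get_weather` are hypothetical names used only for illustration and are not part of SGLang; the output dict follows the `{"type": "function", "function": {...}}` shape that OpenAI's function calling API expects for each entry in `tools`.

```python
import inspect
from typing import Any, Callable, Dict

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_openai_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build an OpenAI-style tool entry from a Python function (illustrative helper)."""
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unrecognized types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are marked as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def get_weather(city: str, celsius: bool = True) -> str:
    """Return a weather summary for a city."""
    return f"Sunny in {city}"


tool = to_openai_tool(get_weather)
```

The resulting `tool` dict could be placed in the `tools` list of a chat completion request. How `sgl.gen` would accept and forward it is left open here.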

List some good use cases of SGLang here:

- [SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures](https://arxiv.org/pdf/2402.03620.pdf)
- [Tractable Control for Autoregressive Language Generation](https://starai.cs.ucla.edu/papers/ZhangICML23.pdf)

- [ ] Trace API call statistics (number of calls, number of tokens for each call) in OpenAI backend.
- [ ] Trace API price in OpenAI backend.

The goal...
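One possible shape for such tracing, as a rough sketch only. `UsageTracker` and `CallRecord` are hypothetical names, not existing SGLang classes, and the per-token prices are caller-supplied placeholders rather than real OpenAI rates. The token counts are assumed to come from the `usage` field (`prompt_tokens`, `completion_tokens`) that OpenAI returns with each completion.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CallRecord:
    """Token counts for a single API call."""
    prompt_tokens: int
    completion_tokens: int


@dataclass
class UsageTracker:
    """Accumulates per-call token counts and estimates total cost (illustrative)."""
    # Placeholder prices in USD per 1,000 tokens; supply real rates yourself.
    prompt_price_per_1k: float
    completion_price_per_1k: float
    calls: List[CallRecord] = field(default_factory=list)

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Store one record per API call so per-call statistics stay available.
        self.calls.append(CallRecord(prompt_tokens, completion_tokens))

    @property
    def num_calls(self) -> int:
        return len(self.calls)

    def total_cost(self) -> float:
        prompt = sum(c.prompt_tokens for c in self.calls)
        completion = sum(c.completion_tokens for c in self.calls)
        return (
            prompt * self.prompt_price_per_1k
            + completion * self.completion_price_per_1k
        ) / 1000


tracker = UsageTracker(prompt_price_per_1k=1.0, completion_price_per_1k=2.0)
tracker.record(prompt_tokens=1000, completion_tokens=500)
tracker.record(prompt_tokens=2000, completion_tokens=500)
```

Keeping one record per call, rather than only running totals, covers both checklist items: the call count and per-call token numbers can be read from `calls`, and the price is derived from them on demand.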

inactive

Mainly, simplify `_execute_gen` by moving the speculative execution part out into a new function. https://github.com/sgl-project/sglang/blob/5b647543c141a6b21307f3fbc679d2a0a9231c41/python/sglang/lang/interpreter.py#L424