glide
🐦 An open, blazing-fast, simple model gateway for rapid development of production GenAI apps
- Render all durations as human-readable strings
- Add tests
- Update docs
Use ObjectPools to reduce memory allocations for:
- chat request/response schemas
- chat stream chunk schemas
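The pooling idea can be sketched with Go's standard `sync.Pool`. This is an illustration under assumptions, not Glide's code: the `ChatResponse` struct and the `GetChatResponse`/`ReleaseChatResponse` helpers are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// ChatResponse is a hypothetical stand-in for a chat response schema;
// it is not Glide's actual type.
type ChatResponse struct {
	ID      string
	Content string
}

// chatResponsePool reuses ChatResponse values across requests so that
// each request does not have to allocate a fresh one.
var chatResponsePool = sync.Pool{
	New: func() any { return &ChatResponse{} },
}

// GetChatResponse takes a response from the pool. Values are zeroed,
// because ReleaseChatResponse resets them before putting them back.
func GetChatResponse() *ChatResponse {
	return chatResponsePool.Get().(*ChatResponse)
}

// ReleaseChatResponse clears the value and returns it to the pool.
// The caller must not use r after this call.
func ReleaseChatResponse(r *ChatResponse) {
	*r = ChatResponse{}
	chatResponsePool.Put(r)
}

func main() {
	resp := GetChatResponse()
	resp.ID = "req-1"
	resp.Content = "hello"
	fmt.Println(resp.ID, resp.Content)
	ReleaseChatResponse(resp)
}
```

Resetting the value on release keeps data from one request from leaking into the next one that picks up the same pooled object.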
Add a new config option to control logging of incoming request messages and message history, along with LLM response messages. ## Use Case - Request messages and responses may contain sensitive...
Use air (https://github.com/cosmtrek/air) to hot-reload Glide on codebase changes.
Just as Glide offers a unified API for text-to-text, embeddings, and speech, let's try to add support for zero/few-shot classification. Cohere supports this case explicitly. Other providers like OpenAI/Anthropic don't. In...
Design authentication and authorization for the Glide Gateway, so that: - Glide works out of the box for simple use cases - Glide can leverage existing external user management systems...
https://github.com/zilliztech/GPTCache/blob/main/docs/usage.md#Use-GPTCache-server
## Requirements
- Mask PHI, PII, and PCI information in lang/embedding requests
- Unmask all masked values before returning LLM chat responses
- Unmask masked tokens in streaming chat...
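The mask-then-unmask round trip for non-streaming responses can be sketched as follows. This is a simplified illustration, not a proposed design: it only detects email addresses with a deliberately naive regular expression, and the `Mask`/`Unmask` helpers and the `<EMAIL_n>` token format are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// emailPattern is a deliberately simple, illustrative pattern.
// Real PHI/PII/PCI detection needs far more than one regex.
var emailPattern = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// Mask replaces each email with a placeholder token and returns the
// masked text plus a token-to-original mapping for later unmasking.
// Repeated occurrences of the same email share one token.
func Mask(text string) (string, map[string]string) {
	tokens := make(map[string]string) // token -> original value
	seen := make(map[string]string)   // original value -> token
	masked := emailPattern.ReplaceAllStringFunc(text, func(m string) string {
		if t, ok := seen[m]; ok {
			return t
		}
		t := fmt.Sprintf("<EMAIL_%d>", len(tokens)+1)
		tokens[t] = m
		seen[m] = t
		return t
	})
	return masked, tokens
}

// Unmask restores the original values in a response that may echo tokens.
func Unmask(text string, tokens map[string]string) string {
	for t, orig := range tokens {
		text = strings.ReplaceAll(text, t, orig)
	}
	return text
}

func main() {
	masked, tokens := Mask("Contact alice@example.com or bob@example.org")
	fmt.Println(masked)
	fmt.Println(Unmask("I will write to <EMAIL_1> first.", tokens))
}
```

Streaming responses would need extra care, since a token can be split across chunks; that case is not handled in this sketch.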
Support incoming request compression natively in Glide to optimize LLM cost: - https://github.com/microsoft/LLMLingua
To be groomed ## Use Cases - I want to have a way to block some "harmful" requests, so they are not sent to LLM providers. Instead, I want to...