Matan Kleyman
### Feature request Currently, when we run "openllm build --backend pt", the build process downloads the model and builds a bento. However, when we afterwards run "openllm build --backend vllm"...
Hi, thanks for this amazing repo! I successfully trained a model on custom data and achieved great results. But when I try training on 2 GPUs using --strategy==gpus I met...
**Is your feature request related to a problem? Please describe.** Currently it is only...
It would be great to add examples of how to use Declarai with llama index or other data-retrieval frameworks to enable data capabilities.
1. Implement HuggingfaceLLM, which inherits from BaseLLM
2. Implement HuggingfaceOperator, which inherits from BaseOperator
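The two steps above could be sketched roughly as follows. Note that the `BaseLLM` and `BaseOperator` interfaces here are hypothetical stand-ins (only the class names come from the issue; the real Declarai signatures may differ), and the `transformers` pipeline is an assumed dependency:

```python
from abc import ABC, abstractmethod

# Hypothetical stand-ins for Declarai's BaseLLM/BaseOperator interfaces;
# the actual method names and signatures may differ.
class BaseLLM(ABC):
    @abstractmethod
    def predict(self, prompt: str, **kwargs) -> str: ...

class BaseOperator(ABC):
    @abstractmethod
    def predict(self, **kwargs) -> str: ...

class HuggingfaceLLM(BaseLLM):
    """Wraps a local Hugging Face text-generation pipeline."""

    def __init__(self, model_name: str):
        self.model_name = model_name
        self._pipeline = None  # created lazily so the sketch imports nothing heavy

    def predict(self, prompt: str, **kwargs) -> str:
        if self._pipeline is None:
            # assumed dependency: pip install transformers
            from transformers import pipeline
            self._pipeline = pipeline("text-generation", model=self.model_name)
        return self._pipeline(prompt, **kwargs)[0]["generated_text"]

class HuggingfaceOperator(BaseOperator):
    """Builds the prompt and delegates generation to the LLM."""

    def __init__(self, llm: HuggingfaceLLM):
        self.llm = llm

    def predict(self, **kwargs) -> str:
        prompt = kwargs.get("prompt", "")
        return self.llm.predict(prompt)
```

The lazy pipeline creation keeps model loading out of the constructor, which matters when many operators share one LLM instance.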
It would be great to declare that a function returns a Literal["str1", "str2"] or EnumCls. This is particularly useful when I want the return type to be a subset of specific...
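A minimal sketch of how a framework could support this: introspect a `Literal` or `Enum` annotation to get the permitted values, then validate the model's answer against them. The helper names (`allowed_values`, `validate`) are hypothetical, not part of any existing API:

```python
from enum import Enum
from typing import Literal, get_args

Sentiment = Literal["positive", "negative", "neutral"]

class Color(Enum):
    RED = "red"
    BLUE = "blue"

def allowed_values(annotation) -> list:
    """Extract the permitted values from a Literal or Enum annotation,
    e.g. to build a constrained prompt or to check the LLM's reply."""
    if isinstance(annotation, type) and issubclass(annotation, Enum):
        return [member.value for member in annotation]
    return list(get_args(annotation))

def validate(value, annotation):
    """Reject any value outside the declared subset."""
    if value not in allowed_values(annotation):
        raise ValueError(f"{value!r} not in {allowed_values(annotation)}")
    return value
```

The same `allowed_values` list can also be injected into the prompt ("answer with one of: ...") so the model is steered toward the declared subset before validation even runs.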
Currently the middleware implements before & after side-effect execution. In streaming, the after side effect runs only once the full streaming response has been iterated; only then do we execute...
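One way to express the behavior described above is to wrap the streaming generator: the before hook fires as iteration starts, chunks are yielded through unchanged, and the after hook fires with the accumulated text once the stream is exhausted. This is a minimal sketch with hypothetical hook names, not Declarai's actual middleware API:

```python
from typing import Callable, Iterable, Iterator

def wrap_stream(stream: Iterable[str],
                before: Callable[[], None],
                after: Callable[[str], None]) -> Iterator[str]:
    """Yield chunks as they arrive; run `before` when iteration begins
    and `after` (with the full accumulated text) when it ends.
    Hook names are illustrative only."""
    before()
    chunks = []
    try:
        for chunk in stream:
            chunks.append(chunk)
            yield chunk
    finally:
        # runs even if the consumer abandons the stream early,
        # so the after side effect is never silently skipped
        after("".join(chunks))
```

The `try/finally` is the key design choice: without it, a consumer that breaks out of the loop early would leave the after side effect unexecuted.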
@LukeForeverYoung Hey! Thanks for sharing this amazing work! Are the model weights and inference code available? I would be happy to test them locally.