
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!

Results 82 llmaz issues
Sorted by recently updated

**What would you like to be added**: Right now, we have at most two inferenceModes in backendRuntime: one is Default, the other is SpeculativeDecoding. What if people want to customize their...

feature
needs-priority
needs-triage

**What would you like to be added**: It would be great to support benchmarking LLM throughput or latency across different backends. **Why is this needed**: Provide proofs for...

help wanted
feature
needs-priority
needs-triage
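The issue body above is truncated, but as a rough illustration of the kind of measurement it asks for, here is a minimal, backend-agnostic latency/throughput harness. The `call_backend` callable is a hypothetical stand-in for a real inference request (e.g. an HTTP call to a vLLM or SGLang server); this is a sketch, not llmaz's benchmarking design.

```python
import statistics
import time

def benchmark(call_backend, prompts):
    """Measure per-request latency and overall throughput for a backend.

    `call_backend` is any callable taking a prompt and returning a
    completion; in a real benchmark it would issue a request to an
    inference server (vLLM, SGLang, llama.cpp, ...).
    """
    latencies = []
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        call_backend(prompt)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "requests": len(prompts),
        "throughput_rps": len(prompts) / elapsed,
        "mean_latency_s": statistics.mean(latencies),
        "p95_latency_s": sorted(latencies)[int(0.95 * (len(latencies) - 1))],
    }

# Example with a dummy backend that just echoes the prompt uppercased.
stats = benchmark(lambda p: p.upper(), ["hello"] * 100)
print(stats["requests"])
```

Swapping in different `call_backend` implementations would let the same harness compare backends on identical prompt sets.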

**What would you like to be added**: Right now, llmaz is mostly designed for large language models; however, some users may need support for traditional models as a singleton solution,...

feature
needs-priority
needs-triage

**What would you like to be added**: See https://github.com/spotinst as an example, which means we should support multiple cloud providers. **Why is this needed**: Cost saving for users. **Completion requirements**:...

feature
needs-priority
needs-triage

**What would you like to be added**: Generally,
- if users use object stores, they can use Fluid as a distributed caching system
- if users use OCI images, they can...

feature
important-soon
needs-triage

**What would you like to be added**: Support filesystems with the URI protocol `pvc://`; this is compatible with distributed cache systems like Fluid in the future. **Why is this...

feature
needs-priority
needs-triage
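The `pvc://` scheme requested above suggests addressing model artifacts inside a Kubernetes PersistentVolumeClaim. As a sketch of how such a URI might split into a claim name and an in-volume path, standard URL parsing is enough; the `pvc://<claim-name>/<path>` convention and the function name here are assumptions for illustration, not llmaz's actual implementation.

```python
from urllib.parse import urlparse

def parse_pvc_uri(uri):
    """Split a pvc:// URI into (claim_name, path_inside_volume).

    Hypothetical convention: pvc://<claim-name>/<path> — the host part
    names the PersistentVolumeClaim and the path points into the volume.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "pvc":
        raise ValueError(f"expected a pvc:// URI, got {uri!r}")
    if not parsed.netloc:
        raise ValueError("pvc:// URI must name a claim")
    return parsed.netloc, parsed.path.lstrip("/")

claim, path = parse_pvc_uri("pvc://model-cache/llama-3/8b")
print(claim, path)  # → model-cache llama-3/8b
```

A controller could then mount the named claim into the inference pod and point the backend at the in-volume path.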

At first glance, since the Models are published by the admins, it may be fine because the data source is under supervision. Or is that a user need?

question
needs-priority
needs-triage

**What would you like to be added**: ollama provides an [SDK](https://github.com/ollama/ollama-python) for integrations, so we can easily integrate with it. One of the benefits I can think of is that ollama maintains a...

feature
needs-priority
needs-triage

**What would you like to be added**: For inference scenarios, prompt management is an important part. **Why is this needed**: Easy to use for inference users. **Completion requirements**:...

feature
needs-priority
needs-triage
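The request above is truncated; as one hedged sketch of what "prompt management" could mean at its simplest, here is a tiny named-template registry. The class, method names, and versioning scheme are illustrative assumptions, not an llmaz API.

```python
from string import Template

class PromptStore:
    """Minimal registry of named, versioned prompt templates."""

    def __init__(self):
        self._templates = {}

    def register(self, name, template, version=1):
        # Store the raw template keyed by (name, version).
        self._templates[(name, version)] = Template(template)

    def render(self, name, version=1, **variables):
        # Substitute $-style placeholders with the given variables.
        return self._templates[(name, version)].substitute(**variables)

store = PromptStore()
store.register("summarize", "Summarize the following text:\n$text")
print(store.render("summarize", text="llmaz is an inference platform."))
```

A real system would likely add persistence and access control on top, but the core operations are just register and render.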

xref: https://github.com/InftyAI/llmaz/issues/20

enhancement
feature
needs-priority
needs-triage