llmaz
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
**What would you like to be added**: Right now, we have at most two inferenceModes in backendRuntime: one is Default, the other is SpeculativeDecoding. What if people want to customize their...
**What would you like to be added**: It would be super great to support benchmarking LLM throughput and latency across different backends. **Why is this needed**: Provide proof for...
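A rough sketch of what such a benchmark could look like, assuming the backend exposes an OpenAI-compatible `/v1/completions` endpoint (as vLLM and several other inference servers do); the endpoint URL, model name, and prompt below are placeholders, not part of the issue:

```python
import statistics
import time

import requests  # third-party: pip install requests

# Hypothetical OpenAI-compatible endpoint; adjust for the backend under test.
ENDPOINT = "http://localhost:8080/v1/completions"
PAYLOAD = {"model": "my-model", "prompt": "Hello, world", "max_tokens": 64}

def bench(n_requests: int = 10) -> None:
    latencies = []
    completion_tokens = 0
    for _ in range(n_requests):
        start = time.perf_counter()
        resp = requests.post(ENDPOINT, json=PAYLOAD, timeout=60)
        resp.raise_for_status()
        latencies.append(time.perf_counter() - start)
        # OpenAI-compatible responses report token counts under "usage".
        completion_tokens += resp.json()["usage"]["completion_tokens"]
    print(f"p50 latency: {statistics.median(latencies):.3f} s")
    print(f"throughput:  {completion_tokens / sum(latencies):.1f} tokens/s")

if __name__ == "__main__":
    bench()
```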
**What would you like to be added**: Right now, llmaz is mostly designed for large language models; however, some users may need to serve traditional models as a singleton solution,...
**What would you like to be added**: See https://github.com/spotinst as an example; this means we should support multiple cloud providers. **Why is this needed**: Cost savings for users. **Completion requirements**:...
**What would you like to be added**: Generally,
- if users use object stores, they can use Fluid as a distributed caching system
- if users use OCI images, they can...
**What would you like to be added**: Support filesystems with the URI protocol `pvc://`; this is compatible with distributed cache systems like Fluid in the future. **Why is this...
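To make the `pvc://` idea concrete, here is a small Python sketch of how such a URI might be split into a PersistentVolumeClaim name and a path inside the volume; the `pvc://<claim-name>/<subpath>` layout is an assumption for illustration, not a format the issue specifies:

```python
from urllib.parse import urlparse

def parse_pvc_uri(uri: str) -> tuple[str, str]:
    """Split a hypothetical pvc://<claim-name>/<subpath> URI into the
    PersistentVolumeClaim name and the model path inside the volume."""
    parsed = urlparse(uri)
    if parsed.scheme != "pvc":
        raise ValueError(f"expected pvc:// scheme, got {parsed.scheme!r}")
    # netloc carries the claim name; path carries the location in the volume.
    return parsed.netloc, parsed.path.lstrip("/")

print(parse_pvc_uri("pvc://models-cache/llama-3/8b"))
# -> ('models-cache', 'llama-3/8b')
```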
At first glance, because the Models are published by admins, it may be OK since the data source is under supervision. Or is that a user need?
**What would you like to be added**: ollama provides an [SDK](https://github.com/ollama/ollama-python) for integrations, and we can easily integrate with it; one benefit I can think of is that ollama maintains a...
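For reference, calling ollama through its Python SDK takes only a few lines; this minimal sketch assumes a running local ollama server and uses a placeholder model name (the call shape follows the ollama-python README):

```python
import ollama  # pip install ollama

# Talks to a running ollama server (default: http://localhost:11434).
response = ollama.chat(
    model="llama3",  # placeholder; any locally pulled model works
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```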
**What would you like to be added**: For inference scenarios, prompt management is an important part of the workflow. **Why is this needed**: Ease of use for inference users. **Completion requirements**:...
xref: https://github.com/InftyAI/llmaz/issues/20