SwiftInfer
Efficient AI Inference & Serving
SwiftInfer issues (3 results)
Hello, I would like to ask: Qwen1.5 originally supports a 32k context length. If I fine-tune it, will the supported input length degrade? Is it okay to use yours? Does it...
PagedAttention is a widely used method for LLM serving. It splits the KV cache of a request into multiple blocks, and each block contains multiple slots (tokens). I think that...
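The block/slot layout described above can be sketched as a per-request block table that maps logical token positions to physical cache locations. This is a minimal illustrative sketch, not SwiftInfer's actual implementation; the names (`BlockTable`, `append_token`, `locate`) and the block size are assumptions:

```python
BLOCK_SIZE = 4  # slots (tokens) per block; real systems often use 16 or more

class BlockTable:
    """Per-request mapping from logical token positions to (block, slot)."""

    def __init__(self):
        self.blocks = []      # physical block ids allocated to this request
        self.num_tokens = 0
        self._next_block = 0  # stand-in for a global block allocator

    def append_token(self):
        """Reserve a cache slot for one new token, allocating a block if needed."""
        if self.num_tokens % BLOCK_SIZE == 0:
            self.blocks.append(self._next_block)
            self._next_block += 1
        self.num_tokens += 1

    def locate(self, pos):
        """Map a logical token position to its (block_id, slot)."""
        return self.blocks[pos // BLOCK_SIZE], pos % BLOCK_SIZE

table = BlockTable()
for _ in range(6):        # cache 6 tokens -> two blocks of size 4
    table.append_token()
print(table.locate(5))    # token 5 sits in the second block, slot 1
```

Because blocks are fixed-size and allocated on demand, the KV cache grows without reserving a contiguous region up front, which is the main memory-efficiency win of this scheme.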