aphrodite-engine

[Feature]: xtc sampling support for kai api

Open BlairSadewitz opened this issue 1 year ago • 4 comments

🚀 The feature, motivation and pitch

I use the KoboldAI web UI a lot. KoboldCpp supports XTC (and DRY repetition penalty, and maybe some other samplers), which I don't think the KAI API here supports. You do have XTC in the OpenAI API, though. I'd just use the web UI with the OpenAI API endpoint, but it doesn't seem to do streaming. Am I doing something wrong?

Thanks.

Alternatives

No response

Additional context

No response

BlairSadewitz avatar Sep 25 '24 15:09 BlairSadewitz

I'll enable it very soon, thanks for reminding!

AlpinDale avatar Sep 26 '24 14:09 AlpinDale

Thanks. BTW, take a gander at vLLM PR 8713 ([build] enable existing pytorch (for GH200, aarch64, nightly) #8713). As they note, this is really handy to have for GH200/arm64 and other situations when there's no PyTorch release build available. Sure beats having to wrangle the build system into dealing with it myself, at least, lol.

BlairSadewitz avatar Sep 26 '24 16:09 BlairSadewitz

@BlairSadewitz I was looking into adding this, but I found out it's already enabled. Can you check again?

AlpinDale avatar Sep 26 '24 22:09 AlpinDale

> @BlairSadewitz I was looking into adding this, but I found out it's already enabled. Can you check again?

Hey, didn't see that you'd responded. Yeah, I'll check again, heh.

OH, BTW, what do you think about creating a Python package with the fast_hadamard_transform stuff in it? It's quite possible there's actually no point in doing this (different packages use different revisions of the code, so maybe it would break too often; I really dunno), heh, but various other packages, e.g. QQQ and EETQ (at least), build it themselves. Then when I build those, I could just edit setup.py to use the shared one. QQQ installs a package with that name outright, whereas with EETQ it's built in. Kind of a mess.

BlairSadewitz avatar Oct 14 '24 22:10 BlairSadewitz