研究社交

Results 34 comments of 研究社交

Thanks! I am currently working on a new version of the inference runtime that eliminates waiting time in some CPU-bound cases, and multi-device inference is on its roadmap!

> Well, that's a front-end bug; I'll fix it today. On Apple it should be the Metal backend, which was being filtered out by the web front end.

v0.5.14 ships the updated front end; please try again.

Actually, the prefill speed maxes out at about 256. Going higher than that is not worth it. It is not a limit on the total token length. It...

I see. The backend limits a single request to 4k. I can remove that limit anyway.

The limit has been removed.

Thanks! A C FFI exists ([here](https://github.com/cryscan/web-rwkv-ffi)). It's not as flexible, but it is simple to use and extend. Feel free to extend it for your own usage, or reach...

Ah, I have made it public now.

> The link https://github.com/cryscan/web-rwkv-ffi seems to be dead (404)

Is that helpful to your application?