zenithlight

2 comments by zenithlight

It seems like this was fixed in the source in March. It would be nice if a v2 release could be cut so that we can use it in...

Not sure if this is in scope for what was described, but I have a use case where I want to benchmark the decoding tokens/sec of local LLM API calls....