hau

Results 24 comments of hau

the problem is that the repo itself isn't a valid nextjs app? i have to use the command

sweet will give it a shot On Wed, Mar 15, 2023 at 9:21 AM, aratic wrote: > Python Bindings for llama.cpp: https:/ /...

awesome! where does the model get held in memory? i have a modern GPU but the inference is still not real-time for me

this is going to break the fucking internet

This part of the code in particular needs some work I think

These errors are just the result of

Any updates here?

Thanks for linking! I'm excited. The main concern I have with speculative decoding is that the latency improvements are bounded by the size of the model. Since exllama only seems to...