Alexey Parfenov
I have an RTX 3060 and get the same error.
> It seems that your CUDA driver is not detected

Yes, after I installed the CUDA Toolkit the error went away (in my case). Thank you!
Not exactly a solution, but there's a dev version of KoboldAI that allows splitting the ML workload between the GPU and the CPU: https://github.com/henk717/KoboldAI This version works with `hfj` models that are found...
UPDATE (for future readers): the title was changed.

---

I think that the title of this issue is a little bit misleading. Technically, a custom `device_map` is already supported for...
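For context, here is a minimal sketch of passing a custom `device_map` via transformers + accelerate; the checkpoint name and the exact split are illustrative, not taken from this issue:

```python
from transformers import AutoModelForCausalLM

# Illustrative split: embeddings and head on GPU 0, transformer blocks on the
# CPU. Module names follow the GPT-2 checkpoint layout.
device_map = {
    "transformer.wte": 0,
    "transformer.wpe": 0,
    "transformer.h": "cpu",
    "transformer.ln_f": 0,
    "lm_head": 0,
}

model = AutoModelForCausalLM.from_pretrained("gpt2", device_map=device_map)
```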
> If you think this still needs to be addressed please comment on this thread.

unstale
Can confirm for the `server` too.

```sh
curl -Ss --data '{"n_predict":32, "prompt":"Bob: Hi, Alice!\n", "grammar":"root ::= (\"Bob\" | \"Alice\") \":\""}' http://127.0.0.1:8080/completion
```

```
{"tid":"140147849643840","timestamp":1713292234,"level":"INFO","function":"launch_slot_with_task","line":1037,"msg":"slot is processing task","id_slot":0,"id_task":0}
{"tid":"140147849643840","timestamp":1713292234,"level":"INFO","function":"update_slots","line":2066,"msg":"kv cache rm...
```
The debugger fooled me. It's not actually an empty string; it's a sequence of 3 bytes: `e2808d`. And it seems like the map created by `unicode_utf8_to_byte_map()` does not contain this...
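(For reference, `e2 80 8d` is the UTF-8 encoding of U+200D, ZERO WIDTH JOINER, which is easy to verify:)

```python
# The 3-byte sequence decodes to a single code point: U+200D (ZERO WIDTH JOINER).
ch = b"\xe2\x80\x8d".decode("utf-8")
print(hex(ord(ch)))  # 0x200d
```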
If you ignore the error with `try`/`catch`, the crash is gone and the output seems valid. Though it does not seem like a proper solution.

```diff
diff --git a/llama.cpp b/llama.cpp...
```
> If you think this still needs to be addressed please comment on this thread.

unstale

I guess this will be my monthly routine...
I've just tested that PR and it works. Thank you! I tested it with a 13B model on an RTX 3060. Without `load_in_8bit`, only 10 layers are able to fit into...
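For future readers, a minimal sketch of the kind of loading call being discussed, assuming a transformers + bitsandbytes setup; the checkpoint path is a placeholder:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "path/to/13b-checkpoint",  # placeholder: any ~13B causal LM checkpoint
    device_map="auto",         # let accelerate split layers across GPU and CPU
    load_in_8bit=True,         # quantize weights to 8-bit via bitsandbytes
)
```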