ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
ggml and llama.cpp support [Metal](https://github.com/ggerganov/llama.cpp/pull/1642) — do Apple Silicon users need to use llama.cpp, or can they use ggllm.cpp with Falcon?
I'm stuck with other work. I recently pushed a half-finished branch containing a ton of fixes and changes, but it is not done yet. I also moved from falcon_main to "ggfalcon", which is meant...
I plan to PR today, though it depends on final progress. Computation speed is slow because we do not yet have a mulmat kernel with interleaving broadcast support, so tests...
Just a heads up, given it's been more than a week since the last release: I'm deep in a complete overhaul of a series of behaviors and functions. The core focus is...
The parsing of the parameters is broken. Basically, a parameter was only parsed correctly into its argument if it was the last argument on the command line. For most parameter errors in...
I have always been irritated (also in llama.cpp) by the fact that the help text cannot be piped. Neither `falcon-main --help | less` (the help is by now 85 lines long)...
# Prerequisites - [X] I am running the latest code. Development is very rapid so there are no tagged versions as of now. - [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md)....
# Prerequisites - [X] I am running the latest code. Development is very rapid so there are no tagged versions as of now. - [X] I carefully followed...