Aaron Miller
Colors are currently inconsistent across different machines (or OS configurations?). Ideally, put these all in one place in the QML instead of spreading them out everywhere.
Re-do of [661](https://github.com/nomic-ai/gpt4all/pull/661) - leaving as Draft until building/linking issues are solved. Improves output quality by making these tokenizers more closely match the behavior of the huggingface `tokenizers` based BPE...
### Feature request

A straightforward ggml implementation of a model allocates enough memory to hold the activations from *all the model layers at once.* Outside of explicitly asking it to,...
https://github.com/ggerganov/ggml/issues/217 - adapted from the gpt-neox example and work started in https://github.com/ggerganov/llama.cpp/issues/1602. Only supports 7b right now - 40b multiquery attention gets hairier, as it's 128 query heads with 8 k and...
Noticed this and initially thought it was a difference between q4_k and q4_0, but it's just that smaller models require a higher `ctx-size` to break - it appears to be poorly...
This is mostly to fix https://github.com/allusion-app/Allusion/issues/448 but also speeds up thumbnail generation quite a bit. I know this was [tried before](https://github.com/allusion-app/Allusion/pull/365) and ran into build issues. I looked for...