Aaron Miller issues

Results: 6 issues of Aaron Miller

All UI colors should be hardcoded

Colors are currently inconsistent across different machines (or OS configurations?). Ideally, put these all in one place in the QML instead of spreading them out everywhere. ![image](https://user-images.githubusercontent.com/169252/233662482-d5638931-d4af-45db-bad0-4e235360bebd.png) ![image](https://user-images.githubusercontent.com/169252/233662566-fe9fde2f-5c9b-48d5-ae6a-54cdf93e8147.png)

redo: New tokenizer implementation for MPT and GPT-J

Re-do of [661](https://github.com/nomic-ai/gpt4all/pull/661), leaving as Draft until building/linking issues are solved. Improves output quality by making these tokenizers more closely match the behavior of the huggingface `tokenizers` based BPE...

Use ggml scratch buffers in non-llama models

Feature request: A straightforward ggml implementation of a model allocates enough memory to hold the activations from *all the model layers at once.* Outside of explicitly asking it to,...
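The memory argument in this request can be sketched with simple arithmetic. The snippet below is a minimal illustration, not ggml's API: the function names, the layer count, the per-layer size, and the choice of two alternating scratch buffers are all assumptions made for the example.

```python
# Hypothetical sizes and names for illustration only; this is not ggml code.

def naive_activation_bytes(n_layers: int, bytes_per_layer: int) -> int:
    """Every layer's activations stay allocated for the whole forward pass."""
    return n_layers * bytes_per_layer


def scratch_activation_bytes(n_layers: int, bytes_per_layer: int, n_scratch: int = 2) -> int:
    """Layers write into a fixed number of reusable scratch buffers."""
    return min(n_layers, n_scratch) * bytes_per_layer


def scratch_assignment(n_layers: int, n_scratch: int = 2) -> list[int]:
    """Index of the scratch buffer each layer writes its output into."""
    return [layer % n_scratch for layer in range(n_layers)]


if __name__ == "__main__":
    layers = 32                    # assumed layer count
    per_layer = 64 * 1024 * 1024   # assumed 64 MiB of activations per layer
    print(naive_activation_bytes(layers, per_layer) // 2**20, "MiB without reuse")
    print(scratch_activation_bytes(layers, per_layer) // 2**20, "MiB with two scratch buffers")
```

Under these assumed numbers, reuse bounds activation memory by the number of scratch buffers rather than by the number of layers, since a layer's input can be overwritten once the next layer has consumed it.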

add falcon7b example

https://github.com/ggerganov/ggml/issues/217 Adapted from the gpt-neox example and the work started in https://github.com/ggerganov/llama.cpp/issues/1602. Only supports 7b right now; 40b multiquery attention gets hairier, as it's 128 query heads with 8 k and...

Output on Metal is silently corrupted when out of memory

Noticed this and initially thought it was a difference between q4_k and q4_0, but it's just that smaller models require a higher `ctx-size` to break - it appears to be poorly...

WIP: Use 'sharp' for thumbnail generation

This is mostly to fix https://github.com/allusion-app/Allusion/issues/448, but it also speeds thumbnail generation up quite a bit. I know this was [tried before](https://github.com/allusion-app/Allusion/pull/365) and ran into build issues. I looked for...