virentakia
Loading a model via the CLI using the following model file, and the inference speed and output are exactly as expected: ``` FROM solar-10.7b-instruct-q8_0.gguf TEMPLATE """### System: {{ .System }}...
Great work on the project, really excited to see the outcomes. However, after running the script below, the pruned model (output) seems to be the same size as the...
Amazing work and a fantastic resource, thanks for sharing — this should jump-start the use of LLMs on low-resource devices. Quick question: is there a guide to...