Jorge Diogo
This would really be a good thing and would allow structured extraction libraries like [Sibila](https://github.com/jndiogo/sibila) to integrate Ollama as a provider. Even if the JSON Schema to grammar converter has limitations...
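A minimal sketch of what such an integration could look like once Ollama accepts a JSON Schema object in the `format` field of `/api/chat` (this is an assumption; the model name and schema below are purely illustrative):

```python
import json
import requests

# Illustrative schema: constrain the model to a {name, age} object.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",  # hypothetical model name; use whatever you have pulled
        "messages": [{"role": "user", "content": "Extract: John is 42 years old."}],
        "format": schema,   # assumed: a JSON Schema-valued format field
        "stream": False,
    },
)
data = json.loads(resp.json()["message"]["content"])
print(data)  # e.g. {"name": "John", "age": 42}
```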
In my experience, @jkawamoto's approach is a good one, because it frees RAM/CUDA/other memory even if the Llama object is stuck. I've tried calling `del llama_model`, but this is not...
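A minimal sketch of releasing model memory explicitly, assuming a llama-cpp-python version that exposes `Llama.close()`; on older versions the fallback is `del` plus a forced garbage collection (the model path is illustrative):

```python
import gc
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # hypothetical path
# ... run inference ...

if hasattr(llm, "close"):
    llm.close()   # explicitly frees the llama.cpp context and model weights
else:
    del llm       # del alone only drops the Python reference...
    gc.collect()  # ...so force a collection to run the destructor promptly
```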
For local models, Sibila depends on [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), which is the only component that can cause a segmentation fault. Try using the latest version of llama-cpp-python and make sure it's using...
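A quick way to confirm which llama-cpp-python build is actually loaded, using only its public attributes (the model path is illustrative; `verbose=True` makes llama.cpp print its build and backend info at load time):

```python
import llama_cpp
print(llama_cpp.__version__)  # confirm the installed version

from llama_cpp import Llama
# verbose=True logs which backend (CPU/CUDA/Metal) llama.cpp was compiled with
llm = Llama(model_path="model.gguf", verbose=True)  # hypothetical path
```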