mofanke
Same problem; maybe related to https://github.com/kubeflow/kubeflow/issues/6093.
I worked around it by adding this annotation, and it worked:

```yaml
annotations:
  notebooks.kubeflow.org/http-rewrite-uri: "/"
```
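For anyone else hitting this, a minimal sketch of where that annotation goes on the Notebook custom resource (the name, namespace, and image below are placeholders, not from the original report):

```yaml
apiVersion: kubeflow.org/v1
kind: Notebook
metadata:
  name: my-notebook            # placeholder name
  namespace: kubeflow-user     # placeholder namespace
  annotations:
    # rewrite the proxied URI to / when routing to the notebook
    notebooks.kubeflow.org/http-rewrite-uri: "/"
spec:
  template:
    spec:
      containers:
        - name: my-notebook
          image: codercom/code-server:latest  # example image
```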
https://github.com/kubeflow/kubeflow/pull/6450
This issue involves nondeterministic behavior: the large model's output keeps repeating, and the ollama process must be restarted to fix it.
Vulkan can also be used on AMD GPUs. I wonder if official Vulkan support is being considered.
> @Picaso2 other than the multimodal models we don't _yet_ support loading multiple models into memory simultaneously. What is the use case you're trying to do? I encountered a similar...
Can we use the /healthz endpoint for code-server? https://coder.com/docs/code-server/latest/FAQ#what-is-the-healthz-endpoint
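If code-server is running in a Kubernetes pod, a minimal sketch of pointing a readiness probe at that endpoint might look like this (port 8080 is code-server's default; the timing values are placeholders):

```yaml
readinessProbe:
  httpGet:
    path: /healthz   # code-server's built-in health endpoint
    port: 8080       # code-server's default listen port
  initialDelaySeconds: 5
  periodSeconds: 10
```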
It will work with a private OCI container registry; I've tried it, and it does work.
Maybe this is related: https://github.com/ollama/ollama/issues/2048
If I set a specific graphics card ID, there is no error: `export CUDA_VISIBLE_DEVICES=0`