[Question] Python instead of Ollama
Maybe it's weird, but I want to use Python code instead of Ollama. I have the code below; `ans` is guaranteed to contain a string with the answer. But when I run the code and type "hello" or something similar into Void, the request does arrive: it comes in three times and gets a response each time, yet Void shows "Error Void: Response from model was empty."
My code:

```python
from flask import Flask, request, jsonify, make_response
import requests
import json

app = Flask(__name__)

@app.route('/v1/chat/completions', methods=['POST'])
def local_llm_emulator():
    ans = ""  # I skipped the part where I get the answer into the ans variable
    response = {
        "id": "chatcmpl-123",
        "object": "chat.completion",
        "created": 123456,
        "model": "felo",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": ans
                }
            }
        ]
    }
    response = make_response(response)
    response.content_type = "application/json"
    return response

@app.route('/api/tags', methods=['GET'])
def api_tags():
    return jsonify({"tags": []})

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=11434)
```
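For reference, the OpenAI chat completions format also includes a `finish_reason` per choice and a `usage` block, both of which my response omits. Whether Void actually requires those fields is only a guess on my part, but a payload closer to the full schema would look roughly like this (same `ans` variable as above):

```python
# Sketch of a fuller OpenAI-style payload; that Void needs
# finish_reason/usage is an assumption, not something I've confirmed.
response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 123456,
    "model": "felo",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": ans},
            "finish_reason": "stop",  # marks the answer as complete
        }
    ],
    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
}
```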
LOG:

```
 * Serving Flask app 'm'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:11434
Press CTRL+C to quit
127.0.0.1 - - [17/May/2025 00:29:23] "POST /v1/chat/completions HTTP/1.1" 200 -
127.0.0.1 - - [17/May/2025 00:29:28] "POST /v1/chat/completions HTTP/1.1" 200 -
127.0.0.1 - - [17/May/2025 00:29:33] "POST /v1/chat/completions HTTP/1.1" 200 -
```
And after that I get "Error Void: Response from model was empty."
Can someone point out my mistake? Sorry for such a stupid issue.
Curl:

```bash
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'
```
Answer:

```json
{"choices":[{"index":0,"message":{"content":"Hello! How can I assist you today?","role":"assistant"}}],"created":123456,"id":"chatcmpl-123","model":"v1","object":"chat.completion"}
```
UPD: I modified my code and added the curl output above.
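One thing the curl test above does not exercise is streaming: a client that sends `"stream": true` expects a Server-Sent Events stream of `chat.completion.chunk` objects terminated by `data: [DONE]`, not a single JSON body. Whether Void asks for streaming is an assumption on my part, but a minimal sketch of handling both modes would be:

```python
from flask import Flask, request, Response
import json

app = Flask(__name__)

@app.route('/v1/chat/completions', methods=['POST'])
def chat_completions():
    body = request.get_json(force=True)
    ans = "Hello! How can I assist you today?"  # placeholder answer

    if body.get("stream"):
        def sse():
            # OpenAI-style streaming: one chat.completion.chunk per "data:" line
            chunk = {
                "id": "chatcmpl-123",
                "object": "chat.completion.chunk",
                "created": 123456,
                "model": "felo",
                "choices": [{"index": 0,
                             "delta": {"role": "assistant", "content": ans},
                             "finish_reason": None}],
            }
            yield f"data: {json.dumps(chunk)}\n\n"
            # final chunk carries finish_reason, then the [DONE] sentinel
            chunk["choices"] = [{"index": 0, "delta": {}, "finish_reason": "stop"}]
            yield f"data: {json.dumps(chunk)}\n\n"
            yield "data: [DONE]\n\n"
        return Response(sse(), mimetype="text/event-stream")

    # non-streaming: single chat.completion JSON body
    return {
        "id": "chatcmpl-123",
        "object": "chat.completion",
        "created": 123456,
        "model": "felo",
        "choices": [{"index": 0,
                     "message": {"role": "assistant", "content": ans},
                     "finish_reason": "stop"}],
    }
```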
I don't think I understand. Maybe the LLM runs out of memory; that's the downside of using tools like Void. You'd be better off investing time in learning how the LLM works behind the scenes.