Why do I always get this error on startup: `ValueError: The server of local inference endpoints is not running, please start it first.`
I got the exact same issue. For me, it is caused by `python models_server.py --config configs/config.default.yaml` getting killed without any error message. Any help will be appreciated.

Doesn't the startup service execute `models_server.py` first and then `awesome_chat.py`?
> I got the exact same issue. For me, it is caused by `python models_server.py --config configs/config.default.yaml` getting killed without any error message. Any help will be appreciated.
That happens if you run out of memory. Check your activity monitor.
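One way to confirm this: when the Linux OOM killer terminates a process, the child dies from SIGKILL, and `subprocess` reports that as a negative return code. This is a minimal sketch (the helper name and the self-killing demo command are my own, not part of the project) for checking whether a launched server was killed by a signal rather than exiting with an error:

```python
import signal
import subprocess

def was_sigkilled(cmd):
    """Run cmd and report whether it died from SIGKILL (e.g. the OOM killer)."""
    proc = subprocess.run(cmd)
    # A returncode of -N means the child was terminated by signal N,
    # so -signal.SIGKILL (-9) indicates a SIGKILL, which is what the
    # kernel OOM killer sends.
    return proc.returncode == -signal.SIGKILL

# Demo: a child that SIGKILLs itself, standing in for an OOM-killed server.
print(was_sigkilled(
    ["python3", "-c", "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
))  # True
```

You could wrap the real `python models_server.py --config configs/config.default.yaml` invocation the same way; a `True` result (or an OOM-kill line in `dmesg`) would confirm the memory theory.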
> That happens if you run out of memory. Check your activity monitor.
Are you sure about this? I get the same error even though I use the hybrid/minimal config as suggested and have 24 GB free. The README says it would consume more than 12 GB, but I don't think it would exceed 24 GB, right?