llama3.java
Added llama.cpp-compatible HTTP server.
This PR adds a llama.cpp-compatible HTTP server built on the JDK's built-in HTTP server. I tested it with the HTML and JS resources from llama.cpp.
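For readers unfamiliar with the JDK's built-in server, here is a minimal sketch of the general approach using `com.sun.net.httpserver.HttpServer`. It is not the code in this PR: the class name `MiniLlamaServer`, the `complete` helper, and the placeholder JSON response are hypothetical, and the `/completion` path is assumed to mirror the llama.cpp server endpoint.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

class MiniLlamaServer {

    // Hypothetical stand-in for the model call; a real server would run inference here.
    static String complete(String requestBody) {
        return "{\"content\":\"(generated text)\"}";
    }

    // Starts the server on the loopback interface; port 0 lets the OS pick a free port.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/completion", MiniLlamaServer::handleCompletion);
        server.start();
        return server;
    }

    static void handleCompletion(HttpExchange exchange) throws IOException {
        try {
            if (!"POST".equals(exchange.getRequestMethod())) {
                // -1 means "no response body".
                exchange.sendResponseHeaders(405, -1);
                return;
            }
            String body = new String(exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
            byte[] response = complete(body).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, response.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(response);
            }
        } finally {
            exchange.close();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Listening on " + server.getAddress());
    }
}
```

Once it is running, a request such as `curl -X POST http://127.0.0.1:8080/completion -d '{"prompt":"Hi"}'` should return the placeholder JSON.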
I wrote this implementation a few weeks ago, so it may not be mergeable directly.
I have also added an OpenAI-compatible Chat Completions API (with streaming) along with an HTTP session. As a result, the code is now more extensive: it does not use DTOs or JAX-RS, but it also has no external dependencies.
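To illustrate what streaming without DTOs or a JSON library can look like, here is a rough sketch that formats tokens as server-sent events in the style of OpenAI's `chat.completion.chunk` objects. It is not the PR's code: `ChatStreamSketch` and its methods are hypothetical names, and the chunk is reduced to a few fields (real chunks also carry fields such as `id`, `created`, and `model`).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

class ChatStreamSketch {

    // Minimal JSON string escaping, written by hand to avoid a JSON dependency.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // One server-sent event carrying a single token as a delta (simplified chunk shape).
    static String sseChunk(String token) {
        return "data: {\"object\":\"chat.completion.chunk\",\"choices\":[{\"index\":0,"
                + "\"delta\":{\"content\":\"" + jsonEscape(token) + "\"}}]}\n\n";
    }

    // Writes each token as its own event, flushing so the client sees it immediately,
    // then sends the terminating [DONE] event.
    static void streamTokens(Iterable<String> tokens, OutputStream out) throws IOException {
        for (String token : tokens) {
            out.write(sseChunk(token).getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
        out.write("data: [DONE]\n\n".getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        streamTokens(List.of("Hello", ", ", "world"), System.out);
    }
}
```

With the JDK server, a handler would typically set `Content-Type: text/event-stream`, call `sendResponseHeaders(200, 0)` to select chunked transfer encoding, and pass `exchange.getResponseBody()` as the output stream.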