Feat: multiple endpoints using a list of LitServer
Before submitting
- [x] Was this discussed/agreed via a GitHub issue? (no need for typos and docs improvements)
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
⚠️ How does this PR impact the user? ⚠️
As a user, I want to host multiple endpoints for different purposes, such as serving an embedding API, prediction API, etc., on the same server while maintaining LitServer features.
What does this PR do?
Fixes #271.
- This PR introduces a feature that allows running multiple LitServer instances in a combined form, as discussed in issue #271.
Usage
```python
# server.py
from litserve.server import LitServer, run_all
from litserve.test_examples import SimpleLitAPI


class SimpleLitAPI1(SimpleLitAPI):
    def setup(self, device):
        self.model = lambda x: x**1


class SimpleLitAPI2(SimpleLitAPI):
    def setup(self, device):
        self.model = lambda x: x**2


class SimpleLitAPI3(SimpleLitAPI):
    def setup(self, device):
        self.model = lambda x: x**3


class SimpleLitAPI4(SimpleLitAPI):
    def setup(self, device):
        self.model = lambda x: x**4


if __name__ == "__main__":
    server1 = LitServer(SimpleLitAPI1(), api_path="/predict-1")
    server2 = LitServer(SimpleLitAPI2(), api_path="/predict-2")
    server3 = LitServer(SimpleLitAPI3(), api_path="/predict-3")
    server4 = LitServer(SimpleLitAPI4(), api_path="/predict-4")
    run_all([server1, server2, server3, server4], port=8000)
```
```python
# client.py
import requests

for i in range(1, 5):
    resp = requests.post(f"http://127.0.0.1:8000/predict-{i}", json={"input": 4.0})
    assert resp.status_code == 200, f"Expected response to be 200 but got {resp.status_code}"
    assert resp.json() == {"output": 4.0**i}, f"Expected response to be {4.0**i} but got {resp.json()}"
```
PR review
Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃
Codecov Report
Attention: Patch coverage is 98.43750% with 1 line in your changes missing coverage. Please review.
Project coverage is 95%. Comparing base (f475369) to head (e5db967).
Additional details and impacted files

```diff
@@           Coverage Diff           @@
##            main     #276   +/-  ##
=====================================
  Coverage     95%      95%
=====================================
  Files         14       14
  Lines       1082     1143   +61
=====================================
+ Hits        1025     1085   +60
- Misses        57       58    +1
```
@bhimrazy wow, love the api. nice job!
side question, shouldn’t the port also be tied to each server? not in the run_all function?
ie: i want a server on 8000 another on 8001? cc @lantiga
Thanks, @williamFalcon! 🙌
Regarding the question: yes, if you mean hosting each LitServer on a separate port.
In the current PR (#276), the routes from each LitServer are combined and the resulting single app is hosted, so all routes are accessible from the same port.
(Maybe we should consider renaming the `run_all` function to better reflect this use case.)
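For intuition, here is a stdlib-only sketch (not the actual litserve implementation) of the idea behind `run_all`: each server contributes one route, and the merged app is just a path-to-handler table behind a single listening port.

```python
# Hypothetical sketch of "many endpoints, one port" using only the stdlib.
# This is NOT litserve code; it only illustrates the combined-routing idea.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Each "server" contributes one route; merging them is just building one
# path -> handler table served from a single socket.
ROUTES = {f"/predict-{i}": (lambda x, i=i: x**i) for i in range(1, 5)}


class CombinedHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        handler = ROUTES.get(self.path)
        if handler is None:
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"output": handler(payload["input"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


def serve_combined(port=0):
    """Start the combined server on a daemon thread; return (server, bound port)."""
    server = HTTPServer(("127.0.0.1", port), CombinedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]
```

The client loop from the PR description works against this sketch unchanged, since all four paths answer on the same port.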
However, to host each LitServer on an individual port, you can still use the default method and run them separately:
```python
server1 = LitServer(SimpleLitAPI1(), api_path="/predict-1")
server1.run(port=8000)
```

```python
server2 = LitServer(SimpleLitAPI2(), api_path="/predict-2")
server2.run(port=8001)
```
Hope this clarifies things!
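As a stdlib-only illustration of that one-server-per-port alternative (again, not litserve itself), each model can get its own `HTTPServer` bound to its own port:

```python
# Hypothetical sketch: one independent server per model, each on its own
# (here ephemeral) port. Not litserve code; it only mirrors the pattern of
# calling server.run(port=...) separately for each LitServer.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(model):
    """Build a request handler class wrapping a single model callable."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            body = json.dumps({"output": model(payload["input"])}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):  # silence per-request logging
            pass

    return Handler


def serve_on_own_port(model):
    """Start one server for one model on a daemon thread; return (server, port)."""
    server = HTTPServer(("127.0.0.1", 0), make_handler(model))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]
```

The trade-off versus the combined approach above is isolation (one process/port per model) against the convenience of a single port and shared server features.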
Still can't use this function.
In the meantime, until this gets merged, is there any other way to run multiple endpoints in one main server?
This is paused and not scheduled to be merged until we have a very clear use case.
So the best way to unblock it is to share the code of what you are trying to do, and explain why you wouldn't just run two separate servers on the same machine.
@VikramxD
Hi! Do you know when this PR will be merged? In my case, multiple machine learning engineers add their embedding or prediction models to one codebase, and this main litserve server is then run on K8s.
Thanks!
Hi @raulcarlomagno,
Not yet, unfortunately.
If this feature is crucial for your use case, you can consider using this snippet from this PR. It should help you achieve the desired functionality for now.
thumbs up for this piece of art 👍
Closing this PR for now. Issue #271 will be addressed soon in a separate PR. There is an ongoing discussion around it, and since this PR is quite old and the implementation might take a different approach, it's better to start fresh with a new PR.