Tian Xia
bumping for this @Michaelvll @romilbhardwaj - are there any changes I need to make for this PR? IIUC it will automatically skip the autostop for k8s controller?
I'm wondering if putting a `ports` section in `task.yaml` is a good idea, since in #1850 and AWS's proposed implementation in #1487, the ports will open for all VMs launched...
> @cblmemo can you post collect_env from your working PyTorch-2.1 installation? > > Also can you please try running 2.2 with LD_LIBRARY_PATH defined to some benign value would resolve the...
> @cblmemo Could you please provide how CUDA/CUDNN was installed on your machine? Are you running it on a Docker image that is publicly available? So we can try...
> This is a known issue and also reported [here](https://discuss.pytorch.org/t/could-not-load-library-libcudnn-cnn-train-so-8-in-new-version/190818). @malfet narrowed it down already with us to a `dlopen` call inside `libcudnn` preferring the system-wide libs over the ones...
@Michaelvll This is ready for a look now 🫡 I'm still running smoke tests and adding a new streaming test for now, will report back later
All skyserve smoke tests passed 🫡
TODO: - [x] investigate side effects for max num connections = 1000 - [x] test whether the worker stops generating if the client aborts the request -...
Tested aborting a request, and it works as well. Use the following script to launch the LB & worker, `curl` `http://0.0.0.0:7000/`, then Ctrl+C the `curl` command. The logging for `===========WORKER` will stop....
Just tested with a [modified version of fastchat](https://github.com/cblmemo/fschat-print-streaming/tree/print-stream) and aborting works well. I used the OpenAI Client [here](https://github.com/skypilot-org/skypilot/blob/master/examples/serve/llama2/chat.py) and manually pressed Ctrl+C to abort the request. YAML I used: ```yaml...