ChristineSeven

Results: 21 comments of ChristineSeven

I'm having the same issue, too.

I have the same issue and don't know what's wrong.

@mrwyattii yes, that solved it! But when I send requests, another issue comes up. Would you help check this? Exception in thread Thread-1: Traceback (most recent call last): File "/usr/lib/python3.8/threading.py", line...

> > @mrwyattii yes, this solved! But when I do requests, another issue cames. Would you help to check this? Exception in thread Thread-1: Traceback (most recent call last): File...

The server code is like this:

```python
import mii

client = mii.serve(
    "mistralai/Mistral-7B-v0.1",
    deployment_name="mistral-deployment",
    enable_restful_api=True,
    restful_api_port=28080,
)
```
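For reference, a minimal request sketch against that RESTful endpoint. The path format `http://localhost:<restful_api_port>/mii/<deployment_name>` and the `{"prompts": ..., "max_length": ...}` payload follow the DeepSpeed-MII README; verify them against your MII version before use.

```python
import json

# Assumption: MII exposes the deployment created above at
#   http://localhost:<restful_api_port>/mii/<deployment_name>
# (endpoint shape taken from the DeepSpeed-MII README, not verified here).
DEPLOYMENT = "mistral-deployment"
PORT = 28080
URL = f"http://localhost:{PORT}/mii/{DEPLOYMENT}"

# Request body: a list of prompts plus a generation-length cap.
payload = {"prompts": ["DeepSpeed is"], "max_length": 64}

if __name__ == "__main__":
    import requests  # third-party: pip install requests

    # POST the JSON payload to the running MII REST server.
    resp = requests.post(
        URL,
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
    )
    print(resp.json())
```

The same request can be issued with `curl --request POST -d '<json>' <url>` if you only want to smoke-test the server.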

Whether I use CUDA graphs or not, there is a memory issue.

> > I also have problems with a memory leak with vllm 0.2.7. For me it's not limited to Ray but also concerns the API server itself, no matter whether...

It's not only version 0.2.7; I tested 0.2.6 and 0.2.3, and they have this issue as well. To reproduce: start the server and leave it running for several hours, and the issue appears. @zhuohan123...

Using your code, I got this error: `module 'lightseq.inference' has no attribute 'Llama'`. Could you tell me how you bypassed this? @HandH1998