Johannes Vass

29 comments by Johannes Vass

I just read about SGLang's approach to constrained decoding. Did you consider adding that to vLLM instead of Outlines? See, for example, this blog article: https://lmsys.org/blog/2024-02-05-compressed-fsm/

I am also seeing a memory leak with vLLM 0.2.7. For me it is not limited to Ray; it also affects the API server itself, no matter whether I use...

For now my workaround is to set a memory limit and restart vllm automatically after OOM.
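
A minimal sketch of such a restart loop, in case it helps others. The comment above does not say how the memory limit was set, so everything here is an assumption: the helper name `supervise`, the use of `RLIMIT_AS` as the limit, and the restart policy are illustrative only. Note that an address-space cap may be too strict for GPU processes, which can reserve large virtual address ranges; a container or cgroup memory limit is a common alternative.

```python
import resource  # POSIX only
import subprocess
import time


def _limit_memory(limit_bytes):
    """Return a hook that caps the child's address space before it starts."""
    def apply():
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
    return apply


def supervise(cmd, max_restarts=3, memory_limit_bytes=None, backoff_seconds=0.0):
    """Run cmd and restart it whenever it exits with a non-zero status.

    Hypothetical helper, not part of vLLM. Returns how many times the
    command was started.
    """
    starts = 0
    while True:
        hook = _limit_memory(memory_limit_bytes) if memory_limit_bytes else None
        proc = subprocess.Popen(cmd, preexec_fn=hook)
        starts += 1
        returncode = proc.wait()
        # Stop on a clean exit, or once the restart budget is used up.
        if returncode == 0 or starts > max_restarts:
            return starts
        time.sleep(backoff_seconds)
```

Calling `supervise(cmd, memory_limit_bytes=...)` with your usual server launch command as `cmd` keeps the process running until it exits cleanly or the restart budget is exhausted.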

> I have a similar question that might be related to it. I see that it's not possible (at least via GUI) to remove files at document set/connector level (hence,...

Which exact settings of the `GEN_AI_` variables did you try? For me the following works with a self-hosted Huggingface TGI:

```
GEN_AI_MODEL_VERSION=""
GEN_AI_MODEL_PROVIDER="huggingface"
HUGGINGFACE_API_BASE="https://xyz"
GEN_AI_API_ENDPOINT="https://xyz"
```

Disclaimer: I am unsure...

@mad-mikey can you already foresee when you will be able to contribute this?

There is already another issue regarding this: #984

```
In [1]: import DeepInstruments
Using TensorFlow backend.
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
 in ()
----> 1 import DeepInstruments

/Users/johannesvass/ownCloud/Studium/2017S_Bachelorarbeit/ismir2016/DeepInstruments/__init__.py in ()
     51 import DeepInstruments.audio
     52 import DeepInstruments.descriptors...
```

### Problem Analysis

The issue seems to be a breaking change in the `tokenizers` library (probably https://github.com/huggingface/tokenizers/pull/1476) which prevents an XLM-Roberta tokenizer saved with a version >= `0.19.0` from being...