Roman Koshkin

Results 4 issues of Roman Koshkin

### Describe the bug I'm using OI with local models as follows: ```bash interpreter \ -y \ --api_base http:://localhost:8000 \ --model openai/gpt-3.5 \ --context_window 4096 \ --api_key=not_needed \ --local ```...

bug

### Feature request I tried to run LLama-3 on TGI (1.3). The model kind of works, but it doesn't stop at the EOS tokens. I suspect TGI doesn't "understand" Llama-3's...

### Feature request A method to prime the response of the model. It can be done by either removing the **assistant**'s closing tag from the template if the last message...

Everything works on my A6000s and A100s, but not on the older V100 (says compute capability is low). Are there plans to add support for the legacy devices? Thanks!