Roman Koshkin issues

Repositories
Issues
Comments

Results 4 issues of


                                            Roman Koshkin

Doesn't work with m-a-p/OpenCodeInterpreter-DS-33B

### Describe the bug I'm using OI with local models as follows: ```bash interpreter \ -y \ --api_base http:://localhost:8000 \ --model openai/gpt-3.5 \ --context_window 4096 \ --api_key=not_needed \ --local ```...

bug

Llama-3 support

### Feature request I tried to run LLama-3 on TGI (1.3). The model kind of works, but it doesn't stop at the EOS tokens. I suspect TGI doesn't "understand" Llama-3's...

Response priming (option to provide the initial part of the assistant's message in the API request)

### Feature request A method to prime the response of the model. It can be done by either removing the **assistant**'s closing tag from the template if the last message...

Does `gpt-fast` work on V100 GPUs?

Everything works on my A6000s and A100s, but not on the older V100 (says compute capability is low). Are there plans to add support for the legacy devices? Thanks!