Results: 16 comments by Daniel Barker

I'm not sure how easy it would be to integrate this into TEI, but have you looked into the [Nvidia Triton Inference Server](https://developer.nvidia.com/triton-inference-server)? It does require some legwork to get...
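To illustrate the kind of integration legwork involved, here is a minimal sketch of building a Triton HTTP/v2 (KServe v2 protocol) inference request for a text-embedding model. The model name, input tensor name, and endpoint are assumptions; they depend entirely on how the model is exported into Triton's model repository.

```python
import json

TRITON_URL = "http://localhost:8000"  # Triton's default HTTP port; assumes a local deployment
MODEL_NAME = "embedding_model"        # hypothetical model name in the model repository


def build_infer_request(texts):
    """Build a KServe v2 inference payload with a BYTES input tensor of raw texts."""
    return {
        "inputs": [
            {
                "name": "TEXT",        # hypothetical input tensor name
                "shape": [len(texts)],
                "datatype": "BYTES",
                "data": texts,
            }
        ]
    }


payload = build_infer_request(["hello world"])
# This payload would be POSTed to:
#   f"{TRITON_URL}/v2/models/{MODEL_NAME}/infer"
# e.g. requests.post(url, data=json.dumps(payload))
print(json.dumps(payload))
```

The response comes back with an `outputs` list in the same v2 format; mapping that onto TEI's embedding response shape is where most of the glue work would go.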

For visibility, [here is a link to the repo](https://github.com/ramblingjordan/AbBOT-api) that implements an API for the option you outlined in (1).

> Hello? Is there any WiP regarding ROCm support for TEI? Thanks!

I'm interested in this as well. I may start working on it myself, but I don't want to duplicate efforts...

Currently using MI250s.

> I was able to reproduce this, and it seems the phi model's config changed slightly, which caused TGI to load the weights incorrectly; for reference, the number of heads...
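The failure mode described above can be sketched with a simple sanity check: if a checkpoint was saved with one head count but the loader's config assumes another, the per-head dimension no longer divides the hidden size and the attention weights get reshaped incorrectly, producing garbage output. The numbers below are made up for illustration and are not the actual phi config values.

```python
def check_heads(config):
    """Verify that hidden_size splits evenly across attention heads."""
    hidden = config["hidden_size"]
    heads = config["num_attention_heads"]
    if hidden % heads != 0:
        return f"invalid: hidden_size {hidden} not divisible by {heads} heads"
    return f"ok: head_dim = {hidden // heads}"


# Hypothetical configs, illustrating a head-count drift between versions.
old_config = {"hidden_size": 2048, "num_attention_heads": 32}
new_config = {"hidden_size": 2048, "num_attention_heads": 24}

print(check_heads(old_config))  # ok: head_dim = 64
print(check_heads(new_config))  # invalid: hidden_size 2048 not divisible by 24 heads
```

Note that a drifted config can still pass this check if the new head count happens to divide the hidden size, in which case the weights load without error but the output is scrambled, which matches the "garbage tokens" symptom reported below.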

> I'm getting the same issue as you. No matter what request I send, I only get garbage responses containing the text "Vlad", "sten", "cu", and other random words. Unfortunately...