Results: 16 comments by Daniel Barker

I'm not sure how easy it would be to integrate this into TEI, but have you looked into the [Nvidia Triton Inference Server](https://developer.nvidia.com/triton-inference-server)? It does require some legwork to get...
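To illustrate the kind of integration legwork involved, here is a minimal sketch of building a Triton HTTP/v2 (KServe v2 protocol) inference request for a text-embedding model. The model name, input tensor name, and endpoint are assumptions; they depend entirely on how the model is exported into Triton's model repository.

```python
import json

TRITON_URL = "http://localhost:8000"  # Triton's default HTTP port; assumes a local deployment
MODEL_NAME = "embedding_model"        # hypothetical model name in the model repository


def build_infer_request(texts):
    """Build a KServe v2 inference payload with a BYTES input tensor of raw texts."""
    return {
        "inputs": [
            {
                "name": "TEXT",        # hypothetical input tensor name
                "shape": [len(texts)],
                "datatype": "BYTES",
                "data": texts,
            }
        ]
    }


payload = build_infer_request(["hello world"])
# This payload would be POSTed to:
#   f"{TRITON_URL}/v2/models/{MODEL_NAME}/infer"
# e.g. requests.post(url, data=json.dumps(payload))
print(json.dumps(payload))
```

The response comes back with an `outputs` list in the same v2 format; mapping that onto TEI's embedding response shape is where most of the glue work would go.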

For visibility, [here is a link to the repo](https://github.com/ramblingjordan/AbBOT-api) that implements an API for the option you outlined in (1).

> Hello? Is there any WiP regarding ROCm support for TEI? Thanks!

I'm interested in this as well. I may start working on it myself, but I don't want to duplicate efforts...

Currently using MI250s.

> I was able to reproduce this, and it seems the phi model's config changed slightly, which caused TGI to load the weights incorrectly; for reference, the number of heads...
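The failure mode described above can be sketched with a simple sanity check: if a checkpoint was saved with one head count but the loader's config assumes another, the per-head dimension no longer divides the hidden size and the attention weights get reshaped incorrectly, producing garbage output. The numbers below are made up for illustration and are not the actual phi config values.

```python
def check_heads(config):
    """Verify that hidden_size splits evenly across attention heads."""
    hidden = config["hidden_size"]
    heads = config["num_attention_heads"]
    if hidden % heads != 0:
        return f"invalid: hidden_size {hidden} not divisible by {heads} heads"
    return f"ok: head_dim = {hidden // heads}"


# Hypothetical configs, illustrating a head-count drift between versions.
old_config = {"hidden_size": 2048, "num_attention_heads": 32}
new_config = {"hidden_size": 2048, "num_attention_heads": 24}

print(check_heads(old_config))  # ok: head_dim = 64
print(check_heads(new_config))  # invalid: hidden_size 2048 not divisible by 24 heads
```

Note that a drifted config can still pass this check if the new head count happens to divide the hidden size, in which case the weights load without error but the output is scrambled, which matches the "garbage tokens" symptom reported below.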

> I'm getting the same issue as you. No matter what request I send, I only get garbage responses containing the text "Vlad", "sten", "cu", and other random words. Unfortunately...