Daniel Barker
### Feature request

HuggingFace has a nice API for serving local LLMs in their [`text-generation`](https://github.com/huggingface/text-generation-inference) repo. I'd like to have a wrapper for this implemented in `langchain.llms`.

**Resolves Issues** *...
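For context, here is a minimal sketch of how the underlying inference server is typically queried with the companion `text_generation` client library; the server URL and generation parameters are illustrative assumptions, not part of the proposal:

```python
# Sketch: querying a locally served text-generation-inference endpoint
# with the companion `text_generation` client. The URL and parameter
# values below are illustrative assumptions.
from text_generation import Client

client = Client("http://localhost:8080")  # assumed local TGI server

# Single-shot generation: the response's `generated_text` field holds
# the completion.
response = client.generate("What is Deep Learning?", max_new_tokens=64)
print(response.generated_text)
```

A LangChain wrapper would essentially delegate its `_call` to this client, mapping the LLM's generation parameters onto `generate`'s keyword arguments.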
# Added Feature for HF text-generation LLM wrapper

Adds a wrapper (and simple tests) to allow usage of the HuggingFace [`text-generation`](https://github.com/huggingface/text-generation-inference) LLM inference API. It works really well...
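As a rough illustration of the intended usage (the import path follows the issue below; the endpoint and parameter values are placeholder assumptions):

```python
# Hypothetical usage sketch of the new wrapper; endpoint and parameter
# values are assumed for illustration.
from langchain.llms.huggingface_text_gen_inference import HuggingFaceTextGenInference

llm = HuggingFaceTextGenInference(
    inference_server_url="http://localhost:8010",  # assumed TGI endpoint
    max_new_tokens=512,
    temperature=0.01,
)

print(llm("What is Deep Learning?"))
```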
# Added support for streaming output response to HuggingFaceTextGenInference LLM class

The current implementation does not support streaming output; this updates it to incorporate that feature. Tagging @agola11 for visibility.

Fixes [Issue #4631](https://github.com/hwchase17/langchain/issues/4631)...
### Feature request

Per the title, the request is to add streaming output support, something like this:

```python
from langchain.llms.huggingface_text_gen_inference import HuggingFaceTextGenInference
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = HuggingFaceTextGenInference(
    inference_server_url='http://localhost:8010',
    ...
```
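Since the snippet above is truncated, here is a complete minimal sketch of the requested streaming usage; everything past `inference_server_url`, including the `stream` flag, is an assumption filled in for illustration:

```python
# Minimal streaming sketch. All arguments past `inference_server_url`
# are illustrative assumptions, as the original snippet is truncated.
from langchain.llms.huggingface_text_gen_inference import HuggingFaceTextGenInference
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = HuggingFaceTextGenInference(
    inference_server_url='http://localhost:8010',
    max_new_tokens=512,
    temperature=0.01,
    stream=True,  # assumed flag enabling token-by-token output
)

# Tokens are printed to stdout as they arrive via the callback handler.
llm("What is Deep Learning?", callbacks=[StreamingStdOutCallbackHandler()])
```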
Thanks for all your work on this project. I was able to get it running in a Docker container the other day (using the `dev_camera` branch), and I'm really impressed with...