
Is streaming supported with langchain AsyncIteratorCallbackHandler?

Open AdrianLsk opened this issue 2 years ago • 1 comment

I am getting no responses when using langchain's AsyncIteratorCallbackHandler callback.

It only gives this warning:

RuntimeWarning: coroutine 'AsyncCallbackManagerForLLMRun.on_llm_new_token' was never awaited
  run_manager.on_llm_new_token(chunk, verbose=self.verbose)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
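That warning is Python's standard complaint when a coroutine is called but never awaited: the sync code path invokes the async `on_llm_new_token` handler, which just creates a coroutine object and discards it, so the handler body never runs. A minimal stdlib-only sketch of the failure and the fix (the callback here is a simplified stand-in for langchain's real handler, which takes more arguments):

```python
import asyncio

received = []

async def on_llm_new_token(token: str) -> None:
    # Simplified stand-in for AsyncCallbackManagerForLLMRun.on_llm_new_token.
    received.append(token)

def sync_stream(tokens):
    # The bug: each call creates a coroutine object that is never awaited,
    # so the handler never runs and Python emits the RuntimeWarning above.
    for t in tokens:
        on_llm_new_token(t)

async def async_stream(tokens):
    # The fix: await the callback so each token actually reaches the handler.
    for t in tokens:
        await on_llm_new_token(t)

sync_stream(["a", "b"])                # received stays empty, warning emitted
asyncio.run(async_stream(["a", "b"]))  # received becomes ["a", "b"]
```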

AdrianLsk avatar Aug 19 '23 23:08 AdrianLsk

I don't think the current langchain wrapper supports async calls, but it shouldn't be too hard to add, as the model.stream() call already releases the GIL internally while generating tokens. But you would have to ensure that the model is never used in parallel, as that would probably cause memory-access problems, or simply crash if you offloaded your model onto a GPU.
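The two constraints above (drive the blocking generator without blocking the event loop, and never use the model in parallel) can be sketched with stdlib asyncio alone. `blocking_stream` below is a hypothetical stand-in for llm-rs's `model.stream()`; the real integration would substitute the actual model call:

```python
import asyncio

# One generation at a time: the lock serializes access to the single model.
_model_lock = asyncio.Lock()

def blocking_stream(prompt):
    # Stand-in for model.stream(): blocks while generating, one token at a time.
    for word in prompt.split():
        yield word

async def astream(prompt):
    loop = asyncio.get_running_loop()
    async with _model_lock:
        it = blocking_stream(prompt)
        while True:
            # Pull each token in the default thread pool, so the event loop
            # (and any async callback handlers) stay responsive meanwhile.
            token = await loop.run_in_executor(None, next, it, None)
            if token is None:
                break
            yield token

async def main():
    return [t async for t in astream("hello async world")]

tokens = asyncio.run(main())
```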

Do you know which function needs to be implemented by langchain's LLM class to enable async processing?

LLukas22 avatar Aug 20 '23 09:08 LLukas22
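If I remember the langchain API correctly, async support on an `LLM` subclass comes from overriding `_acall` (newer versions also have `_astream`). A common pattern is to delegate the existing blocking `_call` to a worker thread. The sketch below is standalone, with langchain's classes and the llm-rs model replaced by hypothetical stand-ins so it runs on its own:

```python
import asyncio
from functools import partial

class FakeRsModel:
    # Stand-in for the llm-rs model; stream() blocks but releases the GIL.
    def stream(self, prompt):
        yield from prompt.split()

class RustLLM:
    """Sketch of the delegation pattern: a blocking _call, plus an async
    _acall that pushes the same work onto the default thread pool."""

    def __init__(self):
        self.model = FakeRsModel()

    def _call(self, prompt, on_token=None):
        out = []
        for tok in self.model.stream(prompt):
            if on_token is not None:
                on_token(tok)  # synchronous per-token callback
            out.append(tok)
        return " ".join(out)

    async def _acall(self, prompt, on_token=None):
        loop = asyncio.get_running_loop()
        # The blocking generation runs in a thread, so the event loop keeps
        # running and async callback handlers can be serviced.
        return await loop.run_in_executor(
            None, partial(self._call, prompt, on_token)
        )

result = asyncio.run(RustLLM()._acall("async works now"))
```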