Move streaming to StreamCallbacks.jl
I would like to propose moving streaming functionality to a separate package (made by me): https://github.com/svilupp/StreamCallbacks.jl
Why?
- Provides an implementation that can be shared across providers (Anthropic, OpenAI, etc.)
- Implements more of the SSE "standard" (e.g., includes event names, splits messages properly, handles messages that overflow between "masterchunks", etc.)
- Allows easy extensibility for users (to choose the sink, logging, etc.)
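To make the SSE point concrete, here is a minimal sketch of what "proper splitting" and carrying a message over between chunks can look like. It is not the StreamCallbacks.jl implementation; `SSEBuffer`, `parse_sse_message`, and `feed!` are hypothetical names made up for illustration, and it only handles the `event` and `data` fields.

```julia
# Minimal SSE splitter sketch (hypothetical names, not the StreamCallbacks.jl API).
# Raw text is buffered so that a message split across two network chunks is
# only emitted once its terminating blank line has arrived.

mutable struct SSEBuffer
    pending::String   # trailing text that does not yet form a complete message
end
SSEBuffer() = SSEBuffer("")

# Parse one complete SSE message into its event name and joined data lines.
function parse_sse_message(raw::AbstractString)
    event = "message"   # default event name when no `event:` field is present
    data = String[]
    for line in split(raw, '\n')
        isempty(line) && continue
        startswith(line, ':') && continue   # comment line (e.g. keep-alive)
        field, value = occursin(':', line) ? split(line, ':'; limit = 2) : (line, "")
        value = startswith(value, ' ') ? value[2:end] : value   # drop one leading space
        if field == "event"
            event = String(value)
        elseif field == "data"
            push!(data, String(value))
        end
    end
    return (event = event, data = join(data, '\n'))
end

# Append a raw chunk and return every message completed so far.
function feed!(buf::SSEBuffer, chunk::AbstractString)
    text = replace(buf.pending * chunk, "\r\n" => "\n")
    parts = split(text, "\n\n")          # a blank line terminates a message
    buf.pending = String(parts[end])     # last part is incomplete (or empty)
    return [parse_sse_message(p) for p in parts[1:end-1] if !isempty(p)]
end

# Usage: the JSON payload is cut in half by the chunk boundary.
buf = SSEBuffer()
msgs1 = feed!(buf, "event: delta\ndata: {\"text\":\"Hel")   # nothing complete yet
msgs2 = feed!(buf, "lo\"}\n\ndata: [DONE]\n\n")             # two complete messages
```

The first call returns no messages because the blank-line terminator has not arrived yet; the second call returns both the `delta` event and the `[DONE]` message. A callback that parses each raw chunk on its own would instead try to read half a JSON object.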
Downsides
- Opinionated
- Extra dependency (lightweight, though)
- By default, it builds the standard Response afterwards (to allow standard downstream functionality as well), which might be wasteful for some users/workflows (but the cost and time are sooo tiny compared to the generation time)
I've been using it in PromptingTools.jl for a few releases already and it seems to work well.
Bump here.
Is there any interest in refactoring the functionality with a new dep? The current implementation is not correct and will lead to more errors.
I would say it's probably better to separate them out -- I'd be happy to see StreamCallbacks.jl implemented here. As to this point:
> By default, it builds the standard Response afterwards (to allow standard downstream functionality as well), which might be wasteful for some users/workflows (but the cost and time are sooo tiny compared to the generation time)
Not a concern on my end, if anything that strikes me as pretty ergonomic.
If the current implementation is not correct and will lead to errors, I say full steam ahead.
I can't seem to find an easy way to stream outputs. Are users meant to manually parse the raw "data: {...}" JSON payloads?
While tinkering, I found myself using
get_delta(x) = contains('{')(x) ? OpenAI.JSON3.read(x[6:end])[:choices][1]["delta"][:content] : ""
which at least works for OpenRouter, and then:
streamcallback = x -> print(get_delta(x))
within the create_chat call.
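For anyone copying the one-liner above, a slightly more defensive variant might look like the sketch below. It assumes the callback receives a string containing one or more `data: {...}` lines in the OpenAI-style chat delta shape (`choices[1].delta.content`), and it uses JSON3 directly rather than `OpenAI.JSON3`. It still cannot recover a JSON payload that is split across two callback invocations, so it is a stopgap rather than a fix.

```julia
using JSON3

# Extract the concatenated delta text from one raw streaming chunk.
# Assumption: lines look like `data: {...}`; anything else is skipped.
function get_delta(chunk::AbstractString)
    io = IOBuffer()
    for line in split(chunk, '\n')
        line = strip(line)
        startswith(line, "data:") || continue      # skip comments, event names, blanks
        payload = strip(line[6:end])               # "data:" is 5 characters long
        (isempty(payload) || payload == "[DONE]") && continue
        obj = try
            JSON3.read(String(payload))
        catch
            continue                               # incomplete or invalid JSON: skip it
        end
        obj isa JSON3.Object || continue
        choices = get(obj, :choices, nothing)
        (choices === nothing || isempty(choices)) && continue
        delta = get(choices[1], :delta, nothing)
        delta === nothing && continue
        content = get(delta, :content, nothing)    # may be missing or null
        content isa AbstractString && print(io, content)
    end
    return String(take!(io))
end

# Same usage as before:
streamcallback = x -> print(get_delta(x))
```

Compared with the one-liner, this skips `[DONE]`, role-only deltas, and null `content` instead of throwing, and it handles several `data:` lines arriving in one chunk.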
StreamCallbacks.jl integrates into PromptingTools.jl so conveniently, and it would be nice to have something similar here, or at least to make the current streamcallback usage more robust and less annoying to deal with. I tested the branch from the Devin attempt to integrate it here, and it doesn't seem to have worked.