[Feature] Support `response_format` for `TurboMind`

Open · h4n0 opened this issue on Nov 13, 2024 · 4 comments

Motivation

I'm using the TurboMind engine and got an error when requesting `response_format` with a `json_schema`. The check that raises the error is here: https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/serve/openai/api_server.py#L367

Is there any plan to support this for TurboMind?

Related resources

# api_server.py: requests with a non-text response_format are rejected
# unless the serving backend is PyTorch.
if request.response_format and request.response_format.type != 'text':
    if VariableInterface.async_engine.backend != 'pytorch':
        return create_error_response(
            HTTPStatus.BAD_REQUEST,
            'only pytorch backend can use response_format now')
    response_format = request.response_format.model_dump()
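
For reference, here is a minimal sketch of the kind of request that hits this check, assuming an lmdeploy `api_server` running on the default port 23333. The model name and schema are illustrative placeholders, and the `response_format` layout follows the OpenAI structured-output convention:

import requests

payload = {
    "model": "internlm2-chat-7b",  # placeholder model name
    "messages": [
        {"role": "user", "content": "Give me a user profile as JSON."}
    ],
    # OpenAI-style structured-output request; with the TurboMind backend
    # this currently returns HTTP 400 per the check shown above.
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "user_profile",  # illustrative schema
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
            },
        },
    },
}

resp = requests.post("http://localhost:23333/v1/chat/completions", json=payload)
print(resp.status_code, resp.json())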


h4n0 · Nov 13, 2024

Yes. We'll support it in December. Stay tuned.

lvhan028 · Nov 21, 2024

Recently, we have been busy addressing the needs of our internal team. This feature won't be tackled until we finish that work. Sorry about that.

lvhan028 · Nov 26, 2024

Hey, any updates on this issue?

Johnno1011 · May 13, 2025

Still waiting for updates on this issue, thanks.

oldnetdog · Jul 26, 2025