MLflow output codec doesn't follow the model's metadata
I'm trying to execute the following MLflow model with MLServer:
from typing import Any

from mlflow.pyfunc import PythonModel, PythonModelContext


class MyModel(PythonModel):
    def predict(
        self,
        context: PythonModelContext | None,
        model_input: list[dict[str, list[float]]],
        params: dict[str, Any] | None = None,
    ) -> list[dict[str, list[float]]]:
        return [{"output": [y * 2.0 for y in x["input"]]} for x in model_input]
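For context, the model is logged roughly like this (an illustrative sketch only; the run setup and artifact path are placeholders):

import mlflow

# Illustrative only: log the model so MLflow can infer a signature
# from the type hints on predict() (artifact path is a placeholder).
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="my_model",
        python_model=MyModel(),
    )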
Based on the Python type hints, MLflow generates the following signature:
signature:
  inputs: '[{"type": "map", "values": {"type": "array", "items": {"type": "double"}},
    "required": true}]'
  outputs: '[{"type": "map", "values": {"type": "array", "items": {"type": "double"}},
    "required": true}]'
  params: null
MLServer then converts this signature into the following model metadata:
{
  "name": "my_model",
  "versions": [],
  "platform": "",
  "inputs": [
    {
      "name": "input-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      }
    }
  ],
  "outputs": [
    {
      "name": "output-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      }
    }
  ],
  "parameters": {
    "content_type": "pd"
  }
}
This seems correct according to #2080.
When I try to perform an inference with the following request body:
{
  "inputs": [
    {
      "name": "input-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      },
      "data": ["{\"input\": [1.2, 2.3, 3.4]}"]
    }
  ]
}
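(For reference, I send the request to MLServer's V2 REST inference endpoint; the host, port, and model name below are illustrative.)

import json

import requests

# Illustrative call to the V2 inference endpoint (host/port/model name assumed).
payload = {
    "inputs": [
        {
            "name": "input-0",
            "datatype": "BYTES",
            "shape": [-1, 1],
            "parameters": {"content_type": "pd_json"},
            "data": [json.dumps({"input": [1.2, 2.3, 3.4]})],
        }
    ]
}

response = requests.post(
    "http://localhost:8080/v2/models/my_model/infer",
    json=payload,
)
print(response.status_code, response.text)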
I get the following error:
Traceback (most recent call last):
File "/app/src/ai_serve/dataplane.py", line 67, in infer
result = await super().infer(payload, name, version)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver/handlers/dataplane.py", line 112, in infer
prediction = await model.predict(payload)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver_mlflow/runtime.py", line 203, in predict
return self.encode_response(model_output, default_codec=TensorDictCodec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver/model.py", line 227, in encode_response
return default_codec.encode_response(self.name, payload, self.version)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver_mlflow/codecs.py", line 45, in encode_response
for name, value in payload.items()
^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'items'
After some investigation: the request is decoded correctly and the predict() function is called successfully. However, encoding the response fails because MLServer tries to encode it with TensorDictCodec, which is not the right codec here; it seems to fall back to the runtime's default codec instead of the codec declared in the model metadata. I think this is related to the following comment:
https://github.com/SeldonIO/MLServer/blob/1d1f3ee42f96744d809aca941ed2925347d198e9/mlserver/codecs/utils.py#L90-L98
Is my assumption correct? If so, how can we use the model's metadata to choose the right codec?
Hi, @JRial95! MLServer tries to encode your output using one of the registered codecs, but can't find one that matches. It therefore falls back to TensorDictCodec, which is the default for the MLflow runtime, and that fails because TensorDictCodec expects the output to be a Dict[str, np.ndarray], while your model returns a list of dicts.
There are two options for your case.
Option 1:
import pandas as pd

return pd.DataFrame.from_dict({
    "output": [[y * 2.0 for y in x["input"]] for x in model_input]
})
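Returning a DataFrame lets MLServer find a registered codec that can encode the response (the Pandas one), instead of falling back to TensorDictCodec.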
Option 2:
import numpy as np

return {
    "output": np.array([[y * 2.0 for y in x["input"]] for x in model_input])
}
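A dict of NumPy arrays is exactly what TensorDictCodec expects, so the default codec can then encode the response.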
If neither of the above works for you, you can try creating a custom runtime; a rough sketch follows.
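A custom runtime gives you full control over how the output is encoded. Below is a minimal sketch, assuming MLServer's MLModel, StringCodec, and get_model_uri APIs; the class name, output name, and JSON round-tripping are illustrative and untested against your model:

import json

import mlflow
from mlserver import MLModel
from mlserver.codecs import StringCodec
from mlserver.types import InferenceRequest, InferenceResponse
from mlserver.utils import get_model_uri


class MyMLflowRuntime(MLModel):
    """Illustrative custom runtime: decodes the JSON string input, calls the
    pyfunc model, and encodes the list-of-dicts output back as JSON strings."""

    async def load(self) -> bool:
        model_uri = await get_model_uri(self._settings)
        self._model = mlflow.pyfunc.load_model(model_uri)
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Decode the BYTES input back into the list of dicts the model expects.
        raw = StringCodec.decode_input(payload.inputs[0])
        model_input = [json.loads(item) for item in raw]

        model_output = self._model.predict(model_input)

        # Encode each output dict as a JSON string, bypassing TensorDictCodec.
        encoded = [json.dumps(item) for item in model_output]
        return InferenceResponse(
            model_name=self.name,
            outputs=[StringCodec.encode_output(name="output-0", payload=encoded)],
        )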