MLflow output codec doesn't follow the model's metadata
I'm trying to execute the following MLflow model with MLServer:
from typing import Any

from mlflow.pyfunc import PythonModel, PythonModelContext


class MyModel(PythonModel):
    def predict(
        self,
        context: PythonModelContext | None,
        model_input: list[dict[str, list[float]]],
        params: dict[str, Any] | None = None,
    ) -> list[dict[str, list[float]]]:
        return [{"output": [y * 2.0 for y in x["input"]]} for x in model_input]
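For context, the model is logged roughly like this (an illustrative sketch only; the run setup and artifact path are placeholders):

import mlflow

# Illustrative only: log the model so MLflow can infer a signature
# from the type hints on predict() (artifact path is a placeholder).
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="my_model",
        python_model=MyModel(),
    )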
Based on the Python type hints, MLflow generates the following signature:
signature:
  inputs: '[{"type": "map", "values": {"type": "array", "items": {"type": "double"}},
    "required": true}]'
  outputs: '[{"type": "map", "values": {"type": "array", "items": {"type": "double"}},
    "required": true}]'
  params: null
MLServer then converts this signature into the following model metadata:
{
  "name": "my_model",
  "versions": [],
  "platform": "",
  "inputs": [
    {
      "name": "input-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      }
    }
  ],
  "outputs": [
    {
      "name": "output-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      }
    }
  ],
  "parameters": {
    "content_type": "pd"
  }
}
This seems correct according to #2080.
When I try to perform an inference with the following request body:
{
  "inputs": [
    {
      "name": "input-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      },
      "data": ["{\"input\": [1.2, 2.3, 3.4]}"]
    }
  ]
}
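(For reference, I send the request to MLServer's V2 REST inference endpoint; the host, port, and model name below are illustrative.)

import json

import requests

# Illustrative call to the V2 inference endpoint (host/port/model name assumed).
payload = {
    "inputs": [
        {
            "name": "input-0",
            "datatype": "BYTES",
            "shape": [-1, 1],
            "parameters": {"content_type": "pd_json"},
            "data": [json.dumps({"input": [1.2, 2.3, 3.4]})],
        }
    ]
}

response = requests.post(
    "http://localhost:8080/v2/models/my_model/infer",
    json=payload,
)
print(response.status_code, response.text)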
I get the following error:
Traceback (most recent call last):
File "/app/src/ai_serve/dataplane.py", line 67, in infer
result = await super().infer(payload, name, version)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver/handlers/dataplane.py", line 112, in infer
prediction = await model.predict(payload)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver_mlflow/runtime.py", line 203, in predict
return self.encode_response(model_output, default_codec=TensorDictCodec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver/model.py", line 227, in encode_response
return default_codec.encode_response(self.name, payload, self.version)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver_mlflow/codecs.py", line 45, in encode_response
for name, value in payload.items()
^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'items'
After some investigation: the request is decoded correctly and the predict() function is called successfully. However, encoding the response fails because MLServer tries to encode it with TensorDictCodec, which is not the right codec here; it seems to fall back to the runtime's default codec instead of the codec declared in the model metadata. I think this is related to the following comment:
https://github.com/SeldonIO/MLServer/blob/1d1f3ee42f96744d809aca941ed2925347d198e9/mlserver/codecs/utils.py#L90-L98
Is my assumption correct? If so, how can we use the model's metadata to choose the right codec?
Hi, @JRial95! MLServer tries to encode your output using one of the registered codecs, but can't find one that matches. It therefore falls back to TensorDictCodec, which is the default for the MLflow runtime, and that fails because TensorDictCodec expects the output to be a Dict[str, np.ndarray], while your model returns a list of dicts.
There are two options for your case.
Option 1:
import pandas as pd

return pd.DataFrame.from_dict({
    "output": [[y * 2.0 for y in x["input"]] for x in model_input]
})
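Returning a DataFrame lets MLServer find a registered codec that can encode the response (the Pandas one), instead of falling back to TensorDictCodec.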
Option 2:
import numpy as np

return {
    "output": np.array([[y * 2.0 for y in x["input"]] for x in model_input])
}
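A dict of NumPy arrays is exactly what TensorDictCodec expects, so the default codec can then encode the response.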
If neither of the above works for you, you can try creating a custom runtime; a rough sketch follows.
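A custom runtime gives you full control over how the output is encoded. Below is a minimal sketch, assuming MLServer's MLModel, StringCodec, and get_model_uri APIs; the class name, output name, and JSON round-tripping are illustrative and untested against your model:

import json

import mlflow
from mlserver import MLModel
from mlserver.codecs import StringCodec
from mlserver.types import InferenceRequest, InferenceResponse
from mlserver.utils import get_model_uri


class MyMLflowRuntime(MLModel):
    """Illustrative custom runtime: decodes the JSON string input, calls the
    pyfunc model, and encodes the list-of-dicts output back as JSON strings."""

    async def load(self) -> bool:
        model_uri = await get_model_uri(self._settings)
        self._model = mlflow.pyfunc.load_model(model_uri)
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Decode the BYTES input back into the list of dicts the model expects.
        raw = StringCodec.decode_input(payload.inputs[0])
        model_input = [json.loads(item) for item in raw]

        model_output = self._model.predict(model_input)

        # Encode each output dict as a JSON string, bypassing TensorDictCodec.
        encoded = [json.dumps(item) for item in model_output]
        return InferenceResponse(
            model_name=self.name,
            outputs=[StringCodec.encode_output(name="output-0", payload=encoded)],
        )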