Pipeline failing with a 503 error when using `-1` dimensions in shape parameter
Describe the bug
I have a simple model taking one input with datatype "BYTES". I've implemented this as an MLServer custom runtime, deployed it, and can successfully call it.
Wrapping the model in a pipeline and deploying the pipeline works, but attempting to call it with the same request results in a 503 error and the following stacktrace in the pipeline gateway:
```
time="2023-06-15T14:27:20Z" level=debug msg="Seldon model header bytes-chain.pipeline and seldon internal model header [bytes-chain.pipeline]" func=inferModel source=GatewayHttpServer
2023/06/15 14:27:20 http: panic serving 172.19.0.3:37406: runtime error: makeslice: len out of range
goroutine 14410 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1850 +0xbf
panic({0x1ff9ae0, 0x25bf5e0})
	/usr/local/go/src/runtime/panic.go:890 +0x262
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End.func1()
	/go/pkg/mod/go.opentelemetry.io/otel/[email protected]/trace/span.go:359 +0x2a
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End(0xc000158000, {0x0, 0x0, 0x14?})
	/go/pkg/mod/go.opentelemetry.io/otel/[email protected]/trace/span.go:398 +0x8ee
panic({0x1ff9ae0, 0x25bf5e0})
	/usr/local/go/src/runtime/panic.go:884 +0x212
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.convertTensors(0xc002016540)
	/build/scheduler/pkg/kafka/pipeline/v2.go:413 +0x198
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.convertToInferenceRequest({0xc00052c000, 0xe4, 0x200})
	/build/scheduler/pkg/kafka/pipeline/v2.go:433 +0xb8
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.convertRequestToV2({0xc00052c000?, 0x25c0020?, 0x0?}, {0x0, 0x0}, {0x0, 0x0})
	/build/scheduler/pkg/kafka/pipeline/v2.go:197 +0x32
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.ConvertRequestToV2Bytes({0xc00052c000?, 0xc001fce140?, 0x4?}, {0x0?, 0x25bda60?}, {0x0?, 0x10?})
	/build/scheduler/pkg/kafka/pipeline/v2.go:189 +0x28
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.(*GatewayHttpServer).infer(0xc0005995c0, {0x25e8320, 0xc002016480}, 0xc0005acc00, {0xc0001c8180, 0xb}, 0x0?)
	/build/scheduler/pkg/kafka/pipeline/httpserver.go:151 +0xfd
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.(*GatewayHttpServer).inferModel(0xc0005995c0, {0x25e8320, 0xc002016480}, 0xc0021ba080?)
	/build/scheduler/pkg/kafka/pipeline/httpserver.go:216 +0x1b5
net/http.HandlerFunc.ServeHTTP(0x25e74b0?, {0x25e8320?, 0xc002016480?}, 0x232dddf?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/otelmux.traceware.ServeHTTP({{0x2320864, 0xf}, {0x25c8ec0, 0xc0003ae000}, {0x25e79c0, 0xc0004c1b78}, {0x25d2760, 0xc00034a360}}, {0x25e74b0, 0xc000248380}, ...)
	/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/[email protected]/mux.go:145 +0x593
github.com/gorilla/mux.CORSMethodMiddleware.func1.1({0x25e74b0, 0xc000248380}, 0xc00038d230?)
	/go/pkg/mod/github.com/gorilla/[email protected]/middleware.go:51 +0xaa
net/http.HandlerFunc.ServeHTTP(0xc0005ac300?, {0x25e74b0?, 0xc000248380?}, 0xc0005b99e0?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000348300, {0x25e74b0, 0xc000248380}, 0xc0005ac000)
	/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210 +0x1cf
net/http.serverHandler.ServeHTTP({0x25d9dc0?}, {0x25e74b0, 0xc000248380}, 0xc0005ac000)
	/usr/local/go/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc0001441e0, {0x25e8cc8, 0xc00040a510})
	/usr/local/go/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:3102 +0x4db
```
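The `makeslice: len out of range` panic in `convertTensors` suggests the gateway derives a slice length from the product of the shape dimensions; with a `-1` dynamic dimension that product is negative, which is an illegal slice length in Go. A minimal Python sketch of that arithmetic (illustrative only; the actual gateway code is Go and the function name here is hypothetical):

```python
from functools import reduce
from operator import mul


def element_count(shape):
    """Naive element count: the product of all dimensions."""
    return reduce(mul, shape, 1)


# A concrete shape yields a valid, non-negative length...
print(element_count([2, 1]))   # 2

# ...but a dynamic -1 dimension makes the product negative, which
# corresponds to Go's `makeslice: len out of range` panic when the
# value is used as a slice length.
print(element_count([-1, 1]))  # -1
```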
To reproduce
Create the following files:
```python
# bytes_model.py
from typing import List

from mlserver import MLModel
from mlserver.codecs import decode_args


class BYTESModel(MLModel):
    async def load(self) -> bool:
        return True

    @decode_args
    async def predict(self, text: List[str]) -> List[str]:
        # just pass through
        return text
```
model-settings.json:
```json
{
  "name": "bytes-model",
  "implementation": "bytes_model.BYTESModel"
}
```
```yaml
# bytes-model.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: bytes-model
spec:
  storageUri: "/mnt/models/bytes-model"
  requirements:
    - mlserver
    - python
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: bytes-pipeline
spec:
  steps:
    - name: bytes-model
  output:
    steps:
      - bytes-model
```
bytes-request-rest.json:
```json
{
  "inputs": [
    {
      "name": "text",
      "shape": [-1, 1],
      "datatype": "BYTES",
      "parameters": {
        "content_type": "str"
      },
      "data": [
        "Hello",
        "world"
      ]
    }
  ]
}
```
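Note that the `-1` in this request is internally consistent: treating `-1` as a wildcard, the remaining dimensions are compatible with the two data elements. A hedged Python sketch of such a compatibility check (illustrative only, not Seldon's actual validation logic):

```python
def shape_matches(shape, n_elements):
    """Check whether a shape with at most one -1 wildcard can hold n_elements."""
    if shape.count(-1) > 1:
        return False  # more than one dynamic dimension is ambiguous
    fixed = 1
    for dim in shape:
        if dim != -1:
            fixed *= dim
    if -1 in shape:
        # the wildcard absorbs any whole multiple of the fixed dimensions
        return fixed > 0 and n_elements % fixed == 0
    return fixed == n_elements


print(shape_matches([-1, 1], 2))  # True: -1 resolves to 2
print(shape_matches([2, 1], 2))   # True: fully concrete shape
print(shape_matches([3, 1], 2))   # False: cannot hold 2 elements
```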
Deploy the model and the pipeline. Calling the model directly works as expected:
```
seldon model infer bytes-model "$(cat bytes-request-rest.json)" | jq
```

```json
{
  "model_name": "bytes-model_1",
  "model_version": "1",
  "id": "8038de41-2d6e-4f3e-9135-af69ba83f58c",
  "parameters": {},
  "outputs": [
    {
      "name": "output-0",
      "shape": [
        2,
        1
      ],
      "datatype": "BYTES",
      "parameters": {
        "content_type": "str"
      },
      "data": [
        "Hello",
        "world"
      ]
    }
  ]
}
```
But calling the pipeline fails:
```
seldon pipeline infer bytes-pipeline "$(cat bytes-request-rest.json)" | jq
```

```
Error: V2 server error: 503 upstream connect error or disconnect/reset before headers. reset reason: connection termination
```
Expected behaviour
The call to the pipeline succeeds and gives the same output as calling the model directly.
Environment
Local Seldon Core v2 (scv2) deployment with Docker Compose, running on the latest v2 branch.
Model Details
See above.
I think it's a bug in the Seldon client, as I was able to successfully send a REST request by other means to the pipeline endpoint http://0.0.0.0:9000/v2/pipelines/bytes-chain/infer.
EDIT: this worked because I was inadvertently using a concrete shape [2, 1] instead of the dynamic shape [-1, 1]. See next message for more context.
UPDATE: I found out that the culprit was the -1 in the shape parameter of the request. This works when calling the model endpoint but fails when calling the pipeline endpoint. Modifying the -1 to a concrete value (here 2) solves the problem.
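Until the gateway handles dynamic dimensions, the workaround above can be automated on the client side by rewriting the `-1` to the concrete batch size before sending the request. A minimal sketch (the payload structure follows the request JSON above; `concretize_shapes` is a hypothetical helper, not part of the Seldon CLI):

```python
import json


def concretize_shapes(request):
    """Replace a single -1 dimension in each input's shape with the
    value inferred from the length of that input's data."""
    for inp in request.get("inputs", []):
        shape = inp["shape"]
        if shape.count(-1) != 1:
            continue  # nothing to infer, or ambiguous
        fixed = 1
        for dim in shape:
            if dim != -1:
                fixed *= dim
        inferred = len(inp["data"]) // fixed
        inp["shape"] = [inferred if dim == -1 else dim for dim in shape]
    return request


request = json.loads("""
{
  "inputs": [
    {"name": "text", "shape": [-1, 1], "datatype": "BYTES",
     "parameters": {"content_type": "str"},
     "data": ["Hello", "world"]}
  ]
}
""")

print(concretize_shapes(request)["inputs"][0]["shape"])  # [2, 1]
```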
I'm not sure whether this should be considered a bug, but I would expect the pipeline endpoint to be fully compatible with the Open Inference Protocol (OIP)?
Having the same issue
Having the same issue also