Pipeline failing with a 503 error when using `-1` dimensions in shape parameter
Describe the bug
I have a simple model taking one input with datatype "BYTES". I've implemented this as an MLServer custom runtime, deployed it, and can successfully call it.
Wrapping the model in a pipeline and deploying the pipeline works, but attempting to call it with the same request results in a 503 error and the following stacktrace in the pipeline gateway:
```
time="2023-06-15T14:27:20Z" level=debug msg="Seldon model header bytes-chain.pipeline and seldon internal model header [bytes-chain.pipeline]" func=inferModel source=GatewayHttpServer
2023/06/15 14:27:20 http: panic serving 172.19.0.3:37406: runtime error: makeslice: len out of range
goroutine 14410 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1850 +0xbf
panic({0x1ff9ae0, 0x25bf5e0})
	/usr/local/go/src/runtime/panic.go:890 +0x262
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End.func1()
	/go/pkg/mod/go.opentelemetry.io/otel/[email protected]/trace/span.go:359 +0x2a
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End(0xc000158000, {0x0, 0x0, 0x14?})
	/go/pkg/mod/go.opentelemetry.io/otel/[email protected]/trace/span.go:398 +0x8ee
panic({0x1ff9ae0, 0x25bf5e0})
	/usr/local/go/src/runtime/panic.go:884 +0x212
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.convertTensors(0xc002016540)
	/build/scheduler/pkg/kafka/pipeline/v2.go:413 +0x198
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.convertToInferenceRequest({0xc00052c000, 0xe4, 0x200})
	/build/scheduler/pkg/kafka/pipeline/v2.go:433 +0xb8
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.convertRequestToV2({0xc00052c000?, 0x25c0020?, 0x0?}, {0x0, 0x0}, {0x0, 0x0})
	/build/scheduler/pkg/kafka/pipeline/v2.go:197 +0x32
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.ConvertRequestToV2Bytes({0xc00052c000?, 0xc001fce140?, 0x4?}, {0x0?, 0x25bda60?}, {0x0?, 0x10?})
	/build/scheduler/pkg/kafka/pipeline/v2.go:189 +0x28
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.(*GatewayHttpServer).infer(0xc0005995c0, {0x25e8320, 0xc002016480}, 0xc0005acc00, {0xc0001c8180, 0xb}, 0x0?)
	/build/scheduler/pkg/kafka/pipeline/httpserver.go:151 +0xfd
github.com/seldonio/seldon-core/scheduler/v2/pkg/kafka/pipeline.(*GatewayHttpServer).inferModel(0xc0005995c0, {0x25e8320, 0xc002016480}, 0xc0021ba080?)
	/build/scheduler/pkg/kafka/pipeline/httpserver.go:216 +0x1b5
net/http.HandlerFunc.ServeHTTP(0x25e74b0?, {0x25e8320?, 0xc002016480?}, 0x232dddf?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/otelmux.traceware.ServeHTTP({{0x2320864, 0xf}, {0x25c8ec0, 0xc0003ae000}, {0x25e79c0, 0xc0004c1b78}, {0x25d2760, 0xc00034a360}}, {0x25e74b0, 0xc000248380}, ...)
	/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/[email protected]/mux.go:145 +0x593
github.com/gorilla/mux.CORSMethodMiddleware.func1.1({0x25e74b0, 0xc000248380}, 0xc00038d230?)
	/go/pkg/mod/github.com/gorilla/[email protected]/middleware.go:51 +0xaa
net/http.HandlerFunc.ServeHTTP(0xc0005ac300?, {0x25e74b0?, 0xc000248380?}, 0xc0005b99e0?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000348300, {0x25e74b0, 0xc000248380}, 0xc0005ac000)
	/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210 +0x1cf
net/http.serverHandler.ServeHTTP({0x25d9dc0?}, {0x25e74b0, 0xc000248380}, 0xc0005ac000)
	/usr/local/go/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc0001441e0, {0x25e8cc8, 0xc00040a510})
	/usr/local/go/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:3102 +0x4db
```
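The `makeslice: len out of range` panic in `convertTensors` suggests the gateway derives a slice length from the product of the shape dimensions; with a `-1` dynamic dimension that product is negative, which is an illegal slice length in Go. A minimal Python sketch of that arithmetic (illustrative only; the actual gateway code is Go and the function name here is hypothetical):

```python
from functools import reduce
from operator import mul


def element_count(shape):
    """Naive element count: the product of all dimensions."""
    return reduce(mul, shape, 1)


# A concrete shape yields a valid, non-negative length...
print(element_count([2, 1]))   # 2

# ...but a dynamic -1 dimension makes the product negative, which
# corresponds to Go's `makeslice: len out of range` panic when the
# value is used as a slice length.
print(element_count([-1, 1]))  # -1
```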
To reproduce
Create the following files:
```python
# bytes_model.py
from typing import List

from mlserver import MLModel
from mlserver.codecs import decode_args


class BYTESModel(MLModel):
    async def load(self) -> bool:
        return True

    @decode_args
    async def predict(self, text: List[str]) -> List[str]:
        # just pass through
        return text
```
model-settings.json:
```json
{
  "name": "bytes-model",
  "implementation": "bytes_model.BYTESModel"
}
```
```yaml
# bytes-model.yaml
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: bytes-model
spec:
  storageUri: "/mnt/models/bytes-model"
  requirements:
    - mlserver
    - python
---
apiVersion: mlops.seldon.io/v1alpha1
kind: Pipeline
metadata:
  name: bytes-pipeline
spec:
  steps:
    - name: bytes-model
  output:
    steps:
      - bytes-model
```
bytes-request-rest.json:
```json
{
  "inputs": [
    {
      "name": "text",
      "shape": [-1, 1],
      "datatype": "BYTES",
      "parameters": {
        "content_type": "str"
      },
      "data": [
        "Hello",
        "world"
      ]
    }
  ]
}
```
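Note that the `-1` in this request is internally consistent: treating `-1` as a wildcard, the remaining dimensions are compatible with the two data elements. A hedged Python sketch of such a compatibility check (illustrative only, not Seldon's actual validation logic):

```python
def shape_matches(shape, n_elements):
    """Check whether a shape with at most one -1 wildcard can hold n_elements."""
    if shape.count(-1) > 1:
        return False  # more than one dynamic dimension is ambiguous
    fixed = 1
    for dim in shape:
        if dim != -1:
            fixed *= dim
    if -1 in shape:
        # the wildcard absorbs any whole multiple of the fixed dimensions
        return fixed > 0 and n_elements % fixed == 0
    return fixed == n_elements


print(shape_matches([-1, 1], 2))  # True: -1 resolves to 2
print(shape_matches([2, 1], 2))   # True: fully concrete shape
print(shape_matches([3, 1], 2))   # False: cannot hold 2 elements
```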
Deploy the model and the pipeline. Calling the model directly works as expected:
```
seldon model infer bytes-model "$(cat bytes-request-rest.json)" | jq
```

```json
{
  "model_name": "bytes-model_1",
  "model_version": "1",
  "id": "8038de41-2d6e-4f3e-9135-af69ba83f58c",
  "parameters": {},
  "outputs": [
    {
      "name": "output-0",
      "shape": [
        2,
        1
      ],
      "datatype": "BYTES",
      "parameters": {
        "content_type": "str"
      },
      "data": [
        "Hello",
        "world"
      ]
    }
  ]
}
```
But calling the pipeline fails:
```
seldon pipeline infer bytes-pipeline "$(cat bytes-request-rest.json)" | jq
```

```
Error: V2 server error: 503 upstream connect error or disconnect/reset before headers. reset reason: connection termination
```
Expected behaviour
The call to the pipeline succeeds and gives the same output as calling the model directly.
Environment
Local Seldon Core v2 (scv2) deployment with Docker Compose, running on the latest v2 branch.
Model Details
See above.
I think it's a bug in the Seldon client, as I was able to successfully send a REST request by other means to the pipeline endpoint http://0.0.0.0:9000/v2/pipelines/bytes-chain/infer.
EDIT: this worked because I was inadvertently using a concrete shape [2, 1] instead of the dynamic shape [-1, 1]. See next message for more context.
UPDATE: I found out that the culprit was the -1 in the shape parameter of the request. This works when calling the model endpoint but fails when calling the pipeline endpoint. Modifying the -1 to a concrete value (here 2) solves the problem.
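Until the gateway handles dynamic dimensions, the workaround above can be automated on the client side by rewriting the `-1` to the concrete batch size before sending the request. A minimal sketch (the payload structure follows the request JSON above; `concretize_shapes` is a hypothetical helper, not part of the Seldon CLI):

```python
import json


def concretize_shapes(request):
    """Replace a single -1 dimension in each input's shape with the
    value inferred from the length of that input's data."""
    for inp in request.get("inputs", []):
        shape = inp["shape"]
        if shape.count(-1) != 1:
            continue  # nothing to infer, or ambiguous
        fixed = 1
        for dim in shape:
            if dim != -1:
                fixed *= dim
        inferred = len(inp["data"]) // fixed
        inp["shape"] = [inferred if dim == -1 else dim for dim in shape]
    return request


request = json.loads("""
{
  "inputs": [
    {"name": "text", "shape": [-1, 1], "datatype": "BYTES",
     "parameters": {"content_type": "str"},
     "data": ["Hello", "world"]}
  ]
}
""")

print(concretize_shapes(request)["inputs"][0]["shape"])  # [2, 1]
```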
I'm not sure whether this should be considered a bug, but I would expect the pipeline endpoint to be fully compatible with the Open Inference Protocol (OIP)?
Having the same issue
Having the same issue also