`SecretDiscoveryServiceServer`: `StreamSecrets` issues
Issues
I've noticed a couple of things.
- When we implement SecretDiscoveryServiceServer from "github.com/envoyproxy/go-control-plane/envoy/service/secret/v3", sometimes StreamSecrets does not get called back at all.
- When StreamSecrets is called, it is called endlessly with a new stream each time; see the log output further down in this message.
When StreamSecrets is not called, our dynamically supplied secrets are listed under dynamic_warming_secrets instead of dynamic_active_secrets in /config_dump.
Code
Our code is public, so here are the relevant definitions:
- StreamSecrets: https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/sds/sds.go#L72
- Where we register the SecretDiscoveryServiceServer: https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/manager/envoy_config_manager.go#L183
- Configuring the cache etc.: https://github.com/kubeshop/kusk-gateway/blob/mbana-oauth-issue-401-sds/internal/envoy/manager/envoy_config_manager.go#L56
Log of StreamSecrets being called multiple times (the stream=&{0xc000f19c70} is different each time):
2022-08-12T09:19:30Z | sds.go:74: SecretDiscoveryServiceServer.StreamSecrets: exiting method
2022-08-12T09:19:30Z | sds.go:92:
2022-08-12T09:19:30Z | sds.go:93: SecretDiscoveryServiceServer.StreamSecrets: calling stream.Recv - stream=&{0xc000f19380}, len(s.ClientSecrets)=2
2022-08-12T09:19:30Z | sds.go:116: SecretDiscoveryServiceServer.StreamSecrets: request.TypeUrl=type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret, len(s.ClientSecrets)=2
2022-08-12T09:19:30Z | sds.go:153: SecretDiscoveryServiceServer.StreamSecrets: stream.Send(response) sent - responses=[<*>version_info:"2022-08-12T09:19:30Z" resources:{[type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret]:{name:"hmac_secret" generic_secret:{secret:{inline_bytes:"f9eckuGEcUNxAqKT0uK8OyM2Se01ukVLPHsiSoTh2X8="}}}} type_url:"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret" <*>version_info:"2022-08-12T09:19:30Z" resources:{[type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret]:{name:"client_secret" generic_secret:{secret:{inline_string:"Z6MX7NreJumWLmf6unsQ5uiEUrTBxfNtqG9Vy5Kjktnvfj-_fRCBO9EU1mL1YzAJ"}}}} type_url:"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"]
2022-08-12T09:19:30Z | sds.go:154:
2022-08-12T09:19:30Z | sds.go:74: SecretDiscoveryServiceServer.StreamSecrets: exiting method
2022-08-12T09:19:30Z | sds.go:92:
2022-08-12T09:19:30Z | sds.go:93: SecretDiscoveryServiceServer.StreamSecrets: calling stream.Recv - stream=&{0xc000f19c70}, len(s.ClientSecrets)=2
2022-08-12T09:19:30Z | sds.go:116: SecretDiscoveryServiceServer.StreamSecrets: request.TypeUrl=type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret, len(s.ClientSecrets)=2
2022-08-12T09:19:30Z | sds.go:153: SecretDiscoveryServiceServer.StreamSecrets: stream.Send(response) sent - responses=[<*>version_info:"2022-08-12T09:19:30Z" resources:{[type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret]:{name:"client_secret" generic_secret:{secret:{inline_string:"Z6MX7NreJumWLmf6unsQ5uiEUrTBxfNtqG9Vy5Kjktnvfj-_fRCBO9EU1mL1YzAJ"}}}} type_url:"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret" <*>version_info:"2022-08-12T09:19:30Z" resources:{[type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret]:{name:"hmac_secret" generic_secret:{secret:{inline_bytes:"f9eckuGEcUNxAqKT0uK8OyM2Se01ukVLPHsiSoTh2X8="}}}} type_url:"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"]
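One thing that might be worth ruling out, offered as an assumption rather than a diagnosis: in the log, an "exiting method" line precedes every new call. With gRPC bidirectional streaming, returning from the handler ends the RPC, so a handler that returns after a single Send would leave the client to open a fresh stream. The sketch below shows the alternative shape, a handler that stays in a Recv loop until the client closes the stream. All types here (request, response, secretStream, fakeStream) are hypothetical stand-ins for the generated go-control-plane types, and the ACK check is deliberately simplified (real xDS also uses response_nonce and error_detail).

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// request and response are hypothetical stand-ins for the xDS
// DiscoveryRequest and DiscoveryResponse messages.
type request struct{ TypeURL, VersionInfo string }
type response struct{ TypeURL, VersionInfo string }

// secretStream is a hypothetical stand-in for the generated
// bidirectional stream type passed to StreamSecrets.
type secretStream interface {
	Recv() (*request, error)
	Send(*response) error
}

// serveSecrets keeps a single stream open: it answers the initial
// request, treats a request that echoes the current version as an ACK,
// and only returns when the client closes the stream or an error occurs.
func serveSecrets(stream secretStream, version string) error {
	for {
		req, err := stream.Recv()
		if errors.Is(err, io.EOF) {
			return nil // client closed the stream
		}
		if err != nil {
			return err
		}
		if req.VersionInfo == version {
			continue // simplified ACK: nothing new to push
		}
		if err := stream.Send(&response{TypeURL: req.TypeURL, VersionInfo: version}); err != nil {
			return err
		}
	}
}

// fakeStream replays a fixed list of requests and records what was sent.
type fakeStream struct {
	in   []*request
	sent []*response
}

func (f *fakeStream) Recv() (*request, error) {
	if len(f.in) == 0 {
		return nil, io.EOF
	}
	r := f.in[0]
	f.in = f.in[1:]
	return r, nil
}

func (f *fakeStream) Send(r *response) error {
	f.sent = append(f.sent, r)
	return nil
}

func main() {
	const typeURL = "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret"
	s := &fakeStream{in: []*request{
		{TypeURL: typeURL},                    // initial request, no version yet
		{TypeURL: typeURL, VersionInfo: "v1"}, // ACK of version v1
	}}
	if err := serveSecrets(s, "v1"); err != nil {
		panic(err)
	}
	fmt.Println("responses sent:", len(s.sent))
}
```

In this shape the initial request and the ACK are handled on the same stream, so only one response is sent for the two requests.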
What we expect to see in /config_dump when StreamSecrets is being called:
{
"@type": "type.googleapis.com/envoy.admin.v3.SecretsConfigDump",
"dynamic_active_secrets": [
{
"name": "client_secret",
"version_info": "2022-08-12T09:22:18Z",
"last_updated": "2022-08-12T09:22:18.140Z",
"secret": {
"@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret",
"name": "client_secret",
"generic_secret": {
"secret": {
"inline_string": "[redacted]"
}
}
}
},
{
"name": "hmac_secret",
"version_info": "2022-08-12T09:22:18Z",
"last_updated": "2022-08-12T09:22:18.386Z",
"secret": {
"@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret",
"name": "hmac_secret",
"generic_secret": {
"secret": {
"inline_bytes": "W3JlZGFjdGVkXQ=="
}
}
}
}
]
}
Don't worry about the exposed secrets; they'll be removed soon.
Logs or Debug Information
In addition to that, here are dumps taken from the admin endpoint (/config_dump) of when it is not working and when it is, i.e., when the StreamSecrets gRPC method is not invoked on the go-control-plane and when it is. Notice how, in the broken dump, the secrets are listed under dynamic_warming_secrets.
- config_dump-broken.json: https://gist.github.com/mbana/61305292ddb9fd83e260a0125893f6ca
- logs-broken.log: https://gist.github.com/mbana/a4636cf5e96a035db618f7c37b8dc275
- config_dump-working.json: https://gist.github.com/mbana/e9b7b29ed7c1be032aca867304d86d60
- logs-working.log: https://gist.github.com/mbana/2881bc1873090d8d82649ba95625d789
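To compare the broken and working dumps without reading them by eye, the warming/active split can be checked with a small helper. This is a minimal sketch: it assumes the full /config_dump output wraps its sections in a top-level "configs" array, and the names secretStates, configDump, configEntry and namedSecret, as well as the sample input in main, are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// secretsDumpType is the @type of the secrets section shown in the dump above.
const secretsDumpType = "type.googleapis.com/envoy.admin.v3.SecretsConfigDump"

// namedSecret keeps only the name of each dumped secret.
type namedSecret struct {
	Name string `json:"name"`
}

// configEntry models the fields of one section that matter here.
type configEntry struct {
	Type    string        `json:"@type"`
	Active  []namedSecret `json:"dynamic_active_secrets"`
	Warming []namedSecret `json:"dynamic_warming_secrets"`
}

// configDump assumes a top-level "configs" array (an assumption, see above).
type configDump struct {
	Configs []configEntry `json:"configs"`
}

// secretStates returns the names of the active and the warming secrets
// found in the SecretsConfigDump section of a config dump.
func secretStates(dump []byte) (active, warming []string, err error) {
	var d configDump
	if err = json.Unmarshal(dump, &d); err != nil {
		return nil, nil, err
	}
	for _, c := range d.Configs {
		if c.Type != secretsDumpType {
			continue // skip listeners, clusters, and other sections
		}
		for _, s := range c.Active {
			active = append(active, s.Name)
		}
		for _, s := range c.Warming {
			warming = append(warming, s.Name)
		}
	}
	return active, warming, nil
}

func main() {
	// Hypothetical sample: one secret active, one still warming.
	sample := []byte(`{"configs":[{
		"@type":"type.googleapis.com/envoy.admin.v3.SecretsConfigDump",
		"dynamic_active_secrets":[{"name":"client_secret"}],
		"dynamic_warming_secrets":[{"name":"hmac_secret"}]}]}`)
	active, warming, err := secretStates(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("active:", active, "warming:", warming)
}
```

Any name reported as warming is a secret that Envoy has requested but not yet received.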
If there's anything further I can do to help get to the cause of this issue, please let me know.
Envoy Compiled From 7af7608b022c5e3ae2ad110b4d94aa7506a643d7
Edit: I went a step further and compiled Envoy from source and made a Docker image of it:
Dockerfile
FROM docker.io/ubuntu:22.04
COPY envoy-static /usr/local/bin/envoy
COPY envoy-static /usr/bin/envoy
ENTRYPOINT ["/usr/local/bin/envoy"]
Build Steps
$ git remote -vv
origin git@github.com:envoyproxy/envoy.git (fetch)
origin git@github.com:envoyproxy/envoy.git (push)
$ git rev-parse HEAD
7af7608b022c5e3ae2ad110b4d94aa7506a643d7
$ bazel/setup_clang.sh
$ echo "build --config=clang" >> user.bazelrc
$ echo "build --copt=-fno-limit-debug-info" >> user.bazelrc
$ bazel build --jobs=32 -c fastbuild envoy
$ cp bazel-bin/source/exe/envoy-static .
$ docker build --tag ttl.sh/kubeshop/envoy:24h --file ./Dockerfile .
$ docker push ttl.sh/kubeshop/envoy:24h
The image is available at ttl.sh/kubeshop/envoy:24h (docker run --rm -it ttl.sh/kubeshop/envoy:24h). Note: This image is only available for ~24 hours from the time of editing this post (2022-08-12T13:47:31+00:00 UTC).
Segmentation Fault
I noticed that Envoy crashes:
[2022-08-12 13:35:16.253][7][critical][assert] [source/common/init/manager_impl.cc:36] assert failure: false. Details: attempted to add shared target SdsApi client_secret to initialized init manager Server
...
<STACK_TRACE_OMITTED>
...
Our FatalActions triggered a fatal signal.
Segmentation fault (core dumped)
Can anyone see anything useful in this stack trace? The failing assertion is particularly interesting, but I don't know enough about the Envoy codebase to tell whether this is an issue or not:
[2022-08-12 13:35:16.253][7][critical][assert] [source/common/init/manager_impl.cc:36] assert failure: false. Details: attempted to add shared target SdsApi client_secret to initialized init manager Server