Gateway does not become ready using helm install and quickstart
Description:
The gateway pod fails to become ready. It is logging the following errors:
$ k logs envoy-default-eg-e41e7b31-6b65bcdbd-s6dxd
Defaulted container "envoy" out of: envoy, shutdown-manager, debugger-tstmb (ephem)
[2025-06-19 22:39:29.015][1][warning][config] [source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:130] gRPC config: initial fetch timed out for type.googleapis.com/envoy.config.cluster.v3.Cluster
[2025-06-19 22:39:44.014][1][warning][config] [source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:130] gRPC config: initial fetch timed out for type.googleapis.com/envoy.config.listener.v3.Listener
[2025-06-19 22:39:58.328][1][warning][config] [./source/extensions/config_subscription/grpc/grpc_stream.h:226] DeltaAggregatedResources gRPC config stream to xds_cluster closed since 44s ago: 14, no healthy upstream
[2025-06-19 22:40:21.671][1][warning][config] [./source/extensions/config_subscription/grpc/grpc_stream.h:226] DeltaAggregatedResources gRPC config stream to xds_cluster closed since 67s ago: 14, no healthy upstream
I attached a debug container to the pod and successfully connected to the configured xds_cluster:
telnet envoy-gateway.envoy-gateway-system.svc.cluster.local 18000
There are no applicable logs from the controller.
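For readers interpreting the warnings above: the number before "no healthy upstream" is a gRPC status code, and 14 is UNAVAILABLE. Envoy reports "no healthy upstream" when the cluster it is trying to reach (here xds_cluster) has no usable hosts from Envoy's own point of view. A telnet from a debug container may therefore not reflect what Envoy itself resolved. The following is a minimal sketch for summarizing such lines; the function name `grpc_close_status` is made up for illustration and only a few status codes are mapped.

```shell
#!/bin/sh
# Sketch: summarize Envoy "gRPC config stream ... closed" warnings read from stdin.
# grpc_close_status is a hypothetical helper, not part of Envoy or Envoy Gateway.
# The names are standard gRPC status codes; anything unmapped prints as OTHER.
grpc_close_status() {
  sed -n 's/.*gRPC config stream to \([^ ]*\) closed since [0-9]*s ago: \([0-9]*\), \(.*\)$/\1 \2 \3/p' |
  while read -r cluster code msg; do
    case $code in
      4)  name=DEADLINE_EXCEEDED ;;
      14) name=UNAVAILABLE ;;
      16) name=UNAUTHENTICATED ;;
      *)  name=OTHER ;;
    esac
    printf '%s %s(%s): %s\n' "$cluster" "$name" "$code" "$msg"
  done
}

# Example using one of the log lines from this report.
printf '%s\n' '[2025-06-19 22:39:58.328][1][warning][config] [./source/extensions/config_subscription/grpc/grpc_stream.h:226] DeltaAggregatedResources gRPC config stream to xds_cluster closed since 44s ago: 14, no healthy upstream' | grpc_close_status
```

For the line above this prints `xds_cluster UNAVAILABLE(14): no healthy upstream`. In practice the input would be piped from the proxy container's logs.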
Repro steps:
$ helm install eg oci://docker.io/envoyproxy/gateway-helm --version v1.4.1 -n envoy-gateway-system --create-namespace
Pulled: docker.io/envoyproxy/gateway-helm:v1.4.1
Digest: sha256:e5caac1557603bf3284efb2a3b58b6c28576949cabf0adb81bd0c2e7962a74aa
NAME: eg
LAST DEPLOYED: Thu Jun 19 17:00:19 2025
NAMESPACE: envoy-gateway-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
**************************************************************************
*** PLEASE BE PATIENT: Envoy Gateway may take a few minutes to install ***
**************************************************************************
Envoy Gateway is an open source project for managing Envoy Proxy as a standalone or Kubernetes-based application gateway.
Thank you for installing Envoy Gateway! 🎉
Your release is named: eg. 🎉
Your release is in namespace: envoy-gateway-system. 🎉
To learn more about the release, try:
$ helm status eg -n envoy-gateway-system
$ helm get all eg -n envoy-gateway-system
To have a quickstart of Envoy Gateway, please refer to https://gateway.envoyproxy.io/latest/tasks/quickstart.
To get more details, please visit https://gateway.envoyproxy.io and https://github.com/envoyproxy/gateway.
$ kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available
deployment.apps/envoy-gateway condition met
$ kubectl apply -f https://github.com/envoyproxy/gateway/releases/download/latest/quickstart.yaml -n default
gatewayclass.gateway.networking.k8s.io/eg created
gateway.gateway.networking.k8s.io/eg created
serviceaccount/backend unchanged
service/backend unchanged
deployment.apps/backend unchanged
httproute.gateway.networking.k8s.io/backend created
$ k get pod
NAME READY STATUS RESTARTS AGE
envoy-default-eg-e41e7b31-6b65bcdbd-pwckb 0/2 Running 0 1s
envoy-gateway-7c88d4fff4-qr8nw 1/1 Running 0 32s
Environment:
Talos v1.10.4 cluster, MetalLB LoadBalancer provider, Kubernetes 1.33
Logs:
envoy-gateway-* logs:
2025-06-19T23:00:28.240Z INFO config-loader loader/configloader.go:106 running hook
2025-06-19T23:00:28.240Z INFO config-loader loader/configloader.go:48 watching for changes to the EnvoyGateway configuration {"path": "/config/envoy-gateway.yaml"}
2025-06-19T23:00:28.240Z INFO cmd/server.go:67 Start runners
2025-06-19T23:00:28.240Z INFO admin admin/server.go:34 starting admin server {"address": "127.0.0.1:19000", "enablePprof": false}
2025-06-19T23:00:28.240Z INFO cmd/server.go:277 Starting runner {"name": "provider"}
2025-06-19T23:00:28.240Z INFO metrics metrics/register.go:179 initialized metrics pull endpoint {"address": "0.0.0.0:19001", "endpoint": "/metrics"}
2025-06-19T23:00:28.241Z INFO metrics metrics/register.go:62 starting metrics server {"address": "0.0.0.0:19001"}
2025-06-19T23:00:28.244Z INFO provider.controller-runtime.webhook webhook/server.go:183 Registering webhook {"runner": "provider", "path": "/inject-pod-topology"}
2025-06-19T23:00:28.244Z INFO provider kubernetes/controller.go:141 created gatewayapi controller {"runner": "provider"}
2025-06-19T23:00:28.280Z INFO provider kubernetes/controller.go:1525 ServiceImport CRD not found, skipping ServiceImport watch {"runner": "provider"}
2025-06-19T23:00:28.318Z INFO provider kubernetes/controller.go:1879 Watching gatewayAPI related objects {"runner": "provider"}
2025-06-19T23:00:28.322Z INFO provider runner/runner.go:66 Running provider {"runner": "provider", "type": "Kubernetes"}
2025-06-19T23:00:28.322Z INFO cmd/server.go:277 Starting runner {"name": "gateway-api"}
2025-06-19T23:00:28.322Z INFO gateway-api runner/runner.go:91 started {"runner": "gateway-api"}
2025-06-19T23:00:28.322Z INFO cmd/server.go:277 Starting runner {"name": "xds-translator"}
2025-06-19T23:00:28.322Z INFO xds-translator runner/runner.go:53 started {"runner": "xds-translator"}
2025-06-19T23:00:28.322Z INFO cmd/server.go:277 Starting runner {"name": "infrastructure"}
2025-06-19T23:00:28.322Z INFO provider.controller-runtime.metrics server/server.go:208 Starting metrics server {"runner": "provider"}
2025-06-19T23:00:28.322Z INFO provider.controller-runtime.metrics server/server.go:247 Serving metrics server {"runner": "provider", "bindAddress": ":8080", "secure": false}
2025-06-19T23:00:28.323Z INFO cmd/server.go:277 Starting runner {"name": "xds-server"}
2025-06-19T23:00:28.323Z INFO provider manager/server.go:83 starting server {"runner": "provider", "name": "health probe", "addr": "[::]:8081"}
2025-06-19T23:00:28.323Z INFO provider.controller-runtime.webhook webhook/server.go:191 Starting webhook server {"runner": "provider"}
2025-06-19T23:00:28.323Z INFO xds-server runner/runner.go:98 loaded TLS certificate and key {"runner": "xds-server"}
2025-06-19T23:00:28.323Z INFO provider.controller-runtime.certwatcher certwatcher/certwatcher.go:211 Updated current TLS certificate {"runner": "provider"}
2025-06-19T23:00:28.323Z INFO xds-server runner/runner.go:149 started {"runner": "xds-server"}
2025-06-19T23:00:28.323Z INFO provider.controller-runtime.webhook webhook/server.go:242 Serving webhook server {"runner": "provider", "host": "", "port": 9443}
2025-06-19T23:00:28.323Z INFO provider.controller-runtime.certwatcher certwatcher/certwatcher.go:133 Starting certificate poll+watcher {"runner": "provider", "interval": "10s"}
2025-06-19T23:00:28.330Z INFO wasm-cache wasm/httpserver.go:111 Listening on :18002
2025-06-19T23:00:28.424Z INFO provider leaderelection/leaderelection.go:257 attempting to acquire leader lease envoy-gateway-system/5b9825d2.gateway.envoyproxy.io... {"runner": "provider"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "*kubernetes.watchAndReconcileSource"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.GatewayClass"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.Gateway"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha1.EnvoyProxy"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha1.HTTPRouteFilter"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.Secret"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.HTTPRoute"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha1.ClientTrafficPolicy"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha1.BackendTrafficPolicy"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.GRPCRoute"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.ConfigMap"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha2.TLSRoute"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1beta1.ReferenceGrant"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.Deployment"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha2.UDPRoute"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.DaemonSet"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha2.TCPRoute"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha1.SecurityPolicy"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha1.EnvoyExtensionPolicy"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.EndpointSlice"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.Node"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1alpha3.BackendTLSPolicy"}
2025-06-19T23:00:28.424Z INFO provider controller/controller.go:204 Starting EventSource {"runner": "provider", "controller": "gatewayapi-1750374028", "source": "kind source: *v1.Service"}
2025-06-19T23:00:28.444Z INFO provider leaderelection/leaderelection.go:271 successfully acquired lease envoy-gateway-system/5b9825d2.gateway.envoyproxy.io {"runner": "provider"}
2025-06-19T23:00:28.445Z INFO provider kubernetes/status_updater.go:134 started status update handler {"runner": "provider"}
2025-06-19T23:00:28.445Z INFO infrastructure runner/runner.go:75 started {"runner": "infrastructure"}
2025-06-19T23:00:28.526Z INFO provider controller/controller.go:239 Starting Controller {"runner": "provider", "controller": "gatewayapi-1750374028"}
2025-06-19T23:00:28.526Z INFO provider controller/controller.go:248 Starting workers {"runner": "provider", "controller": "gatewayapi-1750374028", "worker count": 1}
2025-06-19T23:00:28.526Z INFO provider kubernetes/controller.go:190 reconciling gateways {"runner": "provider"}
2025-06-19T23:00:28.526Z INFO provider kubernetes/controller.go:201 no accepted gatewayclass {"runner": "provider"}
2025-06-19T23:00:58.124Z INFO provider kubernetes/predicates.go:41 gatewayclass has matching controller name, processing {"runner": "provider", "name": "eg"}
2025-06-19T23:00:58.124Z INFO provider kubernetes/controller.go:190 reconciling gateways {"runner": "provider"}
2025-06-19T23:00:58.225Z INFO provider kubernetes/controller.go:774 processing OIDC HMAC Secret {"runner": "provider", "namespace": "envoy-gateway-system", "name": "envoy-oidc-hmac"}
2025-06-19T23:00:58.325Z INFO provider kubernetes/controller.go:338 No gateways found for accepted gatewayClass {"runner": "provider"}
2025-06-19T23:00:58.326Z INFO provider kubernetes/controller.go:362 reconciled gateways successfully {"runner": "provider"}
2025-06-19T23:00:58.326Z INFO provider kubernetes/controller.go:190 reconciling gateways {"runner": "provider"}
2025-06-19T23:00:58.326Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "", "name": "eg"}
2025-06-19T23:00:58.326Z INFO provider kubernetes/controller.go:1073 processing Gateway {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.326Z INFO gateway-api runner/runner.go:129 received an update {"runner": "gateway-api"}
2025-06-19T23:00:58.326Z INFO provider kubernetes/controller.go:774 processing OIDC HMAC Secret {"runner": "provider", "namespace": "envoy-gateway-system", "name": "envoy-oidc-hmac"}
2025-06-19T23:00:58.426Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "", "name": "eg"}
2025-06-19T23:00:58.426Z INFO provider.eg kubernetes/status_updater.go:109 status unchanged, bypassing update {"runner": "provider"}
2025-06-19T23:00:58.434Z INFO provider.KubeAPIWarningLogger log/warning_handler.go:65 metadata.finalizers: "gateway-exists-finalizer.gateway.networking.k8s.io": prefer a domain-qualified finalizer name including a path (/) to avoid accidental conflicts with other finalizer writers {"runner": "provider"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/controller.go:362 reconciled gateways successfully {"runner": "provider"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/controller.go:190 reconciling gateways {"runner": "provider"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/controller.go:1073 processing Gateway {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.434Z INFO gateway-api runner/runner.go:129 received an update {"runner": "gateway-api"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/routes.go:234 processing HTTPRoute {"runner": "provider", "namespace": "default", "name": "backend"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/controller.go:774 processing OIDC HMAC Secret {"runner": "provider", "namespace": "envoy-gateway-system", "name": "envoy-oidc-hmac"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/controller.go:423 processing Backend {"runner": "provider", "kind": "Service", "namespace": "default", "name": "backend"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/controller.go:437 added Service to resource tree {"runner": "provider", "namespace": "default", "name": "backend"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/controller.go:538 added EndpointSlice to resource tree {"runner": "provider", "namespace": "default", "name": "backend-66rnx"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/controller.go:362 reconciled gateways successfully {"runner": "provider"}
2025-06-19T23:00:58.434Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "", "name": "eg"}
2025-06-19T23:00:58.434Z INFO provider.eg kubernetes/status_updater.go:109 status unchanged, bypassing update {"runner": "provider"}
2025-06-19T23:00:58.436Z INFO infrastructure runner/runner.go:100 received an update {"runner": "infrastructure"}
2025-06-19T23:00:58.437Z INFO gateway-api runner/runner.go:129 received an update {"runner": "gateway-api"}
2025-06-19T23:00:58.437Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.437Z INFO xds-translator runner/runner.go:61 received an update {"runner": "xds-translator"}
2025-06-19T23:00:58.440Z INFO xds-translator runner/runner.go:61 received an update {"runner": "xds-translator"}
2025-06-19T23:00:58.440Z INFO xds-server runner/runner.go:195 received an update {"runner": "xds-server"}
2025-06-19T23:00:58.441Z INFO xds-server runner/runner.go:195 received an update {"runner": "xds-server"}
2025-06-19T23:00:58.450Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "backend"}
2025-06-19T23:00:58.463Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.479Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.479Z INFO provider.eg.default kubernetes/status_updater.go:109 status unchanged, bypassing update {"runner": "provider"}
2025-06-19T23:00:58.508Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.508Z INFO provider.eg.default kubernetes/status_updater.go:109 status unchanged, bypassing update {"runner": "provider"}
2025-06-19T23:00:58.552Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.553Z INFO provider.eg.default kubernetes/status_updater.go:109 status unchanged, bypassing update {"runner": "provider"}
2025-06-19T23:00:58.584Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.613Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:00:58.625Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:10:59.032Z INFO provider kubernetes/status_updater.go:145 received a status update {"runner": "provider", "namespace": "default", "name": "eg"}
2025-06-19T23:10:59.032Z INFO provider.eg.default kubernetes/status_updater.go:109 status unchanged, bypassing update {"runner": "provider"}
[2025-06-19 22:39:29.015][1][warning][config] [source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:130] gRPC config: initial fetch timed out for type.googleapis.com/envoy.config.cluster.v3.Cluster
The proxy needs to connect to the control plane via a DNS cluster. Please make sure your cluster network is working, especially across nodes.
As I said in the original post, I attached a debug container to the proxy pod and was able to telnet successfully:
$ telnet envoy-gateway.envoy-gateway-system.svc.cluster.local 18000
I believe this means the pod is able to connect to envoy-gateway.envoy-gateway-system.svc.cluster.local:18000. Isn't that what the logs are saying it can't connect to? That implies to me that both DNS and network connectivity are working, so I'm at a loss as to why it's failing.
I also have a lot of stuff working in this cluster that requires cross node communication, so the basic kube networking/dns is definitely working in general.
I'm also having this issue on Talos Linux 1.10.4. I've tried both the Envoy Helm chart and applying the Kubernetes manifest directly.
@kinghrothgar I got this working. I set kubernetesClusterDomain to the value of my Talos cluster.network.dnsDomain.
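A minimal sketch of the fix described above, assuming `kubernetesClusterDomain` is a top-level value of the gateway-helm chart (as the comment suggests) and that the cluster domain can be read from the `svc.<domain>` entry in a pod's resolv.conf search line. The helper name `cluster_domain` and the domain `k8s.example.internal` are made up for illustration. The Helm command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: read resolv.conf text on stdin and print the cluster domain, taken
# from the first search entry of the form "svc.<domain>".
# cluster_domain is a hypothetical helper name.
cluster_domain() {
  for d in $(sed -n 's/^search[[:space:]]*//p'); do
    case $d in
      svc.*) printf '%s\n' "${d#svc.}"; return 0 ;;
    esac
  done
  return 1
}

# Hypothetical resolv.conf from a cluster whose DNS domain is not cluster.local.
domain=$(printf 'search default.svc.k8s.example.internal svc.k8s.example.internal k8s.example.internal\nnameserver 10.96.0.10\noptions ndots:5\n' | cluster_domain)

# Print the matching Helm command. Release name, chart, version and namespace
# are the ones used in the original report.
echo "helm upgrade eg oci://docker.io/envoyproxy/gateway-helm --version v1.4.1 -n envoy-gateway-system --set kubernetesClusterDomain=$domain"
```

On a real cluster the input would be the contents of /etc/resolv.conf from any pod. If the printed domain is already cluster.local, this setting is probably not the cause.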
@jordanstacy this unfortunately doesn't appear to be my issue. My Talos cluster domain is the default cluster.local. I looked through the gateway instance pod, and all the domains I found in the config were correct and I was able to telnet to them from an attached debug container. I'm at a loss on how to debug this further, other than capturing a tcpdump.
@kinghrothgar Are you by chance using an IPv6 cluster? I am seeing a pretty similar issue on an EKS IPv6 cluster.
Downgrading to v1.3.3 fixed the issue for me (from 1.4.2). I suspect https://github.com/envoyproxy/gateway/pull/5197 (first in 1.4.0) may be to blame for the issue, but not sure.
@cnemo-cenic can you try v1.5.0-rc.2?
Tried, seems to work
Cool. It could then be related to https://github.com/envoyproxy/gateway/pull/6591, which would imply that folks on this thread have a custom resolv.conf that was not playing well with the FQDN before.
Closing this issue for now. Feel free to raise a new one if the problem persists in v1.5.
Not sure if it helps, but here is the resolv.conf from a similar pod on the cluster:
search sandbox.svc.cluster.local svc.cluster.local cluster.local us-west-1.compute.internal
nameserver fde2:2480:190d::a
options ndots:5
Unfortunately, I couldn't kubectl cp from the Envoy Gateway pod itself because tar is missing from the image, but I doubt the resolv.conf there is much different.
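To illustrate how a resolv.conf like the one above can interact with a long service name: with ndots:5, a name with fewer than five dots is normally tried with each search suffix before it is tried as-is, whereas a name with a trailing dot is treated as absolute. The following is a minimal sketch of glibc-style candidate ordering. Other resolvers may behave differently, and the function name `resolver_candidates` is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the names a glibc-style resolver would try, in order, for a host
# name given resolv.conf text. resolver_candidates is a hypothetical helper.
resolver_candidates() {
  name=$1
  conf=$2
  # A trailing dot marks the name as absolute, so the search list is skipped.
  case $name in
    *.) printf '%s\n' "$name"; return 0 ;;
  esac
  ndots=$(printf '%s\n' "$conf" | sed -n 's/^options.*ndots:\([0-9][0-9]*\).*/\1/p' | tail -n 1)
  ndots=${ndots:-1}   # resolver default when no ndots option is set
  search=$(printf '%s\n' "$conf" | sed -n 's/^search[[:space:]]*//p' | tail -n 1)
  only_dots=$(printf '%s' "$name" | tr -cd '.')
  dots=${#only_dots}
  # Enough dots: try the name as-is first. Otherwise try it after the search list.
  if [ "$dots" -ge "$ndots" ]; then printf '%s.\n' "$name"; fi
  for d in $search; do printf '%s.%s.\n' "$name" "$d"; done
  if [ "$dots" -lt "$ndots" ]; then printf '%s.\n' "$name"; fi
}

# The resolv.conf quoted in this thread.
conf='search sandbox.svc.cluster.local svc.cluster.local cluster.local us-west-1.compute.internal
nameserver fde2:2480:190d::a
options ndots:5'

resolver_candidates envoy-gateway.envoy-gateway-system.svc.cluster.local "$conf"
```

The control-plane name has four dots, so under these assumptions the four search-suffixed names are tried first and the plain name is tried last. Passing the same name with a trailing dot yields a single candidate.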
Just tested it and it is indeed fixed in v1.5 for me! Thank y'all.