AUTO DnsLookupFamily not working with Serverless VPC Connector
Hello!
I have the basic setup described in the Endpoints for Cloud Run with ESPv2 tutorial:
- ESPv2 deployed in Cloud Run
- a Cloud Function
My goal is to make the function internal only, which, from my understanding, requires that all egress traffic goes through a Serverless VPC connector.
When I add the VPC connector though, I get 503 errors: "upstream connect error or disconnect/reset before headers. reset reason: connection failure" (same as #170).
Just like #170, I got it working by adding --backend_dns_lookup_family=v4only to ESPv2_ARGS. The function domain ....cloudfunctions.net does support IPv6. Is that expected behavior?
Thanks!
Unrelated to the question you asked, but one thing stuck out to me in your architecture:
My goal is to make the function internal only
Can you clarify what you mean by this? From my understanding, a Serverless VPC connector will allow your function to access resources in your VPC network. But it will not isolate the function into this VPC network; the function still receives traffic from outside your VPC. This diagram shows this: https://cloud.google.com/vpc/images/serverless-vpc-access.svg
Can you also clarify if both the Cloud Run and Cloud Function are connected to the Serverless VPC connector?
To answer your question, I found this statement on VPCs (https://cloud.google.com/vpc/docs/vpc):
VPC networks only support IPv4 unicast traffic. They do not support broadcast, multicast, or IPv6 traffic within the network; VMs in the VPC network can only send to IPv4 destinations and only receive traffic from IPv4 sources.
If all egress traffic is going through the VPC, that would explain why setting strictly v4 works. I am confused why the default AUTO behavior does not work though. If AUTO is specified, the DNS resolver will first perform a lookup for addresses in the IPv6 family and fall back to a lookup for addresses in the IPv4 family. It seems the fallback mechanism is not working, which is something we can investigate further.
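To see which address families a backend hostname actually resolves to, a quick check along these lines can help. This is only a minimal sketch using Python's standard socket module; the function name resolved_families is made up for illustration, and the result reflects whatever resolver the machine running it uses, which may differ from what ESPv2 sees inside Cloud Run.

```python
import socket


def resolved_families(hostname: str) -> dict:
    """Return the IPv4 and IPv6 addresses the local resolver reports for hostname."""
    result = {"IPv4": [], "IPv6": []}
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        try:
            infos = socket.getaddrinfo(hostname, 443, family, socket.SOCK_STREAM)
        except OSError:
            # socket.gaierror is a subclass of OSError: no addresses were
            # found (or the family is unsupported on this machine).
            continue
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string for both families.
        result[label] = sorted({info[4][0] for info in infos})
    return result


# "localhost" is used here only so the sketch runs without network access.
print(resolved_families("localhost"))
```

Passing the function's hostname instead of localhost shows whether IPv6 addresses are returned for it. A non-empty IPv6 list only means DNS returns such addresses; it says nothing about whether the network path can carry IPv6 traffic.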
Apologies for not being clear, I was referring to ingress traffic (see Ingress settings). According to the docs, "Internal only" means "Only requests from VPC networks in the same project can reach the function". Any other request results in a 403, regardless of what auth is provided.
both the Cloud Run and Cloud Function are connected to the Serverless VPC connector?
Yes, I am not sure if adding the connector to the function makes a difference though? If I understood correctly it only concerns outbound traffic? I added the connector to the function so that it can reach a database with no public IP.
VPC networks only support IPv4 unicast traffic
Arf! Thank you for showing me this. No idea how I missed it. This makes a lot of sense.
Seems the fallback mechanism is not working
Any information I can provide? If that helps, I can open the function to all users and the outside (we are in early development so it does not do anything sensitive) and provide a bit of the endpoints config although it is pretty much like the tutorial.
Thanks for the clarification, your architecture makes sense. I didn't know Cloud Functions had that functionality. Yes, it sounds like you need to keep the connector for egress traffic from your Cloud Function too; ingress and egress are two separate settings.
Any information I can provide?
I should be able to reproduce the problem in our test project. It will be easier for us to debug since we can enable verbose logging and filter through Cloud Logging. Thanks for providing the detailed information!
Setting the additional flag --backend_dns_lookup_family=v4only worked for us as well with the ESP base image gcr.io/endpoints-release/endpoints-runtime-serverless:2.31.0. But we were curious to know whether this issue will be resolved in a future release?
We confirmed that IPv6 routes are not configured by the Serverless VPC connector. We may consider switching ESPv2 to default to IPv4 only if more users run into this issue.
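One way to picture why AUTO can fail here while v4only works is the toy model below. It is a sketch under stated assumptions, not ESPv2 or Envoy code: it assumes the IPv4 fallback is only triggered when the IPv6 lookup returns no addresses (which matches the wording quoted earlier but is not confirmed in this thread), and the hostnames, addresses, and function names are all hypothetical.

```python
from typing import List

# Hypothetical DNS data using documentation-range addresses, not real records.
RECORDS = {
    "dual-stack.example": {"AAAA": ["2001:db8::1"], "A": ["192.0.2.1"]},
    "v4-only.example": {"AAAA": [], "A": ["192.0.2.2"]},
}


def pick_addresses(host: str, family: str) -> List[str]:
    """Model the lookup family: 'auto' prefers IPv6 and falls back to IPv4
    only when the IPv6 lookup is empty (an assumption of this sketch)."""
    records = RECORDS[host]
    if family == "v4only":
        return records["A"]
    if family == "v6only":
        return records["AAAA"]
    if family == "auto":
        return records["AAAA"] or records["A"]
    raise ValueError(f"unknown lookup family: {family}")


def can_connect(host: str, family: str, ipv6_routed: bool) -> bool:
    """A connection succeeds if any chosen address has a route.
    IPv4 is always routed in this model; IPv6 only if ipv6_routed is True."""
    return any(
        ipv6_routed or ":" not in address
        for address in pick_addresses(host, family)
    )


# Behind a network without IPv6 routes:
print(can_connect("dual-stack.example", "auto", ipv6_routed=False))    # False
print(can_connect("dual-stack.example", "v4only", ipv6_routed=False))  # True
print(can_connect("v4-only.example", "auto", ipv6_routed=False))       # True
```

In this model the fallback never fires for a dual-stack backend, because the IPv6 lookup itself succeeds; the failure only appears later, at connection time, where there is no route for the IPv6 address.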
I think IPv4 should work for all networks. It should be safe to default to IPv4 only. Or only for remote backends?
I was facing this problem as well, trying to connect to a Cloud Run app with a Serverless VPC connector from ESPv2 on Cloud Run, and it was fairly hard to work out from the errors why this wasn't working. Since it's been more than 1.5 years that this issue has been open, it would be great to have a better error message, updated documentation, or IPv4 as the default / fallback if that is possible.
Ok, we will change the default to IPv4 for the next release. Thanks