Add DNS caching time parameter for multiple servers
I would like to suggest adding the ability to configure the caching time parameter when multiple DNS servers are available. For example, in my configuration, I have two DNS servers: 10.0.0.10 and 8.8.8.8. The first DNS (10.0.0.10) is used via VPN, and if it does not respond, the request will be sent to the second DNS (8.8.8.8). Currently, all subsequent requests go to the second DNS. However, I would like the system to switch back to resolving requests from the first DNS after a specified time.
Perhaps these 20 seconds should be made a configurable parameter: https://github.com/mageddo/dns-proxy-server/blob/17b0c0043d883cf9f902075739183c7938c808b2/src/main/java/com/mageddo/dnsproxyserver/server/dns/solver/SolverRemote.java#L174
This change would allow for more flexible management of DNS request caching and ensure more efficient operation with multiple DNS servers. I would appreciate your consideration of this proposal. Thank you!
Alright, so for some reason 10.0.0.10 is failing, the circuit is opening for that server, and you want to customize the circuit breaker parameters, right?
Just out of curiosity, do you know why 10.0.0.10 is failing?
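For readers unfamiliar with the term, the circuit breaker behaviour discussed here can be sketched roughly as follows. This is a minimal illustration, not the DPS implementation: the class, the `failure_threshold` and `test_delay` names, and the `query` callable are all hypothetical.

```python
import time


class CircuitBreaker:
    """Closed until repeated failures open it; one trial call is allowed
    again once `test_delay` seconds have passed since it opened."""

    def __init__(self, failure_threshold=3, test_delay=20.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # hypothetical parameter name
        self.test_delay = test_delay                # hypothetical parameter name
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allows_request(self):
        if self.opened_at is None:
            return True
        # Open circuit: allow a trial only after the delay has elapsed.
        return self.clock() - self.opened_at >= self.test_delay

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # (re)open and restart the delay


def resolve(hostname, servers, breakers, query):
    """Try servers in order, skipping any whose circuit is currently open."""
    for server in servers:
        breaker = breakers[server]
        if not breaker.allows_request():
            continue
        try:
            answer = query(server, hostname)
        except OSError:
            breaker.record_failure()
            continue
        breaker.record_success()
        return server, answer
    return None, None
```

With a configurable `test_delay`, the first server is retried once the delay has passed instead of staying excluded until a restart.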
Yes, you are right. 10.0.0.10 is the DNS server for the VPN network, used for corporate network resources. When I turn on the computer, the VPN is not yet connected, so that DNS server gets excluded. That's why I have to restart the Docker container after connecting to the VPN.
Hey, I'm releasing 3.18.1 right now. Can you check whether it solves your use case once the release finishes? See the JSON config docs for how to use it.
@rayout
Hello! Thank you for your help! I tried to check, but an error occurs during the launch (I tried both with my own config and by replacing the config with the one from the documentation).
Exception in thread "main" com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Cannot construct instance of
com.mageddo.dnsproxyserver.config.dataprovider.vo.ConfigJsonV2$SolverRemote$CircuitBreaker: cannot deserialize from Object value (no delegate- or property-based Creator): this appears to be a native image, in which case you may need to configure reflection for the class that is to be deserialized at [Source: UNKNOWN; byte offset: #UNKNOWN] (through reference chain: com.mageddo.dnsproxyserver.config.dataprovider.vo.ConfigJsonV2["solverRemote"]->com.mageddo.dnsproxyserver.config.dataprovider.vo.ConfigJsonV2$SolverRemote["circuitBreaker"])
Thanks for the feedback. I'm fixing that in #454 and releasing 3.18.2-snapshot.
I tested it. I specified two addresses as DNS - 10.0.0.10 and 8.8.8.8. The 10.0.0.10 DNS is only accessible via VPN.
I connected to the VPN and used the dig command to query an address that has an IP within the 10.0.0.10 range. The command was: dig
Next, I disconnected from the VPN and tried the dig command again multiple times. I still received the internal IP 10.0.0.169, even though the site has an external IP address on 8.8.8.8. I waited 10 minutes to check the cache, but I still received the internal address.
What could I have done wrong? If I run dig @10.0.0.10, it results in an error because I am disconnected from the VPN. Therefore, it should have switched to 8.8.8.8 after some time, but it didn't work.
I still received the internal IP 10.0.0.169, even though the site has an external IP address on 8.8.8.8. I waited 10 minutes to check the cache
I suppose this scenario is related to the response entries cache, so it's a second scenario, not related to the circuit breaker. Once a query gets a successful response, DPS caches it for the time the remote server specifies (10.0.0.10 in your case).
In the example below, 107 is the TTL in seconds.
$ dig +noall +nocmd +answer google.com
google.com. 107 IN A 142.251.129.238
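To make the TTL behaviour concrete, here is a small sketch of a cache that keeps an answer for exactly as long as the answer's TTL says. It only illustrates the idea; `parse_answer` and `TtlCache` are hypothetical names and this is not how DPS stores its entries internally.

```python
import time


def parse_answer(line):
    """Split one line of `dig +answer` output into its five fields."""
    name, ttl, rr_class, rr_type, value = line.split(None, 4)
    return {"name": name, "ttl": int(ttl), "class": rr_class,
            "type": rr_type, "value": value}


class TtlCache:
    """Keeps an answer until the TTL announced by the remote server runs out."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = {}  # name -> (expires_at, value)

    def put(self, name, value, ttl_seconds):
        self.entries[name] = (self.clock() + ttl_seconds, value)

    def get(self, name):
        entry = self.entries.get(name)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self.entries[name]  # expired: the next lookup must ask a server
            return None
        return value

    def clear(self):
        self.entries.clear()
```

Until the entry expires or the cache is cleared, lookups never reach a remote server, which is why a cached internal address can outlive the VPN connection.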
You can run the same test and manage the cache with the APIs below:
Inspect the cache
$ curl localhost:5385/v1/caches/
Clear a cache
$ curl -X DELETE 'localhost:5385/v1/caches?name=REMOTE'
@rayout
I'm thinking about what can be done in this scenario.
Please confirm whether this really is the scenario.
It would help me if you enable the trace log level and post the output when you test the same scenario again, plus the cache clear scenario.
Yes, the problem is indeed with the cache. When looking at /v1/caches/ (thanks, very useful), I can see two entries for the domain I need:
global: "ttl": "PT20S"
remote: "ttl": "PT1H"
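The `PT20S` and `PT1H` values are ISO-8601 durations (20 seconds and 1 hour). As a rough aid for reading them, the helper below converts the time-only form to seconds. `duration_seconds` is a hypothetical name, and the sketch deliberately ignores day, month, and year components.

```python
import re

# Time-only ISO-8601 durations: optional hours, minutes, and seconds after "PT".
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def duration_seconds(text):
    """Convert a duration such as 'PT1H' or 'PT20S' to a number of seconds."""
    match = _DURATION.match(text)
    if match is None or text == "PT":
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)
```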
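With the VPN disconnected, I run dig +noall +nocmd +answer @172.17.0.1 and get "nothing" because 8.8.8.8 does not have a record for this domain. Then I connect to the VPN, run the same command, and no matter how long I wait (more than 20 seconds), I do not get an IP or a TTL.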
Here is what I see in the console during debugging:
- VPN disconnected:
- VPN connected:
- Cache cleared:
I’m not really sure I’m on the right path. In Ubuntu 14-16, if two DNS servers were specified in the settings, the system would query the second DNS if the first didn’t respond. Something similar is described here - systemd issue #5755. Maybe I’m mistaken, but that’s how it felt.
It’s also possible that connecting to the VPN used to clear the DNS cache, which no longer happens.
Here’s my current scenario: I have containers to which I assign hostnames in the configuration, and thanks to this wonderful project, I can refer to them by name instead of IP. I also have a VPN to the corporate network with many internal addresses (I’ve counted about 10 domains so far). Previously, all requests would go through the corporate DNS when connected to the VPN. This somewhat worked, but some addresses with external IPs were cached to the external address used for acme https, and I had to manually add them to /etc/hosts or the service’s admin panel.
A week ago, I tried to solve this problem, and now I have two Docker containers. The first is with this project, and the second with dnsmasq, configured as follows:
I also had to set ipv4.dns-search: "~." in /etc/netplan/vpn.yaml and ip4.dns: "172.17.0.1" (dnsmasq address), so that this DNS is applied when connecting to the VPN.
This scheme works more or less well. Only requests for the domains I specified go to the corporate DNS. This is also a drawback, as I need to maintain the list of these domains. Another drawback is that I can only access the Docker containers when the VPN is on.
Is there a simpler or better solution?
The NXDOMAIN response (no answer) will be cached for one hour.
Yes, the problem is indeed with the cache. When looking at /v1/caches/ (thanks, very useful), I can see two entries for the domain I need:
global: "ttl": "PT20S"
remote: "ttl": "PT1H"
With the VPN disconnected, I run dig +noall +nocmd +answer @172.17.0.1 and get "nothing" because 8.8.8.8 does not have a record for this domain. Then I connect to the VPN, run the same command, and no matter how long I wait (more than 20 seconds), I do not get an IP or a TTL.
Here is what I see in the console during debugging:
I’m not really sure I’m on the right path. In Ubuntu 14-16, if two DNS servers were specified in the settings, the system would query the second DNS if the first didn’t respond. Something similar is described here - https://github.com/systemd/systemd/issues/5755. Maybe I’m mistaken, but that’s how it felt.
It looks like the same behavior, but the cause is different. I identified a new improvement and wrote up a definition for it in #455. It would probably fix your bad experience. What do you think?
and thanks to this wonderful project
Appreciate your thanks, you're welcome!
Is there a simpler or better solution?
Actually, I think we can fix that with #455 and/or by creating a toggle for the entire cache. In the meantime, DPS version 3.12.1 does not have this cache feature, so you can test it and check whether it has the behavior you want (I'm not sure this version is stable enough for regular use, but it may do the job for now). This way we can validate the desired behavior before I implement these two solutions in the new version. The two solutions aren't too big, though.
The cache was implemented in 3.13.1; see the release notes.
@rayout
Create a watchdog to keep testing the remote servers' circuits; it will clear the cache whenever a remote server goes down or becomes healthy again.
I think this is a great idea and it will really solve the problem.
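The watchdog idea can be sketched roughly like this. It is only an illustration of the proposal, not the code that shipped: `Watchdog`, `probe`, and `clear_cache` are hypothetical names, and a real implementation would run `tick` on a timer.

```python
class Watchdog:
    """Probes each remote server and clears the cache when its health flips."""

    def __init__(self, servers, probe, clear_cache):
        self.servers = servers
        self.probe = probe              # callable: server -> bool (hypothetical)
        self.clear_cache = clear_cache  # callable with no arguments (hypothetical)
        self.last_state = {}            # server -> last observed health

    def tick(self):
        """Run one round of probes; return the servers whose state changed."""
        changed = []
        for server in self.servers:
            healthy = self.probe(server)
            previous = self.last_state.get(server)
            self.last_state[server] = healthy
            # The first observation only records a baseline.
            if previous is not None and previous != healthy:
                changed.append(server)
        if changed:
            self.clear_cache()
        return changed
```

Clearing on both transitions matters for the VPN case: a stale internal address is dropped when 10.0.0.10 goes away, and a cached empty answer is dropped when it comes back.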
@rayout 3.19.0 is out with the watchdog implemented. Can you check whether it solves your use case? Also consider calibrating your circuit breaker params.
I'm closing this issue; feel free to reopen it if the problem persists.
Thank you! I have been testing for some time to wait for feedback, and then went on vacation. Everything works fine now. When turning VPN on and off, everything works instantly. But sometimes, if I have been without a VPN connection for a long time, when I turn on the VPN, any DNS requests stop working altogether until I completely restart the docker container with the DNS proxy. I will continue testing and will create a task if I catch this issue.
Thank you! I have been testing for some time to wait for feedback, and then went on vacation. Everything works fine now. When turning VPN on and off, everything works instantly.
Glad we made progress!
But sometimes, if I have been without a VPN connection for a long time, when I turn on the VPN, any DNS requests stop working altogether until I completely restart the docker container with the DNS proxy. I will continue testing and will create a task if I catch this issue.
Sure, thanks for your contribution @rayout