NO_PURGE removing records on domain root it isn't managing
The situation, and what I'm trying to do: I partially look after a friend's domain, since he uses my web and mail server. As I manage everything DNS-related with dnscontrol, I'd like to manage the records his domain needs for those two services as well. That's where the great NO_PURGE function comes in.
While adding NO_PURGE to my config, I saw the following:
----- Getting nameservers from: registrar
----- DNS Provider: registrar...5 corrections
#1: DELETE TXT domain.tld "MS=msXXXXXXXXX" ttl=3600
#2: CREATE TXT 20201._domainkey.domain.tld ...
#3: CREATE TXT _dmarc.domain.tld ...
#4: MODIFY MX domain.tld: (10 mail.domain.tld. ttl=3600) -> (20 mail.domain.tld. ttl=3600)
#5: REFRESH zone domain.tld
----- Registrar: registrar...0 corrections
There are two things I'd like to mention:
- Changes 2-5 are intended and are configured accordingly in my dnscontrol configuration.
- However, change no. 1 is NOT intended. This record was added by my friend outside the dnscontrol configuration (via the web UI of the DNS registrar) and should be neither removed nor touched by dnscontrol.
What I've found out: this only occurs when dnscontrol updates records on the main domain itself, such as the MX record modified in change 4. While I understand the technical reason behind this, it is inconvenient for scenarios like mine.
As the main purpose of NO_PURGE is to not delete records dnscontrol isn't managing, I would consider this behavior a bug, or at least unexpected.
So apparently this is the code responsible for the NO_PURGE functionality:
https://github.com/StackExchange/dnscontrol/blob/541bb805da5baed677fa471cac6d41350f60c877/pkg/diff/diff.go#L111-L119
So on each iteration, k looks like:
k = {github.com/StackExchange/dnscontrol/v3/models.RecordKey}
NameFQDN = {string} "domain.tld"
Type = {string} "TXT"
It seems the whole NO_PURGE logic works at a fairly coarse level, keyed on FQDN and record type only. So:
- When the DNS zone already has a TXT record that is not managed through dnscontrol,
- and you try to set a second TXT record at the same domain.tld level, dnscontrol treats both as the same record and either deletes or overwrites the existing TXT record.
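The effect of that granularity can be reproduced in a few lines of Go. This is only a simplified model, not the actual dnscontrol code: RecordKey carries the two fields seen in the debugger output, while Record, protectedByKey and the www record are hypothetical names and data made up for illustration.

```go
package main

import "fmt"

// RecordKey has the two fields visible in the debugger output above.
type RecordKey struct {
	NameFQDN string
	Type     string
}

// Record is a hypothetical, stripped-down record used only in this sketch.
type Record struct {
	NameFQDN string
	Type     string
	Value    string
}

// protectedByKey returns the existing records whose (name, type) key does not
// appear in the desired config. A NO_PURGE-style filter working at this
// granularity can only shield these records; the record value is never compared.
func protectedByKey(existing, desired []Record) []Record {
	managed := make(map[RecordKey]bool)
	for _, r := range desired {
		managed[RecordKey{NameFQDN: r.NameFQDN, Type: r.Type}] = true
	}
	var out []Record
	for _, r := range existing {
		if !managed[RecordKey{NameFQDN: r.NameFQDN, Type: r.Type}] {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// Records already in the zone; the www entry is a made-up example.
	existing := []Record{
		{"domain.tld", "TXT", "VERF=AWESOME_VERIFICATION_THINGY"},
		{"www.domain.tld", "A", "192.0.2.10"},
	}
	// Records declared in the dnscontrol config.
	desired := []Record{
		{"domain.tld", "TXT", "v=spf1 include:mx.domain.tld -all"},
	}
	for _, r := range protectedByKey(existing, desired) {
		fmt.Printf("left alone: %s %s %q\n", r.NameFQDN, r.Type, r.Value)
	}
}
```

In this model only the www record is reported as left alone. The verification TXT shares its (name, type) key with the SPF record from the config, so it counts as managed and is open to being modified or deleted.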
For example, the following record is set outside of dnscontrol:
@ 8600 IN TXT "VERF=AWESOME_VERIFICATION_THINGY"
And you try to add this through dnscontrol:
TXT('@', 'v=spf1 include:mx.domain.tld -all')
You will end up with the following modification on the next run:
#1: MODIFY TXT domain.tld: ("VERF=AWESOME_VERIFICATION_THINGY" ttl=43200) -> ("v=spf1 include:mx.domain.tld -all" ttl=43200)
But the two records serve completely different purposes.
Since several records of the same type can exist at one name, some managed by the hoster and some by dnscontrol, this granularity is, I think, not optimal. Probably the only record type unaffected in this situation is CNAME.
To differentiate between records of the same type but with different values, it might be a good idea to include the actual value for a more granular comparison.
I've never used NO_PURGE to ignore changes to the apex domain (the main domain). So, yes, I can imagine that being a bug. As a workaround, what happens if you add IGNORE('@') to the domain?
> what happens if you add IGNORE('@') to the domain?
This: https://github.com/StackExchange/dnscontrol/issues/799 :D
Yeah, I saw that 1 minute later! :D
I thought so when I saw your other comment. But couldn't resist still commenting :D
A possible solution I thought of is adding the value to models.RecordKey and taking it into account when comparing. However, I'm not sure whether this would break something else.
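A rough sketch of that idea in Go, purely as an illustration: ValueKey, Record, keyOf and plan are hypothetical names, and none of this is taken from the dnscontrol code base or from what a later rewrite may do. It only shows what comparing on (name, type, value) would change for the TXT example above.

```go
package main

import "fmt"

// Record is a hypothetical, stripped-down record used only in this sketch.
type Record struct {
	NameFQDN string
	Type     string
	Value    string
}

// ValueKey is a hypothetical variant of RecordKey that also carries the value.
type ValueKey struct {
	NameFQDN string
	Type     string
	Value    string
}

func keyOf(r Record) ValueKey {
	return ValueKey{NameFQDN: r.NameFQDN, Type: r.Type, Value: r.Value}
}

// plan compares existing and desired records by (name, type, value).
// create holds desired records missing from the zone; untouched holds
// existing records the config does not mention, which a NO_PURGE-style
// run would leave alone.
func plan(existing, desired []Record) (create, untouched []Record) {
	have := make(map[ValueKey]bool)
	for _, r := range existing {
		have[keyOf(r)] = true
	}
	want := make(map[ValueKey]bool)
	for _, r := range desired {
		want[keyOf(r)] = true
	}
	for _, r := range desired {
		if !have[keyOf(r)] {
			create = append(create, r)
		}
	}
	for _, r := range existing {
		if !want[keyOf(r)] {
			untouched = append(untouched, r)
		}
	}
	return create, untouched
}

func main() {
	existing := []Record{{"domain.tld", "TXT", "VERF=AWESOME_VERIFICATION_THINGY"}}
	desired := []Record{{"domain.tld", "TXT", "v=spf1 include:mx.domain.tld -all"}}

	create, untouched := plan(existing, desired)
	for _, r := range create {
		fmt.Printf("CREATE %s %s %q\n", r.Type, r.NameFQDN, r.Value)
	}
	for _, r := range untouched {
		fmt.Printf("keep   %s %s %q\n", r.Type, r.NameFQDN, r.Value)
	}
}
```

With the value in the key, the SPF record becomes a plain CREATE and the verification TXT is kept. The trade-off in this model is that a changed value on a managed record no longer shows up as a MODIFY: the new value is created while the old one lingers as an "unmanaged" record, which is one way this approach could break existing behavior.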
@patschi
Good news! The "diff2" changes completely rewrote the NO_PURGE code. Can you test to see if this is still a problem?
I think (or hope!) that the new code handles this edge-case.
I'm going to assume that diff2 solved this problem. Please re-open this issue if not. Thanks!