
DNS servers should have NS and SOA records #8047


Open · iximeow wants to merge 23 commits into main from ixi/dns-ns-and-soa

Conversation

@iximeow (Member) commented Apr 24, 2025:

i've marked this "ready", but i'm not merging this as-is - the progenitor patch here is not suitable and i'm going to go bump progenitor like a normal person first, and maybe even not have to touch progenitor versions in this change at all. that might bring the delta back down to below +2000...

aside from Progenitor, this is definitely ready for eyes!


this is probably the more exciting part of the issues outlined in #6944. the changes here get us to the point where, for both internal and external DNS, we have:

  • A/AAAA records for the DNS servers in the internal/external group (named ns1.<zone>, ns2.<zone>, ...)
  • NS records for those servers at the zone apex, one for each of the ns*.<zone> described above
  • an SOA record for the zone apex for each of oxide.internal (for internal DNS) and $delegated_domain (for external DNS)
  • the SOA's serial is updated whenever the zone is changed. serial numbers are the DNS config generation, so they start from 1 and tick upward with each change (see the sketch below). this is different from most SOA serial schemes (in particular those that use YYYYMMDDNN numbering) but as far as i can tell it is consistent with RFC 1035's requirements.
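
to make the shape of this concrete, here's a minimal sketch of synthesizing the apex records. the DnsRecord/Soa types and the apex_records helper are simplified stand-ins for illustration, not the PR's actual code; the timer values match the dig output later in this description:

// Stand-in types for illustration; the real types live in the DNS API crate.
#[derive(Clone)]
enum DnsRecord {
    Ns(String),
    Soa(Soa),
}

#[derive(Clone)]
struct Soa {
    mname: String, // primary nameserver name
    rname: String, // administrative mailbox, RNAME-encoded
    serial: u32,   // the DNS config generation
    refresh: u32,
    retry: u32,
    expire: u32,
    minimum: u32,
}

// Synthesize the records that live at the zone apex: one NS per server in
// the DNS group, plus one SOA whose serial is the config generation.
fn apex_records(zone: &str, nameserver_count: u32, generation: u64) -> Vec<DnsRecord> {
    let mut records: Vec<DnsRecord> = (1..=nameserver_count)
        .map(|n| DnsRecord::Ns(format!("ns{n}.{zone}")))
        .collect();
    records.push(DnsRecord::Soa(Soa {
        mname: format!("ns1.{zone}"),
        rname: format!("admin.{zone}"),
        serial: generation as u32, // see the overflow discussion below
        refresh: 3600,
        retry: 600,
        expire: 18000,
        minimum: 150,
    }));
    records
}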

we do not support zone transfers here. i believe the SOA record here would be reasonable to guide zone transfers if we did, but obviously that's not something i've tested.

SOA fields

the SOA record's RNAME is hardcoded to admin@<zone_name>. this is out of expediency to provide something, but it's probably wrong most of the time. there's no way to get an MX record installed for <zone_name> in the rack's external DNS servers, so barring DNS hijinks in the deployed environment, this will be a dead address. problems here are:

  • we would want to take in an administrative email at rack setup time, so that would be minor plumbing
  • more importantly, what to backfill this with for deployed systems?

it seems like the best answer here is to allow configuring the rack's delegated domain and zone after initial setup, and updating an administrative email would fit in pretty naturally there. but we don't have that right now, so admin@ it is. configuration of external DNS is probably more important in the context of zone transfers - specifically, maintaining the list of remote addresses to whom we're willing to permit zone transfers. so it feels like this is in the API's future at some point.
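
for reference, the admin@<zone_name> mailbox shows up in the SOA as an RNAME, where the '@' becomes a '.'. a hypothetical helper showing the mapping (real RNAME encoding also escapes dots in the local part, which this sketch ignores since "admin" has none):

// Hypothetical helper: "admin@oxide.test" -> "admin.oxide.test."
fn mailbox_to_rname(mailbox: &str) -> Option<String> {
    let (local, domain) = mailbox.split_once('@')?;
    Some(format!("{local}.{domain}."))
}

fn main() {
    assert_eq!(
        mailbox_to_rname("admin@oxide.test").as_deref(),
        Some("admin.oxide.test.")
    );
}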

bonus

one mildly interesting observation along the way is that external DNS servers in particular are reachable at a few addresses - whichever address they get in the rack's internal address range, and whichever public address they get in the external address range. the external (public) address is what's used for A/AAAA records. so, if you're looking around from inside a DNS zone you can get odd-looking answers like:

# 172.30.1.5 is the internal address that an external DNS server is bound to.
# oxide.test is the delegated domain for this local Omicron deployment.
root@oxz_external_dns_68c5e255:~# dig +short ns2.oxide.test @172.30.1.5
192.168.0.161
root@oxz_external_dns_68c5e255:~# dig +short soa oxide.test @172.30.1.5
ns1.oxide.test. admin.oxide.test. 2 3600 600 18000 150
root@oxz_external_dns_68c5e255:~# dig +short ns oxide.test @172.30.1.5
ns1.oxide.test.
ns2.oxide.test.
# 192.168.0.160 is an external address for this same server.
# there are no records referencing 172.30.1.5 here.
root@oxz_external_dns_68c5e255:~# dig +short ns oxide.test @192.168.0.160
ns1.oxide.test.
ns2.oxide.test.
root@oxz_external_dns_68c5e255:~# dig +short ns1.oxide.test @192.168.0.160
192.168.0.160

@iximeow added the "release notes" label (reminder to include this in the release notes) on Apr 24, 2025
@iximeow force-pushed the ixi/dns-ns-and-soa branch 2 times, most recently from 842455b to f349290, on April 25, 2025 21:50
@iximeow force-pushed the ixi/dns-ns-and-soa branch from f349290 to fa47ab1 on April 25, 2025 22:08
Cargo.toml Outdated
Comment on lines 904 to 906
[patch.crates-io]
progenitor = { git = "https://github.com/oxidecomputer/progenitor", rev = "e4af3302c20e35dff6ceafc61e0175739922c132" }
progenitor-client = { git = "https://github.com/oxidecomputer/progenitor", rev = "e4af3302c20e35dff6ceafc61e0175739922c132" }
@iximeow (Member Author) commented Apr 30, 2025:

i'm going to go plumb the Progenitor bump separately, i don't want to make Cargo.toml changes here (moving DNS servers to versioned APIs is noisy enough!)

please disregard Cargo.toml changes from your eyes and i'll promise to not merge this until this (and Cargo.lock) are unchanged 😇

Comment on lines +47 to +54
/// Perform a *lossy* conversion from the V2 [`DnsConfig`] to the V1
/// [`v1::config::DnsConfig`]. In particular, V2 adds NS and SOA records,
/// which did not exist in V1, so they are silently discarded when
/// converting down.
///
/// If this conversion would leave an empty zone, the zone is omitted
/// entirely.
pub fn as_v1(self) -> v1::config::DnsConfig {
@iximeow (Member Author) commented May 1, 2025:

so as it turns out, there is one place where we do a get records -> update -> put records pattern, and that is dnsadm. that means that a dnsadm built against the V1 API could, today, talk to a V2 DNS server, get some records, experience this lossy conversion, add a new record, and then put a mangled set of records back to the server.

in practice i don't think this is an issue: the only records that would be lost are NS and SOA, which are only on the zone apex. since those are also the only records at the apex, any time a V1 client gets records from a V2 server the @ records would be omitted (since the lossy conversion would filter all records), so for this cross-version example to actually modify records the V1 client doesn't know about you'd have to add an A or AAAA or SRV record to @.

in reality, i think that this use of dnsadm is very rare. i've ~never seen it come up! i've used it, but not to alter records. so the above seems like low risk, which is also why i didn't add support for NS and SOA record management to it either.

this is the biggest wrinkle of the DNS API version bump and if there's objections to my approach here, please shout!
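
for readers following along, the downgrade described in the doc comment above amounts to something like this sketch (simplified stand-in types, not the real as_v1):

use std::collections::HashMap;

// Stand-in record type; only the variants that matter for the downgrade.
#[derive(Clone)]
enum DnsRecord {
    A(std::net::Ipv4Addr),
    Aaaa(std::net::Ipv6Addr),
    Ns(String),  // V2-only
    Soa(String), // V2-only (fields elided)
}

// Drop V2-only records, then drop names and zones left empty.
fn zone_records_as_v1(
    records: HashMap<String, Vec<DnsRecord>>,
) -> Option<HashMap<String, Vec<DnsRecord>>> {
    let records: HashMap<_, _> = records
        .into_iter()
        .filter_map(|(name, recs)| {
            let recs: Vec<_> = recs
                .into_iter()
                .filter(|r| !matches!(r, DnsRecord::Ns(_) | DnsRecord::Soa(_)))
                .collect();
            if recs.is_empty() { None } else { Some((name, recs)) }
        })
        .collect();
    // If the conversion leaves an empty zone, the zone is omitted entirely.
    if records.is_empty() { None } else { Some(records) }
}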

@davepacheco (Collaborator) commented:

I'm not sure how hard this would be to add but would it make sense to disallow a PUT with v1 when any v2 features have been used already (i.e., there are SOA or NS records)? I wonder if we even want to disallow GETs. The main reason to support both versions is for the intermediate state but once anything has started using v2 we don't expect to go backwards.

I don't think this is a big deal because this is mainly a development tool and it probably hasn't been used in ages anyway.

Comment on lines +174 to +207
impl From<Srv> for DnsRecord {
fn from(srv: Srv) -> Self {
DnsRecord::Srv(srv)
}
}

#[derive(
Clone,
Debug,
Serialize,
Deserialize,
JsonSchema,
PartialEq,
Eq,
PartialOrd,
Ord,
)]
pub struct Srv {
pub prio: u16,
pub weight: u16,
pub port: u16,
pub target: String,
}

impl From<v1::config::Srv> for Srv {
fn from(other: v1::config::Srv) -> Self {
Srv {
prio: other.prio,
weight: other.weight,
port: other.port,
target: other.target,
}
}
}
@iximeow (Member Author) commented:

the other option here is to use the v1::config::Srv type directly in v2, because it really has not changed. weaving the V1/V2 types together seems more difficult to think about generally, but i'm very open to the duplication being more confusing if folks feel that way.

@davepacheco (Collaborator) commented:

I would probably use the v1 types directly but I can see going either way.

@@ -3582,7 +3582,7 @@
]
},
"RotImageError": {
"description": "RotImageError\n\n<details><summary>JSON schema</summary>\n\n```json { \"type\": \"string\", \"enum\": [ \"unchecked\", \"first_page_erased\", \"partially_programmed\", \"invalid_length\", \"header_not_programmed\", \"bootloader_too_small\", \"bad_magic\", \"header_image_size\", \"unaligned_length\", \"unsupported_type\", \"reset_vector_not_thumb2\", \"reset_vector\", \"signature\" ] } ``` </details>",
@iximeow (Member Author) commented:

i'm pretty confused about the backticks showing up in this file!

@iximeow iximeow marked this pull request as ready for review May 1, 2025 21:58
@davepacheco (Collaborator) left a comment:

I haven't had a chance to look at this closely.


@@ -284,30 +335,38 @@ async fn handle_dns_message(
(RecordType::A, DnsRecord::A(_)) => true,
(RecordType::AAAA, DnsRecord::Aaaa(_)) => true,
(RecordType::SRV, DnsRecord::Srv(_)) => true,
(RecordType::NS, DnsRecord::Ns(_)) => true,
(RecordType::SOA, DnsRecord::Soa(_)) => true,
@iximeow (Member Author) commented May 5, 2025:

one interesting detail here: RFC 1034 describes negative caching as returning an SOA in an authoritative name error, where the minimum field of the SOA is used as the TTL for the negative result. that's probably one of the more common ways you'd run into an SOA record if you're not operating nameservers yourself. the code here, though, only returns an SOA record if you've explicitly queried for one.

in the negative caching case we'd want to return this SOA record in the soa* list provided to MessageResponseBuilder. somewhat different kind of plumbing, but it'd be relatively straightforward after this.

* technically RFC 1034 says "may add an SOA RR to the additional [...]", but it's wrong; this is clarified in RFC 2181 §7.1.
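
to sketch that protocol point with stand-in types (this is not the actual MessageResponseBuilder plumbing): on an authoritative name error, the zone's SOA rides in the authority section, and its minimum field bounds how long a resolver may cache the nonexistence:

// Stand-in types illustrating negative caching per RFC 2181 §7.1.
struct Soa {
    minimum: u32, // negative-cache TTL bound (other fields elided)
}

enum Answer {
    Records(Vec<String>),             // positive answer (records elided)
    NameError { authority_soa: Soa }, // NXDOMAIN, SOA in the authority section
}

// How long a resolver may cache "this name does not exist".
fn negative_cache_ttl(answer: &Answer) -> Option<u32> {
    match answer {
        Answer::NameError { authority_soa } => Some(authority_soa.minimum),
        Answer::Records(_) => None,
    }
}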

Comment on lines +363 to +365
// authoritative for zones. SOA records are in the API types here
// because they are included when reporting this server's records (such
// as via `dns_config_get()`).
@davepacheco (Collaborator) commented:

Does this mean you cannot GET the config and PUT it right back? That doesn't seem right. (edit: more below.)

//
// Assuming one generation bump every minute, this overflow
// would affect operations after 8,171 years.
let soa_serial = config.generation.as_u64() as u32;
@davepacheco (Collaborator) commented:

Great comment, but why not use a try_from anyway and produce a 500 if this happens?

@iximeow (Member Author) commented:

once the generation goes above u32::MAX we would always 500 on any DNS update (until the generation itself wraps u64). there's nothing that would reset the DNS generation after initial rack setup, right? so even attempting to update DNS records as part of an update would fail - it'd be pretty difficult for a customer to get out of that situation.

@davepacheco (Collaborator) commented:

I read the comment to be arguing "this is never going to happen in practice", which I agree with. If somehow it did happen, I'd rather it fail explicitly and in a way that's debuggable (e.g., an explicit error in a log somewhere) rather than implicitly and non-fatally (e.g., serial number goes backwards; downstream clients fail to see any subsequent changes).

@iximeow (Member Author) commented:

so.. i'd think it's very unlikely to happen in practice, but not impossible - if there was haywire automation against a rack that created and destroyed silos in a loop, the threshold here is more like.. "happens in 25 years"? i don't think there's an operation that would increment the generation number in under ~200ms, so (mostly joking) it almost makes sense to return a 500 if the year is lower than 2045 and this condition occurs, roll over otherwise 🙃

i otherwise agree that failing explicitly would be better to discover and debug. logging every time the rollover occurs (e.g. config.generation.as_u64() > 0 && config.generation.as_u64() as u32 == 0) would be a little better, what do you think? am i weighting the haywire automation risk too high?

@davepacheco (Collaborator) commented:

I'm afraid that a log message would be very easy to miss. What I'm thinking with the 500 is: someone will notice immediately when this happens and they can follow the failing request directly to the problem.

That said: maybe the better thing here is to change the OpenAPI spec so that the generation is a u32. Most importantly:

  • we'd validate that it's in range during PUT/GET
  • code using the Progenitor client couldn't possibly set a value out of range

This should force the range problem to be dealt with much earlier, at the point where we try to bump the DNS generation number in the first place. That code already has to deal with a range error (whether it does or not, I don't know) -- we're just shrinking that range. That'll be in Nexus, too, so we'll be in a better position to surface that as a fault, once we have richer support for fault reporting.

We could do this in a separate PR. That might also need to change the column of the dns_version table -- I'm not sure. Normally I'd be worried about this sort of change breaking deployed systems, but I really don't think any deployed systems have values large enough for this to be a problem and I think we could just make this change without worrying about it.

@davepacheco (Collaborator) commented:

You also don't need to block this PR on fixing that. For example, you could do whichever of these in this PR and fix the API in a follow-on PR.

If you do go with rollover, though, beware that the DNS server might not see every generation number so if you just check for generation as u32 == 0, you would miss some rollovers.

@iximeow (Member Author) commented:

Nexus is definitely where i'd prefer handling the generation number overflowing too, i like that thought a lot. i'll go with making this a 500 in the DNS server in the meantime (sketched below). that error path will either go away or move to Nexus in the follow-up, so it wouldn't be a 30-to-8,171-year timebomb in the same way.
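
the interim change would look roughly like this (dropshot's error type; the function name is made up):

// Fail explicitly rather than silently truncating the serial: the 500
// points straight at the problem if a generation ever exceeds u32::MAX.
fn soa_serial_from_generation(generation: u64) -> Result<u32, dropshot::HttpError> {
    u32::try_from(generation).map_err(|_| {
        dropshot::HttpError::for_internal_error(format!(
            "DNS generation {generation} does not fit in a u32 SOA serial"
        ))
    })
}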

// Assuming one generation bump every minute, this overflow
// would affect operations after 8,171 years.
let soa_serial = config.generation.as_u64() as u32;
apex_records.push(DnsRecord::Soa(
@davepacheco (Collaborator) commented:

Ah, I see -- so, why store this at all in the database? What about synthesizing it when we load the config from the database? That will avoid the problem I mentioned above (round-trip GET + PUT doesn't work) and could also future-proof us if we decide to change anything about this.

@iximeow (Member Author) commented:

when you say "store this at all in the database", you mean, why store the SOA record on disk at all? i do like that on the basis of avoiding the "round-trip doesn't work" problem you noted. though in the same way i don't love the SOA record not existing in CRDB, i don't love the SOA record being entirely synthetic. it's invisible to omdb, it'd be invisible to dnsadm. if support bundles included DNS records, SOA would still be absent!

maybe the thing to do here is create an SOA if one isn't present, and if one is present just replace the serial with config.generation. that corrects the round-trip-doesn't-work issue and means we can debug with more custom SOA records if the server's defaults are no good for some environment.

@davepacheco (Collaborator) commented:

The more I get into the PR, the more strongly I feel like we shouldn't store the SOA record into the DNS database (or, at least, shouldn't expose it back out the API).

> though in the same way i don't love the SOA record not existing in CRDB, i don't love the SOA record being entirely synthetic. it's invisible to omdb, it'd be invisible to dnsadm. if support bundles included DNS records, SOA would still be absent!

All things being equal, I agree. But there's no reason omdb and support bundles can't also report the entire DNS contents (fetched via DNS queries) -- that'd be a better representation of what clients see anyway. So I don't think that should be a constraint on the solution here.

> maybe the thing to do here is create an SOA if one isn't present, and if one is present just replace the serial with config.generation. that corrects the round-trip-doesn't-work issue and means we can debug with more custom SOA records if the server's defaults are no good for some environment.

This has the opposite round-trip problem: Nexus might PUT configuration C1 with generation number g1, but if it GETs it back, it will find a different configuration C2 with generation number g1. That's a problem on its own (it violates the constraint that if the DnsConfig changes, the generation number must change). (And the DNS server can't bump g1 because it doesn't own that -- that would fork the linear history.) This also introduces a discrepancy between what's in CockroachDB and what's in the DNS config. The database wouldn't have the SOA record, but the actual config would. Worse, if Nexus then wants to create configuration C3 (immediately following C1, as far as it's concerned) with generation number g2, how would the DNS server interpret receiving a request that, relative to the previous generation, is missing the SOA? Is Nexus trying to remove it altogether or did it just not know about it?


Stepping back (sorry I'm sure this restates stuff you know but it's been helpful for me to write this down), the way I'm thinking about this is: right now, there are 1-1 relationships between all of these:

  • the contents of the DNS-related database tables
  • the contents written to the DNS PUT API
  • the contents returned by the DNS GET API
  • the contents of the DNS server's database
  • the DNS records served by the DNS server

(Right? It's been a while since I've been in this code.)

and there are two other core design principles:

  • the abstraction provided by the DNS server is that it's configurable with a monotonically-increasing generation number -- it serves exactly what you configure it with; and
  • changes to the DNS config always originate in Nexus, flow to the database, then the DNS servers (via the PUT API), which update their local storage and the external API

All of this keeps things really simple: Nexus is always free to make any changes it likes, and all these flows are pretty trivial because they never do any conflict resolution, they never silently change any values, or ignore any values, etc -- they just take the data they're given and write to the next thing in the flow. This is important because in the end we have a lot of copies of this data across a few different representations, so if any layer is transforming it (semantically), it's a lot harder to reason about and be sure it's all working right.

It's problematic for any point in this flow to create new DNS data that gets exposed to earlier steps in the flow. You can sort of pick your poison but one way or another this gets really messy. We've seen that with the GET/PUT round-trip problem, the different PUT/GET round-trip problem, the problem where the DNS server can't distinguish between an intentional change made by Nexus vs. a case where Nexus just didn't know about some data that it had made up, etc.

I'd claim that it's fine though for any point in this process to make up DNS data that it doesn't expose to earlier in the process. That doesn't introduce any of these problems.


This might be confusing but I imagine us getting to this design like this:

  • we start by saying all of the config is specified by the DNS API client
  • then we say: we want to add SOA records, but we want the serial number to be filled in implicitly by the generation number because otherwise we're creating a huge pain for clients to try to manage this on their own
  • so you could imagine a design where the client does specify an SOA record, but it just doesn't have a serial_number field -- that's implicitly filled in
  • This is what I imagine we would do if we were fully fleshing this out, namely if we ever wanted the contents of the SOA to be configurable by Nexus.
  • But since we always want exactly one SOA and don't need its contents to be configurable, we didn't bother doing the work to plumb it into CockroachDB, the DNS API, or the DNS storage layer.

If that's too weird, I'd almost rather do the work to put the SOA sans serial number into all those places and have Nexus "fully manage it" (which just means making sure there's always exactly one). I think that'd be a sound approach, just more work than we need to do right now.


The last thing I wanted to mention is future-proofing. Right now, we don't allow the SOA contents to be configured at all and that's fine. If it's completely synthetic, then in the future, we could still allow parts of it to be configured by the API client. If not specified by the API client, we still know how to make up synthetic values. At that point, though, if we're also storing our made-up version in the database, then we won't be able to distinguish an SOA configured by the client from one synthesized by the server. That kind of sucks if we decide later to, say, change the synthesized contents. This might all sound contrived but all I mean to say is that by storing this into the DNS database, we're losing the ability to distinguish "this was written by a client who cares about these values" vs. "the client doesn't care and we should use our defaults [which could change over time]".

@iximeow (Member Author) commented:

> This has the opposite round-trip problem: ...

yeah, that's a lot thornier than i'd initially considered. and

> It's problematic for any point in this flow to create new DNS data that gets exposed to earlier steps in the flow

i hadn't really thought about it as a unidirectional flow, but i see the reasoning. so, i'm convinced that the SOA record really should not be returned through the DNS server's GET.

it probably doesn't make sense to synthesize a SOA record when reading the config, because GETing the config also reads the config in the same way, so we'd have to filter the SOA back out there. i think it works out well as a case in query_raw, where if you query for records at the apex we'll create an additional SOA record on top of whatever's defined in the database.

> This might all sound contrived

less than you might think :) i've made basically these same arguments before, i had just assumed the only values we might want to configure here really are the administrative contact at some point in the future.
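
a rough sketch of that query-path synthesis, reusing the stand-in DnsRecord/Soa types from the sketch in the PR description (none of this is the PR's actual query_raw):

// The stored config is untouched and never returns an SOA through GET;
// an SOA materializes only when answering a query at the zone apex.
fn records_for_name(
    name: &str,
    zone: &str,
    stored: &[DnsRecord],
    generation: u64,
) -> Vec<DnsRecord> {
    let mut records = stored.to_vec();
    if name == zone {
        // Apex query: add a synthetic SOA on top of whatever the database
        // defines, with the config generation as the serial.
        records.push(DnsRecord::Soa(Soa {
            mname: format!("ns1.{zone}"),
            rname: format!("admin.{zone}"),
            serial: generation as u32,
            refresh: 3600,
            retry: 600,
            expire: 18000,
            minimum: 150,
        }));
    }
    records
}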

Comment on lines +1052 to +1058
// We'll only have external DNS nameserver records - the A/AAAA records
// for servers themselves, and NS records at the apex.
let baseline_external_dns_names = external_dns_count + 1;
assert_eq!(
external_dns_zone.records.len(),
baseline_external_dns_names
);
@davepacheco (Collaborator) commented:

If I'm following this right, this is asserting that there are 4 external DNS zone records. I don't follow why that's right, though -- don't we have an A record and NS record for each of the 3 servers, so 6 records altogether?

Edit: Oh, is records.len() the number of different names that have any records at all? So it's: the apex (which has the 3 NS records all at one name) + each of the three servers (each with a different name)? If that's it maybe the comment could be a little more explicit. "Although there will be 2 records for each external DNS server, all the NS records will be under one name, so there are only N + 1 entries in the records map."
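
To make the counting concrete, here's a toy sketch (not the real types; the addresses echo the dig output in the description, and the third server's address is made up):

use std::collections::HashMap;

fn main() {
    // The zone's records map for 3 external DNS servers: 4 names, 6 records.
    let mut records: HashMap<&str, Vec<&str>> = HashMap::new();
    records.insert("@", vec!["NS ns1", "NS ns2", "NS ns3"]); // 3 records, 1 name
    records.insert("ns1", vec!["A 192.168.0.160"]);
    records.insert("ns2", vec!["A 192.168.0.161"]);
    records.insert("ns3", vec!["A 192.168.0.162"]); // hypothetical third server
    // len() counts names, not records: the apex plus one name per server.
    assert_eq!(records.len(), 3 + 1);
}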

@iximeow (Member Author) commented:

i was pretty confused at first here too, for the same reason. i'd tried writing exactly that assert_eq!(...len(), 6) even! i'll change .records to .names in the v2 type and i think that will make it more legible.

@@ -206,20 +206,38 @@ impl super::Nexus {
);

let silo_name = &request.recovery_silo.silo_name;
let dns_records = request
// Records that should be present at the rack-internal zone apex -
@davepacheco (Collaborator) commented:

It's important to know that this code path only affects the contents of the first internal DNS zone for newly-initialized racks. None of this will run for deployed racks. I assume that's fine?

(We probably should have an end-to-end test that the internal DNS contents doesn't change after the initial blueprint is executed. We have a similar end-to-end test that generating a new blueprint from the initial one makes no changes.)

Relatedly: should we be using blueprint_internal_dns_config() like you're doing with the external zone below?

@iximeow (Member Author) commented:

> None of this will run for deployed racks. I assume that's fine?

yeah, the changes in rack.rs are really just in service of minimizing the delta between state just after RSS and what the initial blueprint describes. except that as you noted below, the records don't actually get added to the internal update, which is a bug.

> Relatedly: should we be using blueprint_internal_dns_config() like you're doing with the external zone below?

probably! i didn't make that change here because the initial internal DNS records are created by sled-agent, and i'd want to either compare those more closely, or have an end-to-end test like you mention that checks that the records created by sled-agent are the same as the initial blueprint.

@@ -164,16 +170,36 @@ pub fn blueprint_external_dns_config<'a>(
external_dns_zone_name: String,
) -> DnsConfigZone {
let nexus_external_ips = blueprint_nexus_external_ips(blueprint);
let dns_external_ips = blueprint_external_dns_resolver_ips(blueprint);
@davepacheco (Collaborator) commented:

(can't put this comment where I want it, which is blueprint_internal_dns_config)

Why don't we have to make a similar change to blueprint_internal_dns_config()?

Relatedly: if we don't already, we should make sure we have tests that the NS records look correct even after we perform reconfigurator activities (e.g., expunge and create a new internal or external DNS zone). This might be easiest to do as a reconfigurator-cli test, where you can just write some reconfigurator-cli commands to do that and verify the expectorate output.

@iximeow (Member Author) commented:

> Why don't we have to make a similar change to blueprint_internal_dns_config()?

well, we do. when i'd set this up on my workstation to see it all work, i'd missed this because i'd had the internal DNS update plumbed correctly. in the test you describe, internal NS records would not get updated to track actual internal DNS zones. i'll take a look at the reconfigurator-cli tests and do something here.

.into_iter()
.map(|addr| match addr {
IpAddr::V4(addr) => DnsRecord::A(addr),
IpAddr::V6(addr) => DnsRecord::Aaaa(addr),
})
.collect();

let records = silos
let mut zone_records: Vec<DnsRecord> = Vec::new();
let external_dns_records: Vec<(String, Vec<DnsRecord>)> = dns_external_ips
@davepacheco (Collaborator) commented:

Should we sort these for consistency?

@iximeow (Member Author) commented:

we don't need to, so i'm not sure we'd want to. it seems a bit awkward that lower-numbered ns{} records would correlate with lower-numbered IP addresses, and that would mean that SOA records are biased towards the DNS server with the lowest IP.

maybe that's fine though?

@davepacheco (Collaborator) commented:

The only thing I'm trying to avoid is having the contents of NS records change because of some non-determinism here, causing either blueprint or DNS generation bumps under normal conditions even when there's been no changes. I think that's important to avoid but any deterministic order seems fine.
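
One way to pin the order down, sketched with a made-up helper name:

use std::net::IpAddr;

// Assign ns{N} names in a deterministic order so that regenerating the
// blueprint from unchanged inputs can't reshuffle the NS records.
fn assign_nameserver_names(mut ips: Vec<IpAddr>, zone: &str) -> Vec<(String, IpAddr)> {
    ips.sort(); // any stable order works; sorting by address is simplest
    ips.into_iter()
        .enumerate()
        .map(|(i, ip)| (format!("ns{}.{}", i + 1, zone), ip))
        .collect()
}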


/// Return the addresses on which this blueprint's external DNS servers listen
/// for DNS queries.
pub fn blueprint_external_dns_resolver_ips(
@davepacheco (Collaborator) commented:

Suggested change:
-pub fn blueprint_external_dns_resolver_ips(
+pub fn blueprint_external_dns_nameserver_ips(

(a lot of stuff uses these interchangeably but I believe resolvers are client-side components that initiate DNS queries and I find it useful to keep this distinction)

@iximeow (Member Author) commented:

you're totally right, yes. i'd just picked a poor name here, i know better :(

@davepacheco (Collaborator) commented:

lol, no sweat! Honestly, I see these used interchangeably more often than I see them used distinctly.

@@ -20,6 +20,17 @@ pub const DNS_ZONE: &str = "control-plane.oxide.internal";
/// development
pub const DNS_ZONE_EXTERNAL_TESTING: &str = "oxide-dev.test";

/// Label for records associated with a zone itself, rather than any names
@davepacheco (Collaborator) commented:

Is this only used as a key into the records map?

}

impl DnsConfigZone {
fn as_v1(self) -> v1::config::DnsConfigZone {
@davepacheco (Collaborator) commented:

I think we discussed elsewhere... I do wonder if we should fail this request if we have any NS records. We definitely don't want somebody to fetch this, make some other change, then PUT it back without the NS records and not realize they deleted them.
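
A sketch of that strict behavior, reusing the stand-in types from the downgrade sketch earlier in the thread:

use std::collections::HashMap;

// Refuse the V1 view outright once V2-only records exist, instead of
// silently dropping them (DnsRecord is the stand-in type from the earlier
// downgrade sketch, with Clone derived).
fn zone_records_as_v1_strict(
    records: &HashMap<String, Vec<DnsRecord>>,
) -> Result<HashMap<String, Vec<DnsRecord>>, String> {
    let has_v2_only = records
        .values()
        .flatten()
        .any(|r| matches!(r, DnsRecord::Ns(_) | DnsRecord::Soa(_)));
    if has_v2_only {
        // Fail loudly: a V1 client that round-trips this config would
        // delete records it can't represent.
        return Err("zone contains NS/SOA records; use the V2 API".to_string());
    }
    Ok(records.clone())
}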
