Domain Name System: The important parts


What is a domain name?

According to RFC 1034, which lays out the original DNS specification, the domain name of a node is a list of labels on the path from the node to the root of the tree. 

Okay, let's try to phrase it in a more comprehensible way. A domain name describes a node on the tree by tracing the node all the way back to the root. Every label along that path becomes a part of the domain name. The takeaway here is that we can describe a node by its domain name. For the sake of brevity, in this article, a domain name is also referred to as the "domain" from now onwards. 

Now the root of the tree is represented by a dot. For example, iamshouvikmitra.blogspot.com is written in full as "iamshouvikmitra.blogspot.com." (note the trailing dot, which stands for the root).

 [.]
  |_ [com]
        |_ [blogspot]
                |_ [iamshouvikmitra]

Therefore, the domain for "com" is "com.", the domain for "blogspot" is "blogspot.com.", and finally the domain for the node "iamshouvikmitra" is "iamshouvikmitra.blogspot.com."
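
As a rough illustration, the rule above can be sketched in a few lines of Python. The function name domain_name and the list-of-labels representation are assumptions made for this example; they are not part of any DNS library.

```python
def domain_name(path_to_root):
    """Build a domain name from the labels on the path from a node to the root.

    path_to_root lists the labels starting at the node itself and ending
    just below the root (the root has an empty label, written as ".").
    """
    return ".".join(path_to_root) + "."


# Labels on the path from "iamshouvikmitra" up to the root.
path = ["iamshouvikmitra", "blogspot", "com"]

# Walk up the tree: each step drops the leftmost label.
for depth in range(len(path)):
    print(domain_name(path[depth:]))
# iamshouvikmitra.blogspot.com.
# blogspot.com.
# com.
```

With an empty list the function returns just ".", which matches the root node.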

Now, you might object that you have never seen a domain name ending with a trailing dot. To be honest, we rarely see them, and it is perfectly fine to leave the dot out. But we should always remember that even if we do not see it, it is still there.
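
One way to make this concrete is a small helper that treats both spellings as the same name. This is only a sketch: normalize and same_domain are hypothetical names chosen for the example, and the lowercasing step relies on the fact that DNS names are compared case-insensitively.

```python
def normalize(name):
    """Return the name lowercased and ending in exactly one trailing dot."""
    name = name.strip().lower().rstrip(".")
    return name + "."


def same_domain(a, b):
    """True if two spellings refer to the same domain name."""
    return normalize(a) == normalize(b)


print(normalize("blogspot.com"))                      # blogspot.com.
print(same_domain("Blogspot.com", "blogspot.com."))   # True
```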

Route53 makes no distinction here, so we can keep the trailing dot or leave it out; it is entirely up to us.

Now, the entire tree structure that we just saw is called the domain namespace. Every node connected to the root node is a part of the domain namespace. At first glance, you may think that it is manageable to store all domains on a single server. But imagine this tree growing to trillions of nodes! Storing that on a single server might just not be feasible, right?

To put it concisely, the DNS needs to be scalable and cannot be maintained or managed by a single entity.

So how do we scale the DNS?

At a glance, the DNS can be thought of as a database for storing records. But it is not just a simple database. It has to scale in two respects:
  1. It needs to scale to an increasing and ever-changing set of nodes.
  2. It needs to scale without giving control to a single person or entity.
The DNS meets these requirements by dividing up the namespace. Portions of the namespace are delegated to different entities, so that each entity can modify and/or delete only its own portion. Each entity is then responsible for providing nameservers to hold its delegated portion of the namespace. The overall picture is that the namespace is distributed across different nameservers, so that no single server has to store the entire tree; each server stores only a subset of it.

Let's try to understand this with the aforementioned domain. 

NS1            [.]
NS2             |_ [com]
NS3                   |_ [blogspot]
                                   |_ [iamshouvikmitra]
         
In this example, we store the root node's domain name on nameserver NS1. Here, the root node is represented by the dot (.).
The "com" domain is stored on NS2, and both "blogspot" and "iamshouvikmitra" are stored on NS3.

In DNS terms, each of these pieces is called a zone. A zone is identified by its domain name closest to the root, which is called the origin or apex of the zone. For instance, we can say NS3 holds the zone named "blogspot.com" because, of all the domains on NS3, "blogspot.com" is the closest to the root. Thus, "blogspot.com" is the origin/apex, as it is where the zone begins.
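
To see how a domain maps to a zone, here is a minimal sketch that finds the most specific zone containing a name by dropping labels from the left until an apex matches. The ZONES table mirrors the NS1/NS2/NS3 layout of the example; the table and the function closest_zone are assumptions for illustration only.

```python
# Zone apex -> nameserver that holds the zone (from the example above).
ZONES = {".": "NS1", "com.": "NS2", "blogspot.com.": "NS3"}


def closest_zone(name, zones=ZONES):
    """Return (apex, nameserver) of the most specific zone containing name."""
    labels = [label for label in name.lower().split(".") if label]
    # Try the full name first, then drop the leftmost label each time,
    # ending with the root zone ".".
    for i in range(len(labels) + 1):
        candidate = ".".join(labels[i:]) + "."
        if candidate in zones:
            return candidate, zones[candidate]
    raise LookupError(f"no zone found for {name!r}")


print(closest_zone("iamshouvikmitra.blogspot.com."))  # ('blogspot.com.', 'NS3')
print(closest_zone("example.com."))                   # ('com.', 'NS2')
print(closest_zone("example.org."))                   # ('.', 'NS1')
```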

Now you might wonder how the zone information is stored on each nameserver. It is typically kept in a zone file, a plain-text format described in RFC 1035; take a closer look at that format for more details.

A nameserver that holds a zone is said to be authoritative for that zone. It is the author of the zone, which begins at an origin domain, and it stores all the information for that zone.

Now, let's see how to store a resource record for the domain name "www.blogspot.com" that tells clients to resolve it to the IP address "172.217.14.233". Since the domain "www.blogspot.com" falls within the "blogspot.com" zone and NS3 is authoritative for that zone, we create this record on NS3.
The record is represented as:

www.blogspot.com    A    172.217.14.233

Here the first part of the record is the domain name. It is followed by the type of resource we are storing ("A", which stands for "address", i.e. an IP address), and then by the IP address itself, which is 172.217.14.233. Once the record is stored on the nameserver, any client can query for it. In the next section, let's see how that works.
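
For readers who prefer code, the three-part record above could be parsed roughly like this. It is a simplified sketch: real zone files may carry extra fields such as a TTL and a class, and the Record type and parse_record function are hypothetical names used only here.

```python
import ipaddress
from typing import NamedTuple


class Record(NamedTuple):
    name: str   # domain name the record belongs to
    rtype: str  # record type, e.g. "A"
    value: str  # record data, e.g. an IPv4 address for "A"


def parse_record(line):
    """Parse a simplified 'name type value' record line."""
    name, rtype, value = line.split()
    rtype = rtype.upper()
    if rtype == "A":
        # Raises ValueError if the value is not a valid IPv4 address.
        ipaddress.IPv4Address(value)
    return Record(name, rtype, value)


rec = parse_record("www.blogspot.com    A    172.217.14.233")
print(rec.name, rec.rtype, rec.value)  # www.blogspot.com A 172.217.14.233
```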

Querying the DNS

Now imagine a client (a software program that fetches information from the DNS by querying one or more nameservers) that has no knowledge of the namespace and wants to find the IP address of www.blogspot.com. Initially, the client has no idea that NS3 holds any information about "www.blogspot.com". So it starts with the root. The root nameserver NS1 contains pointers to the nameservers that hold its child zones, which in this case is "com". Thus NS1 contains a nameserver (NS) resource record that says the "com" zone lives on NS2, as shown below:

-- NS1 Records ----------------
ORIGIN     .
com        NS    ns2

Now the client moves to NS2, which holds the information about the "com" zone and also contains a nameserver record for "blogspot" that points to NS3:

-- NS2 Records ----------------
ORIGIN         com.
blogspot       NS    ns3

Now the client moves to NS3, which has the information about "www.blogspot.com":

-- NS3 Records ----------------
ORIGIN             blogspot.com.
www.blogspot.com   A    172.217.14.233

TIP: DNS names are case-insensitive.

The process repeats at every level. The client keeps following the trail of nameservers (in DNS terms, this client-driven walk is called an iterative lookup) until it finds the resource record it is looking for or gets a response indicating that the record doesn't exist. Now you might wonder, how does a client know that it has to start with NS1?

Well, the truth is it has to be a piece of common knowledge across all clients. This also indicates that whoever controls the root name server will have a lot of power over the entire domain name system.
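
The whole walk from NS1 to NS3 can be simulated with a toy model. This is a sketch under stated assumptions, not how a real resolver is written: the SERVERS table simply restates the three record listings above as in-memory dictionaries, resolve is a hypothetical function, and no network traffic is involved. The hard-coded starting point "ns1" plays the role of the common knowledge every client must share.

```python
# Each toy nameserver maps an owner name to (record type, value).
SERVERS = {
    "ns1": {"com.": ("NS", "ns2")},
    "ns2": {"blogspot.com.": ("NS", "ns3")},
    "ns3": {"www.blogspot.com.": ("A", "172.217.14.233")},
}


def resolve(name, start="ns1", servers=SERVERS, max_hops=10):
    """Follow NS referrals from the root until an A record is found.

    Returns (ip_or_None, list_of_servers_visited).
    """
    name = name.lower().rstrip(".") + "."
    server, trail = start, [start]
    for _ in range(max_hops):
        records = servers[server]
        answer = records.get(name)
        if answer and answer[0] == "A":
            return answer[1], trail
        # Collect NS records whose owner name is the queried name or a parent of it.
        referrals = [
            (owner, target)
            for owner, (rtype, target) in records.items()
            if rtype == "NS" and (name == owner or name.endswith("." + owner))
        ]
        if not referrals:
            return None, trail  # this server knows of no such record
        # Prefer the most specific (longest) matching delegation.
        _, server = max(referrals, key=lambda r: len(r[0]))
        trail.append(server)
    return None, trail


print(resolve("www.blogspot.com"))
# ('172.217.14.233', ['ns1', 'ns2', 'ns3'])
print(resolve("www.example.com"))
# (None, ['ns1', 'ns2'])
```

The second call shows the "record doesn't exist" case: NS2 has no delegation for "example.com", so the trail stops there.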

Do let me know in the comments section below if you want to know how the DNS is actually built and implemented on the internet so that everyone is able to use it. If I receive a considerable number of requests, I would love to write more about that.
