A content delivery network (CDN) is a geographically distributed network of servers that work in tandem to deliver online content quickly. A CDN speeds up the transfer of the files that make up most web content, such as HTML pages, stylesheets, images, and, most importantly, videos. The majority of web traffic, including traffic from big websites like Facebook, Amazon, and Netflix, is served through CDNs.
How a CDN Works
CDNs are an important part of present-day internet infrastructure. Thus, it is useful to know what’s happening behind the scenes in a CDN, especially if you are a website owner or a web enthusiast. Let’s now look into some of the technical details of how a content delivery network functions.
When you click a link (URL) to a website you intend to visit, your web browser needs to request a resource from that site’s server. Before it can do so, it has to find the server, so the first thing it does is make a DNS request. To understand what a DNS request is, think of a phone book. When looking up a contact in a phone book, you search by name and the phone book gives you the matching number; similarly, the browser supplies a domain name (e.g., scienceabc.com) and a DNS server returns the corresponding IP address. Once it has this IP address, the browser can contact the web server directly for any future requests.
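The phone book analogy can be sketched in a few lines of Python. This is only an illustration: the `PHONE_BOOK` table, the name `example.test`, and the address `192.0.2.10` (from a range reserved for documentation) are made up for the example. Only the fallback call, `socket.gethostbyname`, performs a real DNS lookup through the operating system's resolver.

```python
import socket

# Hypothetical in-memory "phone book": domain name -> IP address.
# The entry below is invented for illustration only.
PHONE_BOOK = {"example.test": "192.0.2.10"}

def resolve(hostname, phone_book=PHONE_BOOK):
    """Return an IPv4 address for a hostname, like looking up a contact."""
    if hostname in phone_book:
        return phone_book[hostname]
    # Not in our toy phone book: ask the system resolver (a real DNS request).
    return socket.gethostbyname(hostname)

print(resolve("example.test"))
```

Calling `resolve` with a real domain name would trigger an actual DNS query, so the result depends on your network and on where you are.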
Interestingly, it is physics that ultimately limits how fast your computer can contact another over a physical connection: a signal cannot travel faster than the speed of light, so the longer the path, the longer the delay. Thus, attempting to access a server in Singapore from a computer in the US will take longer than trying to access a server from within the US itself. In a bid to enhance user experience and reduce transmission costs, large organizations set up servers with copies of data in strategic geographic locations across the globe. This arrangement is called a content delivery network (CDN), and these servers are called edge servers, as they sit nearest (on the edge of the network) to the end user. A CDN has data centers (called points of presence) that are situated around the world. Within each data center there can be many servers, in some cases thousands. This arrangement speeds up the delivery of content to the end user.
Now, if a browser makes a DNS request for a domain name with CDN services enabled, the process differs slightly from the usual one. The DNS server responsible for the domain name reads the incoming request and determines the best set of servers to handle it. Simply put, the DNS server does a geographic lookup based on the DNS resolver’s IP address and then returns the IP address of an edge server that is physically close to that resolver, which is usually close to the area in which the browser is located. So, if you are in Georgia and are making a DNS request, you are most likely going to be given an IP address for a server on the East coast of the US. Similarly, if you make the same request from California, you’ll probably be given an IP address for a server on the West coast.
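To get a feel for the numbers, here is a rough back-of-the-envelope calculation in Python. It assumes that signals in optical fiber travel at about 200,000 km per second (roughly two-thirds of the speed of light in a vacuum) and uses two illustrative path lengths that are assumptions, not measured routes. Real round trips are slower, since routing, queuing, and server processing all add time.

```python
# Rough lower bound on round-trip time imposed by signal speed alone.
FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in optical fiber

def min_round_trip_ms(distance_km):
    """Smallest possible round-trip time in milliseconds for a given path length."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000

# Assumed, illustrative path lengths (not measured routes).
print(f"far server  (15,000 km): {min_round_trip_ms(15_000):.0f} ms")
print(f"near server  (1,000 km): {min_round_trip_ms(1_000):.0f} ms")
```

Under these assumptions, the distant server can never answer in less than about 150 ms, while the nearby one could answer in about 10 ms. A single web page often needs many round trips, so the difference adds up.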
Getting the request to the closest server possible is the first step of the process. However, bear in mind that companies often optimize their CDNs in other ways, such as redirecting to a server that is more economical to run or one that is sitting idle.
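The "closest server" step can be sketched as follows. This is a simplified illustration, not how any particular CDN does it: the edge locations and their coordinates are hypothetical, and the code assumes the requester's approximate latitude and longitude have already been worked out from the resolver's IP address. It picks the edge with the smallest great-circle distance and ignores the cost and load factors mentioned above.

```python
import math

# Hypothetical edge locations: name -> (latitude, longitude) in degrees.
EDGES = {
    "us-east": (39.04, -77.49),    # assumed: near Ashburn, Virginia
    "us-west": (37.34, -121.89),   # assumed: near San Jose, California
    "asia-se": (1.35, 103.82),     # assumed: Singapore
}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def nearest_edge(lat, lon, edges=EDGES):
    """Return the name of the edge location closest to the given coordinates."""
    return min(edges, key=lambda name: haversine_km(lat, lon, *edges[name]))

print(nearest_edge(33.75, -84.39))    # a requester near Atlanta, Georgia
print(nearest_edge(34.05, -118.24))   # a requester near Los Angeles, California
```

With these made-up locations, the Georgia request goes to the East coast edge and the California request to the West coast edge, matching the example in the text.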
Accessing Content from a CDN
When you visit a URL of a website served through a CDN, your browser sends a request to the edge server for content. The edge server first checks its cache to see if the content is present. If the content is present in the cache and the cache entry hasn’t expired, the content is served directly from this edge server to the end user.
Now, if the content is not in the cache or the cache entry has expired, then the edge server will make a request to the origin server to retrieve the content. The origin server is the original repository of content and is capable of serving all types of content available on the CDN. Once the origin server sends a response, the edge server stores the received content in the cache if the HTTP headers of the response (such as Cache-Control) allow it.
It must be noted that an edge server is a proxy, and it is the origin server that tells the edge server exactly what content should be returned for a particular request. The origin server typically runs a full web application stack (Java, Ruby, Node.js, etc.), so it can generate responses in whatever way the site requires. The edge server, by contrast, does nothing but make requests and serve content to the end users.
A CDN is essentially a cache: it is valuable only as long as it can serve data directly, without repeatedly contacting the origin server. If an edge server had to contact the origin server for every request, the CDN would lose its purpose and worth.
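The hit-or-miss logic described above can be sketched as a small Python class. This is a minimal sketch under stated assumptions, not a real CDN implementation: `EdgeCache`, `parse_max_age`, and the `origin` function are hypothetical names, and the only header rules handled are the `max-age` and `no-store` directives of `Cache-Control`. Real edge servers honor many more rules.

```python
import time

def parse_max_age(cache_control):
    """Return the max-age in seconds from a Cache-Control value, or 0 if not cacheable."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives:
        return 0
    for directive in directives:
        name, _, value = directive.partition("=")
        if name == "max-age" and value.isdigit():
            return int(value)
    return 0

class EdgeCache:
    """Toy edge server: serves from cache when fresh, otherwise asks the origin."""

    def __init__(self, fetch_from_origin, clock=time.monotonic):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> (body, headers)
        self.clock = clock
        self.store = {}  # url -> (body, expires_at)

    def get(self, url):
        entry = self.store.get(url)
        if entry is not None and self.clock() < entry[1]:
            return entry[0], "HIT"            # present and not expired
        body, headers = self.fetch_from_origin(url)
        max_age = parse_max_age(headers.get("Cache-Control", ""))
        if max_age > 0:
            self.store[url] = (body, self.clock() + max_age)
        else:
            self.store.pop(url, None)         # origin says: do not cache
        return body, "MISS"

# Usage with a fake origin and a controllable clock.
def origin(url):
    return "<html>hello</html>", {"Cache-Control": "max-age=60"}

now = [0.0]
edge = EdgeCache(origin, clock=lambda: now[0])
print(edge.get("/index.html")[1])  # first request: not cached yet
print(edge.get("/index.html")[1])  # second request: served from cache
now[0] = 61.0                      # 61 seconds later the entry has expired
print(edge.get("/index.html")[1])
```

The three requests print MISS, HIT, and MISS in that order: the first fills the cache, the second is served from it, and the third arrives after the 60-second lifetime has run out.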
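A quick calculation shows why this matters. The function below is a simple sketch with illustrative numbers (the request counts and hit ratios are assumptions, not measurements): it estimates how many requests still reach the origin server for a given cache hit ratio.

```python
def origin_requests(total_requests, hit_ratio):
    """Estimate how many requests reach the origin for a given cache hit ratio (0 to 1)."""
    if not 0 <= hit_ratio <= 1:
        raise ValueError("hit_ratio must be between 0 and 1")
    return round(total_requests * (1 - hit_ratio))

# Illustrative numbers only.
for ratio in (0.0, 0.5, 0.95):
    print(f"hit ratio {ratio:.0%}: {origin_requests(1000, ratio)} of 1000 requests reach the origin")
```

At a hit ratio of 0%, every one of the 1000 requests goes to the origin, which is the situation in which the CDN adds no value. At 95%, only 50 do.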
The Future of CDN
With improved internet penetration and rising bandwidth, people are consuming more video content for longer periods of time and on more devices, making CDNs more relevant every single day. Currently, Akamai (US), Cloudflare (US), Amazon Web Services (US), Internap (US), CDNetworks (Korea), and Tata Communications (India and Singapore) are industry leaders when it comes to CDNs. According to recent business research, the CDN market is expected to grow at a CAGR of 32% over the next 2-3 years. In other words, a sound understanding of CDNs and their use will help website owners significantly improve performance for their visitors.