What happens when you search for a URL in your browser?

Francisco Guzmán Herrera
10 min read · Jan 9, 2021

In this post we will look in detail at what happens after you type a URL into a browser and press Enter.

Before we start, let’s quickly look at what a URL is made of.

The HTTPS at the beginning refers to the protocol that your computer will use to connect. HTTP stands for Hypertext Transfer Protocol and the S stands for Secure. HTTP and HTTPS are the same protocol, except that HTTPS is the one with the padlock, like the one on this page. The padlock means that all the information that travels from your computer to the server and vice versa is encrypted. In a few words, your information is protected in transit from anyone trying to read it along the way.

Then follows the subdomain, which is basically a sub-classification that serves to organize different sections of a website. For example, the subdomain info.medium.com only provides information about the site, while the subdomain www.medium.com shows the home page.

Then follow the domain and its extension, which together form the unique name of a website.

The extension (also called the top-level domain) tells us what type the website is or its geographic location, for example:

Types of websites:

  • .com is for commercial or content websites.
  • .org is for non-profit organizations.
  • .info is for websites that only provide information (do not confuse it with the subdomain info. in the example above).

Geographic location:

  • .co for websites in Colombia.
  • .us for websites in the United States.
  • .eu for websites in the European Union.

Now let’s enter the following URL and start the journey: www.youtube.com. By the way, URL stands for Uniform Resource Locator.
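The parts of a URL described above can be pulled apart with a few lines of Python. This is only a minimal sketch: the function name describe_url is made up for this post, and it naively assumes that the extension is the last label of the host name and the domain is the one before it, which does not hold for extensions such as .co.uk.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into protocol, subdomain, domain and extension."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # Naive assumption: last label = extension, the one before it = domain.
    return {
        "protocol": parts.scheme,
        "subdomain": ".".join(labels[:-2]),
        "domain": labels[-2],
        "extension": "." + labels[-1],
    }


print(describe_url("https://www.youtube.com"))
# {'protocol': 'https', 'subdomain': 'www', 'domain': 'youtube', 'extension': '.com'}
```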

The Journey

When you press Enter after typing the URL, the first thing your browser does is check whether it has the IP address of that domain in its cache. If it does not, it asks the operating system, which also searches its own cache. If neither of them finds the IP address, the operating system calls the DNS.

DNS

The Domain Name System (DNS) is a system that associates domain names with IP addresses. Basically, it is the one that translates the domain name in a URL into an IP address. It works this way because humans are terrible at remembering numbers but much better at remembering names, so it is easier for us to remember youtube.com than 154.85.45.7. Let’s continue.

Once DNS has returned the IP address to the operating system, the operating system passes it to the browser, and they both cache it so they don’t have to ask DNS again.
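The lookup chain (browser cache, then operating system cache, then DNS) can be sketched as two caches stacked on top of a lookup function. The class CachingResolver and the function fake_dns are hypothetical names invented for this illustration. The fake lookup returns the example address used in this post so the sketch runs without a network; a real lookup could call socket.gethostbyname instead.

```python
class CachingResolver:
    """Answer from the cache when possible, otherwise ask the next level."""

    def __init__(self, lookup):
        self.cache = {}
        self.lookup = lookup  # function to call on a cache miss

    def resolve(self, domain: str) -> str:
        if domain not in self.cache:
            self.cache[domain] = self.lookup(domain)
        return self.cache[domain]


dns_queries = []


def fake_dns(domain: str) -> str:
    # Stand-in for a real DNS query (assumption for this sketch).
    dns_queries.append(domain)
    return "154.85.45.7"


operating_system = CachingResolver(fake_dns)
browser = CachingResolver(operating_system.resolve)

browser.resolve("youtube.com")  # miss in both caches, DNS is asked
browser.resolve("youtube.com")  # answered from the browser cache
print(len(dns_queries))  # 1
```

Real caches also forget an entry after a while, which this sketch leaves out.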

Now that the browser has the IP address that the operating system gave it, it can connect over the Internet to the server at that address using the TCP/IP protocols.

TCP/IP

TCP is a protocol that allows the transmission of information between two computers, in our case the youtube.com server and our computer.

This protocol is characterized by being slower but reliable. It needs to establish a connection and send more information in its header to keep that connection stable and avoid losing information along the way. The UDP protocol, by contrast, is much faster, since it does not need to establish a connection and its header is smaller.

[Figure: TCP header]

[Figure: UDP header]

We see that the difference in size is significant: a TCP header takes at least 20 bytes, while a UDP header always takes 8 bytes.
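To put numbers on that difference, here is a small sketch that describes the fixed fields of both headers with Python's struct module and measures them. The field lists follow the standard header layouts, the TCP one is the minimal header without options, and the names UDP_HEADER and TCP_HEADER are just labels chosen for this example.

```python
import struct

# UDP header: source port, destination port, length, checksum (16 bits each).
UDP_HEADER = struct.Struct("!HHHH")

# Minimal TCP header (no options): source port, destination port,
# sequence number, acknowledgment number, data offset + flags,
# window size, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

print(UDP_HEADER.size, TCP_HEADER.size)  # 8 20
```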

IP (Internet Protocol)

An IP (Internet Protocol) address is a combination of 4 numbers between 0 and 255 separated by dots that identifies a computer on the Internet. This scheme (IPv4) is already falling short because of the number of computers and devices in the world, which is why IPv6, basically version 6 of IP, is starting to be used.

IPv4 only allows about 4 billion addresses, while IPv6 allows about 340 undecillion (roughly 3.4 × 10^38). It seems that they are not going to run out for now. An IPv6 address normally looks like this:

2001:fd23:56a2:1231:0012:ad11:907b:7490
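These numbers are easy to check with Python's standard ipaddress module. A quick sketch, using the two example addresses from this post (they are illustrations, not the real addresses of any site):

```python
import ipaddress

v4 = ipaddress.ip_address("154.85.45.7")
v6 = ipaddress.ip_address("2001:fd23:56a2:1231:0012:ad11:907b:7490")
print(v4.version, v6.version)  # 4 6

# An IPv4 address has 32 bits and an IPv6 address has 128 bits.
print(2 ** v4.max_prefixlen)  # 4294967296
print(2 ** v6.max_prefixlen)  # 340282366920938463463374607431768211456
```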

And now that we know this, our browser can finally make a request to the server with the HTTPS protocol, but first the request must get through the firewall and the server must give us an SSL key.

SSL Certificate

The S at the end of HTTPS, which tells us that the site is secure, is there thanks to the SSL certificate (the protocol is now officially called TLS, although the name SSL has stuck). It relies on a data encryption system with two keys: one is held only by the server (the decryption key, or private key) and the other is handed out by the server to anyone who connects (the encryption key, or public key).

The encryption key, as its name indicates, is the one that encrypts the information that we send to the server, while the decryption key is the one that decrypts it. This means that in order to read the information that travels from our computer to the server, you need the decryption key, which only the server has. In practice, the key pair is used at the start of the connection to agree on a shared session key, and that session key encrypts the rest of the traffic, but the idea is the same.

This means that if someone manages to intercept this information, they will not be able to understand it unless they have the decryption key.
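The two-key idea can be illustrated with a toy version of RSA, one of the classic public-key algorithms. This is only a sketch with tiny textbook numbers chosen for readability. It is not how a real certificate is used and it is completely insecure, since real keys are built from primes hundreds of digits long.

```python
# Toy RSA key pair built from two tiny primes (assumed example values).
p, q = 61, 53
n = p * q                 # shared by both keys
phi = (p - 1) * (q - 1)
e = 17                    # public exponent: the encryption key is (e, n)
d = pow(e, -1, phi)       # private exponent: the decryption key is (d, n)
# pow with exponent -1 needs Python 3.8 or newer.


def encrypt(message: int) -> int:
    """Anyone can do this with the public key."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key can do this."""
    return pow(ciphertext, d, n)


secret = 65
ciphertext = encrypt(secret)
print(ciphertext != secret, decrypt(ciphertext) == secret)  # True True
```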

Firewall

Before this happens, our HTTPS request reaches the firewall, a system that controls the connections that enter and leave a computer. With a firewall you can block, port by port, any connection that you do not want to enter your computer.

You can define rules, for example that connections may only be established on port 443, which is the default port for HTTPS. On the server this port is obviously open so that our request can enter, and since the TCP connection that carries the request uses the same port, it also successfully passes the dreaded firewall.

After crossing the firewall barrier, the request reaches the load balancer.
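A rule like that can be pictured as an allow list that every incoming connection is checked against. The sketch below is a simplification with made-up names (Rule, ALLOWED, is_allowed). Real firewalls are configured with their own tools and can also match on addresses, direction and connection state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str
    port: int


# Hypothetical rule set: only HTTPS traffic (TCP on port 443) may enter.
ALLOWED = {Rule("tcp", 443)}


def is_allowed(protocol: str, port: int, rules=ALLOWED) -> bool:
    """Return True if an incoming connection matches an allow rule."""
    return Rule(protocol, port) in rules


print(is_allowed("tcp", 443))  # True
print(is_allowed("tcp", 23))   # False
```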

Load Balancer

The load balancer is a process that is responsible for efficiently distributing incoming network traffic among several servers. It can be software running on a server or a hardware device dedicated to load balancing.

Using a load balancer has many benefits, like these:

  • Reduced workload on each individual server.
  • More work done in the same time thanks to concurrency.
  • Increased performance of your application because of faster responses.
  • No single point of failure. In a load-balanced environment, if a server crashes the application is still up and served by the other servers in the cluster.
  • When an appropriate load balancing algorithm is used, resources are utilized optimally and efficiently, as it eliminates the scenario where some servers’ resources are used more than others’.
  • Scalability: we can increase or decrease the number of servers on the fly without bringing down the application.
  • Load balancing increases the reliability of your enterprise application.
  • Increased security, as the physical servers and their IPs are hidden behind the load balancer in certain cases.

Load Balancer Algorithms

Load balancers generally distribute load using algorithms chosen according to the system and the servers that are available. Some of these algorithms are:

  • Round robin
    A batch of servers are programmed to handle load in a rotating sequential manner. The algorithm assumes that each device is able to process the same number of requests and isn’t able to account for active connections.
  • Weighted round robin
    Servers are rated based on the relative amount of requests each is able to process. Those having higher capacities are sent more requests.
  • Least connections
    Requests are sent to the server having the fewest number of active connections, assuming all connections generate an equal amount of server load.
  • Weighted least connections
    Servers are rated based on their processing capabilities. Load is distributed according to both the relative capacity of the servers and the number of active connections on each one.
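The four algorithms above fit in a few lines each. What follows is a minimal sketch rather than what any particular load balancer does: the function names and the server names "a", "b" and "c" are invented, weights are assumed to be positive whole numbers, and the weighted least connections version picks the lowest ratio of active connections to weight, which is one common way to combine the two.

```python
from itertools import cycle


def round_robin(servers):
    """Hand out servers in a fixed rotating order."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Repeat each server in the rotation as many times as its weight."""
    return cycle([s for s, w in weights.items() for _ in range(w)])


def least_connections(active):
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def weighted_least_connections(active, weights):
    """Pick the server with the fewest connections relative to its capacity."""
    return min(active, key=lambda s: active[s] / weights[s])


rr = round_robin(["a", "b", "c"])
print([next(rr) for _ in range(4)])  # ['a', 'b', 'c', 'a']

wrr = weighted_round_robin({"a": 2, "b": 1})
print([next(wrr) for _ in range(4)])  # ['a', 'a', 'b', 'a']

print(least_connections({"a": 5, "b": 2, "c": 7}))  # b
print(weighted_least_connections({"a": 5, "b": 2}, {"a": 10, "b": 1}))  # a
```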

And finally some load balancer software examples.

Software Load Balancer Examples

The following are a few examples of software load balancers:

  1. HAProxy: a TCP and HTTP load balancer.
  2. NGINX: an HTTP load balancer with SSL termination support.
  3. mod_athena: an Apache-based HTTP load balancer.
  4. Varnish: a reverse-proxy-based load balancer.
  5. Balance: an open source TCP load balancer.
  6. LVS: Linux Virtual Server, offering layer 4 load balancing.

Servers

Servers are physical or virtual computers that provide services to other computers. They are usually located in data centers, where they are well cared for, since they must be on all day, every day.

Web Servers

A web server is software that lives on the server and is in charge of receiving and responding to HTTP requests, so it is in charge of finding the web page that we are requesting.

It is also responsible for returning an error if it cannot respond to a request correctly. For example, when it cannot find the requested address on a site, it returns the 404 error, which tells the browser that it could not find the path specified in the URL.
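The core decision a web server makes can be sketched as a function that maps a path to a status code and a body. The pages and the name handle_request are invented for this illustration, and a real web server does far more (headers, files on disk, logging, and so on).

```python
from http import HTTPStatus

# Hypothetical site content: path -> page.
PAGES = {
    "/": "<h1>Home</h1>",
    "/about": "<h1>About</h1>",
}


def handle_request(path: str):
    """Return (status, body) for a requested path."""
    if path in PAGES:
        return HTTPStatus.OK, PAGES[path]
    # The path does not exist on this site: answer with a 404.
    return HTTPStatus.NOT_FOUND, "<h1>404 Not Found</h1>"


status, body = handle_request("/missing")
print(int(status), status.phrase)  # 404 Not Found
```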

Application server

The application server is a server specifically designed to run applications.

It runs the necessary programs on the server and organizes the information in an HTML file so that the web server can return it. To do this, it can look up information in the database if that is necessary to satisfy the request.

For example, when we request a list with the most prominent videos, the application server searches the database for this information and arranges it within an HTML file so that the web server can send us the page with the list of videos.
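Here is a rough sketch of that video list example, using an in-memory SQLite database from Python's standard library. The videos table, its rows and the function render_top_videos are all made up for the illustration and say nothing about how YouTube really works.

```python
import sqlite3
from html import escape

# Hypothetical database with a few videos.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE videos (title TEXT, views INTEGER)")
db.executemany(
    "INSERT INTO videos VALUES (?, ?)",
    [("Cats", 900), ("Dogs", 1500), ("Birds", 300)],
)


def render_top_videos(conn, limit: int = 2) -> str:
    """Query the most viewed videos and arrange them in an HTML page."""
    rows = conn.execute(
        "SELECT title, views FROM videos ORDER BY views DESC LIMIT ?",
        (limit,),
    ).fetchall()
    items = "".join(
        f"<li>{escape(title)} ({views} views)</li>" for title, views in rows
    )
    return f"<html><body><ul>{items}</ul></body></html>"


print(render_top_videos(db))
# <html><body><ul><li>Dogs (1500 views)</li><li>Cats (900 views)</li></ul></body></html>
```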

Database

A database is an organized collection of structured information, or data, typically stored electronically in a computer system. A database is usually controlled by a database management system (DBMS). Together, the data and the DBMS, along with the applications that are associated with them, are referred to as a database system, often shortened to just database.

The database in the web infrastructure
It is responsible for holding all the data needed to respond to an HTTP request. Databases are generally replicated across servers, and they can live on the same server as the web application or on a separate server.

End of the Journey

And so this trip ends, from the moment you press Enter until you receive your web page and can see it on your computer. I hope it has been useful for you. Below I leave the bibliography in case you want more information.

Bibliography

https://www.infoworld.com/article/2077354/app-server-web-server-what-s-the-difference.html


Written by Francisco Guzmán Herrera

Software Engineer student at HolbertonSchool
