Load Balancing

Jul 8, 2020

This is the first article in the series: Fundamental Concepts in System Design.

The Problem

Suppose you decided to build a web application based on your million-dollar idea. You managed to get a server to host it. Fortunately, the application started gaining popularity and attracted a lot of users.

Even though the story above looks happy, it has a serious consequence if not handled properly. To understand it better, think of the server as a human (let’s say X) and suppose the purpose of the application is to tell whether a given number is even or odd. Under normal conditions, X can give you the answer within seconds. Now suppose X got popular and millions of people came to him with the question. To serve all these requests, you brought in two more experts. But that didn’t quite solve the problem. The result was a huge mess, as there was no one to manage the crowd and direct each person to the right expert.

Role of a load balancer

Fundamentally, a load balancer serves the same purpose as the crowd controller in the above scenario. It sits between the clients and the servers and distributes incoming requests among a set of servers so that the response time is minimal. This job is what we call load balancing.

To understand why load balancing is required, it is essential to understand the two undesirable states of a system.

Overloaded state: Any task reaching a machine uses its resources. By resources, we mean the memory and the computing power of the machine. When too many tasks arrive at a machine, it runs out of resources and can no longer handle incoming tasks. A machine in this condition is said to be in the overloaded state.

Underloaded state: A machine is said to be in the underloaded state if its computing capability far exceeds what the number and complexity of its assigned tasks require.

The primary task of a load balancer is to shift work away from overloaded machines and toward underloaded ones.
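
To make the two states concrete, here is a minimal sketch in Python. The thresholds (85% and 30%), the `Machine` class, and the "pick the least utilized machine" rule are all assumptions made for illustration; real load balancers use more signals than a single utilization number.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
OVERLOADED_ABOVE = 0.85
UNDERLOADED_BELOW = 0.30

@dataclass
class Machine:
    name: str
    capacity: float      # work units the machine can handle at once
    load: float = 0.0    # work units currently assigned

    @property
    def utilization(self) -> float:
        return self.load / self.capacity

def state(machine: Machine) -> str:
    """Classify a machine by how much of its capacity is in use."""
    if machine.utilization > OVERLOADED_ABOVE:
        return "overloaded"
    if machine.utilization < UNDERLOADED_BELOW:
        return "underloaded"
    return "normal"

def assign(task_cost: float, machines: list[Machine]) -> Machine:
    """Send the task to the machine with the lowest utilization."""
    target = min(machines, key=lambda m: m.utilization)
    target.load += task_cost
    return target

machines = [Machine("a", capacity=10, load=9), Machine("b", capacity=10, load=1)]
states_before = [state(m) for m in machines]   # a is overloaded, b is underloaded
chosen = assign(2, machines)                   # the new task goes to b
```

Here the new task lands on machine `b`, the underloaded one, instead of piling onto `a`.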

A load balancer can take up any position within a web application:

between the user and web server
between the web server and the backend application server
between the backend application server and the database server

In addition to distributing tasks across servers, a load balancer performs many other functions. It runs constant health checks on the servers to keep an up-to-date list of the ones that are available, so that requests are not forwarded to servers that are down. It also keeps track of the current load on each server. Load balancers are also capable of SSL/TLS offloading, caching, compression, and intrusion detection.
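
The health check idea can be sketched as follows. This is only an illustration: the `Pool` class and the `probe` callback are hypothetical names, the probe stands in for a real network check, and the simple rotation used to pick a server is just one possible choice.

```python
from itertools import count
from typing import Callable

class Pool:
    def __init__(self, servers: list[str], probe: Callable[[str], bool]):
        self.servers = servers
        self.probe = probe                    # hypothetical: True if the server responds
        self.healthy: list[str] = list(servers)
        self._counter = count()

    def run_health_checks(self) -> None:
        """Rebuild the list of available servers from fresh probe results."""
        self.healthy = [s for s in self.servers if self.probe(s)]

    def route(self) -> str:
        """Pick the next healthy server in rotation; never a server that is down."""
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        return self.healthy[next(self._counter) % len(self.healthy)]

# Pretend app-2 has crashed; the probe reports it as unreachable.
down = {"app-2"}
pool = Pool(["app-1", "app-2", "app-3"], probe=lambda s: s not in down)
pool.run_health_checks()
routed = [pool.route() for _ in range(4)]
```

After the health check runs, requests rotate between `app-1` and `app-3` only. In a real deployment the check would run periodically so that a recovered server rejoins the pool.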

Why does load imbalance occur?

The key concept here is that all requests coming from the users can’t be processed by a single server. The requests need to be routed to multiple servers to have a scalable system.

In this case, we need to take into account the heterogeneous nature of the servers. Each server varies in the amount of computing power it possesses. So do the user tasks: each task varies in the amount of resources it requires. Because of this heterogeneity, lower-powered units should receive fewer requests, or requests that need less computation, while higher-powered units should receive more requests or heavier ones.

Traffic to the servers is also unpredictable and probabilistic, which leads to load imbalance. A typical example is the traffic surge an e-commerce platform sees during a sale. When this occurs, if the number of computing units is fixed, each of them ends up overloaded. Modern load balancers are capable of reacting to an increase or decrease in traffic by scaling the number of servers behind them up or down.
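
One simple way to respect server heterogeneity is to split requests in proportion to each server's capacity. The sketch below assumes capacity can be expressed as a single positive integer per server; the function name `share_requests` and the server names are made up for the example.

```python
def share_requests(total: int, capacities: dict[str, int]) -> dict[str, int]:
    """Split `total` requests across servers in proportion to their capacity."""
    total_capacity = sum(capacities.values())
    shares = {name: total * cap // total_capacity
              for name, cap in capacities.items()}
    # Integer division can leave a few requests unassigned;
    # hand those to the most powerful servers first.
    leftover = total - sum(shares.values())
    for name in sorted(capacities, key=capacities.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares

shares = share_requests(100, {"small": 1, "medium": 2, "large": 5})
# shares == {"small": 12, "medium": 25, "large": 63}
```

The large server has five times the capacity of the small one and receives roughly five times as many requests. This only accounts for the servers; accounting for the varying cost of each task would need an estimate of that cost as well.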
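
The scaling decision can be sketched as a small calculation. The numbers here are assumptions: that each server handles a known number of requests per second (`per_server_rps`), and that the fleet size is kept between a hypothetical minimum and maximum.

```python
import math

def desired_servers(requests_per_sec: float, per_server_rps: float,
                    min_servers: int = 1, max_servers: int = 20) -> int:
    """How many servers are needed for the current traffic, within fixed bounds."""
    needed = math.ceil(requests_per_sec / per_server_rps)
    return max(min_servers, min(max_servers, needed))

normal_day = desired_servers(250, per_server_rps=100)    # 3 servers
sale_day = desired_servers(5000, per_server_rps=100)     # capped at 20
quiet_night = desired_servers(0, per_server_rps=100)     # never below 1
```

With a fixed fleet of three servers, the sale-day traffic would overload every one of them; letting the count follow the traffic avoids that, up to the configured maximum.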

Load Balancing algorithms

The task of a load balancer is highly complicated. Many algorithms have been developed that enable the load balancer to find a good mapping between the nodes and the tasks. Whatever their approach, all load balancing algorithms try to optimize factors such as response time, execution time, execution cost, throughput, fault tolerance, task migration time, resource utilization, power, and energy consumption.

The different load balancing algorithms will be covered in detail in an upcoming article.

Load Balancer Cluster

Even though a load balancer can turn out to be one of the heroes in a scalable system, we need to make sure that it does not become a single point of failure. To solve this, we can have multiple load balancers form a cluster. Each load balancer then conducts health checks on the others, and if one of them fails, another takes its place so that the system stays available.

Load balancers increase the availability and responsiveness of an application by leaps and bounds. They are so essential that most large-scale modern applications depend on them. On top of that, proper load balancing can reduce energy consumption and carbon emissions and thereby help achieve Green Computing!
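
A rough sketch of the failover idea, assuming the balancers exchange heartbeats and the first live balancer in a fixed priority order handles the traffic. The timeout value, the `Balancer` class, and the names `lb-1` and `lb-2` are all hypothetical; real clusters use more careful protocols to agree on who is active.

```python
from dataclasses import dataclass

HEARTBEAT_TIMEOUT = 3.0  # seconds; hypothetical value

@dataclass
class Balancer:
    name: str
    last_heartbeat: float  # time at which the peers last heard from this balancer

def is_alive(balancer: Balancer, now: float) -> bool:
    """A balancer counts as alive if its last heartbeat is recent enough."""
    return now - balancer.last_heartbeat <= HEARTBEAT_TIMEOUT

def active_balancer(cluster: list[Balancer], now: float) -> Balancer:
    """Return the first live balancer in priority order."""
    for balancer in cluster:
        if is_alive(balancer, now):
            return balancer
    raise RuntimeError("all load balancers are down")

cluster = [Balancer("lb-1", last_heartbeat=10.0),
           Balancer("lb-2", last_heartbeat=11.0)]
before = active_balancer(cluster, now=12.0)   # lb-1 was heard from 2s ago

# lb-1 goes silent, while lb-2 keeps sending heartbeats.
cluster[1].last_heartbeat = 19.5
after = active_balancer(cluster, now=20.0)    # lb-1 has been silent for 10s
```

Once `lb-1` stops responding, `lb-2` takes over without any change on the client side.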

Stay tuned for the upcoming articles on System Design Concepts.

Written by Maria Zacharia
