Load Balancer
Let’s say we have an app deployed on a single server that serves all the incoming requests from users. As the user base or traffic grows, a single server won’t be able to handle the load, and we will need scaling.
When we have a single-server setup, we can face issues like:
- The server getting overloaded with requests.
- Server failure causing unavailability.
For scalability we can either do vertical scaling, where we add more computing resources to the server, like more RAM, storage, and CPU power. But beyond a point we won’t be able to add more resources because of hardware constraints, and that is where horizontal scaling comes in: adding more server instances to handle the incoming traffic.
In horizontal scaling we will have more than one server and distribute the incoming traffic across them. Here we want to spread the incoming traffic evenly across all the servers so that no single server gets overloaded, and that is where we need a load balancer.
What is a load balancer?
A load balancer distributes incoming traffic across multiple servers to maximize performance and ensure system availability and scalability.
Why do we need load balancers?
- Scalability: Distributes traffic among multiple backend servers.
- High Availability & Fault Tolerance: Automatically routes traffic away from unhealthy servers (see the health-check sketch after this list).
- Improved Performance: Reduces latency by directing traffic efficiently.
- Zero Downtime Deployments: Smooth traffic shifting during updates.
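To illustrate the fault-tolerance point, here is a rough sketch in Go of how a load balancer might actively detect unhealthy servers by polling them. It assumes each backend exposes a hypothetical /healthz endpoint (the addresses are placeholders) that returns HTTP 200 when healthy; real load balancers typically combine active checks like this with passive ones, such as observing failed requests.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Hypothetical backend health endpoints; replace with your real servers.
var backends = []string{
	"http://localhost:9001/healthz",
	"http://localhost:9002/healthz",
}

// healthy records which backends passed their most recent check.
// The request-routing code would skip backends marked false here.
var healthy = make(map[string]bool)

// checkBackends probes every backend once and updates the healthy map.
func checkBackends() {
	client := &http.Client{Timeout: 2 * time.Second}
	for _, b := range backends {
		resp, err := client.Get(b)
		// A backend counts as healthy only on a 200 response.
		ok := err == nil && resp.StatusCode == http.StatusOK
		if resp != nil {
			resp.Body.Close()
		}
		healthy[b] = ok
		fmt.Printf("%s healthy=%v\n", b, ok)
	}
}

func main() {
	// Poll every 10 seconds; the interval is an arbitrary choice.
	for range time.Tick(10 * time.Second) {
		checkBackends()
	}
}
```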
How do load balancers work?
- A client sends a request.
- The load balancer receives the request.
- It selects a backend server based on a load balancing algorithm.
- It forwards the request to the selected server.
- The server responds to the client (directly or via the load balancer).
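To make these steps concrete, here is a minimal sketch of a round-robin load balancer in Go, built on the standard library’s httputil.ReverseProxy. The backend addresses and the :8080 listen port are placeholder assumptions, and round robin is just one possible selection algorithm.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

// Hypothetical backend addresses; replace with your real servers.
var backends = []*url.URL{
	mustParse("http://localhost:9001"),
	mustParse("http://localhost:9002"),
	mustParse("http://localhost:9003"),
}

// counter drives the round-robin selection.
var counter uint64

func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	return u
}

func main() {
	proxy := &httputil.ReverseProxy{
		// Director rewrites each incoming request to target the
		// backend chosen by the round-robin counter, then the
		// proxy forwards it and relays the response to the client.
		Director: func(req *http.Request) {
			i := atomic.AddUint64(&counter, 1) % uint64(len(backends))
			target := backends[i]
			req.URL.Scheme = target.Scheme
			req.URL.Host = target.Host
		},
	}
	// The load balancer itself listens on :8080.
	log.Fatal(http.ListenAndServe(":8080", proxy))
}
```

Round robin is simple and works well when the backends have similar capacity; when they don’t, weighted round robin or least-connections selection are common alternatives.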