Multiple servers
Learning objectives
- You have an idea of the magnitude of the number of requests that the largest websites in the world receive.
- You know how to create multiple instances of a Docker service by defining replicas.
- You know how to define port ranges in Docker Compose.
Distributing workload to multiple servers
Sooner or later, when scaling web software to meet a growing demand, there is a need for horizontal scaling. As you might remember, horizontal scaling refers to adding more servers to handle the workload. As a classic example of horizontal scaling, the following frame contains a snapshot of NASA's website from late 1996, when people were actively looking for information about the Mars Pathfinder mission.
To meet a high demand, NASA's website provided a list of links to servers, where each server hosted a copy of the actual NASA website (i.e. a mirror of the original site). Jointly, the listed servers could handle slightly more than 80 million hits per day, where the largest individual server at Silicon Graphics could handle approximately 20 million hits per day. The term "hit" meant an individual page load, including loading resources on the page (e.g. images). When averaged out, the 80 million hits per day corresponds to approximately 926 hits per second.
Traffic on modern websites? 🤔
The company Similarweb offers insight into the traffic that websites receive. Based on their data, the most popular website in the world in December 2022 was google.com, which received approximately 86.4 billion visits during that month. The second most popular website was youtube.com with some 34.6 billion visits, and the third most popular was facebook.com with some 18.1 billion visits.
Visits to modern websites are rather different from visits to websites in the 1990s. In the 1990s, when servers were mainly responsible for delivering static data, a page hit consisted of loading a page and its contents. With modern websites, a single page visit may consist of hundreds of requests, and modern web applications also offer dynamic functionality such as streaming search results, media, or other types of content to the users. When considering the terms "visit" and "hit", a single visit may include traversing multiple pages, where each page load could be considered a "hit" in the traditional terminology.
In practice, even a thousandfold increase in the number of servers similar to those used to handle the 80 million hits per day in the 1990s would not have been enough to handle the 86.4 billion visits per month that Google received in December 2022.
More servers with Docker Compose
In the next parts, we'll look into increasing the number of servers in a Docker application. In what follows, we refer to servers and containers synonymously, treating containers as virtualized server instances. To try out the following example, download the walking skeleton project for Web Software Development.
When we take a look at the walking skeleton, we observe that the definition for running the service `app` in `docker-compose.yml` is as follows.
# ... additional content
  app:
    build: app
    image: app
    restart: "no"
    volumes:
      - ./app/:/app
      - ./app-cache/:/app-cache
    ports:
      - 7777:7777
    depends_on:
      - database
      - flyway
    env_file:
      - project.env
# ... additional content
The folder `app` -- relative to the `docker-compose.yml` file -- is mounted as a volume into the container running the service `app`, as is the folder `app-cache`. The definition also contains a port mapping: the host port `7777` is used to expose the port `7777` from within the service.
Docker Compose comes with configuration related to the deployment and running of services. Using the `deploy` configuration, we can specify the number of container replicas for a service. Let's adjust the configuration of the service `app` to match the following. That is, using the `deploy` configuration, we state that the service `app` has a single replica.
# ... additional content
  app:
    build: app
    image: app
    restart: "no"
    volumes:
      - ./app/:/app
      - ./app-cache/:/app-cache
    ports:
      - 7777:7777
    depends_on:
      - database
      - flyway
    env_file:
      - project.env
    deploy:
      replicas: 1
# ... additional content
When we run the command `docker compose up`, we see that the application starts up as expected and responds to requests at port `7777`.
docker-compose vs docker compose
There are two variants of the command used to run Docker Compose. The command `docker-compose` refers to the older, Python-based version of Docker Compose (v1), while `docker compose` refers to the newer, Go-based version (v2).
For the `deploy` configuration, we'll rely on the newer version.
Now, let's bump up the number of replicas to two and run the command `docker compose up` again.
# ... additional content
  app:
    build: app
    image: app
    restart: "no"
    volumes:
      - ./app/:/app
      - ./app-cache/:/app-cache
    ports:
      - 7777:7777
    depends_on:
      - database
      - flyway
    env_file:
      - project.env
    deploy:
      replicas: 2
# ... additional content
docker compose up
// ..
Error response from daemon: driver failed programming external connectivity
on endpoint wsd-walking-skeleton-app-2 (id): Bind for 0.0.0.0:7777 failed:
port is already allocated
When we try to run our application, we notice an error stating that the port `7777` is already allocated. Indeed, if we run `docker ps`, we see that there is a running container that has already taken the port `7777`.
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cc5f06b75076 app "/..." (time) Up (time) 0.0.0.0:7777->7777/tcp, :::7777->7777/tcp wsd-walking-skeleton-app-1
// ..
The error comes from our configuration explicitly mapping the service `app` to the port `7777`. In effect, both replicas try to claim the port `7777`, although it can be claimed by only one replica.
To overcome the issue, let's adjust the configuration so that we provide a range of ports that can be used. Using `"7777-7778:7777"`, we state that the host ports `7777` and `7778` can be used for exposing the port `7777` from a Docker container running a replica of the service `app`. With this, when launching multiple replicas, each replica attempts to take one port from the available range.
# ... additional content
  app:
    build: app
    image: app
    restart: "no"
    volumes:
      - ./app/:/app
      - ./app-cache/:/app-cache
    ports:
      - "7777-7778:7777"
    depends_on:
      - database
      - flyway
    env_file:
      - project.env
    deploy:
      replicas: 2
# ... additional content
Now, when we run `docker compose up`, we no longer see the error. Further, when we run `docker ps`, we see two containers running the `app` image, each mapped to a specific port.
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
064517f3200c app "/..." (time) Up (time) 0.0.0.0:7777->7777/tcp, :::7777->7777/tcp wsd-walking-skeleton-app-1
047999f06b6b app "/..." (time) Up (time) 0.0.0.0:7778->7777/tcp, :::7778->7777/tcp wsd-walking-skeleton-app-2
// ..
Indeed, curling either port `7777` or `7778` now provides a response.
curl localhost:7777
Hello world!%
curl localhost:7778
Hello world!%
With this, we have two instances of our application running. To verify that this holds, let's adjust our `app.js` a bit to create a random identifier that is used as a constant identifying the running application instance. In the example below, we use the Web Crypto API to generate a Universally Unique Identifier (UUID).
import { serve } from "./deps.js";
import { sql } from "./database.js";

const SERVER_ID = crypto.randomUUID();

const logNames = async () => {
  const result = await sql`SELECT * FROM names`;
  console.log(result);
};

const handleRequest = (request) => {
  console.log(`Request to ${request.url}`);
  logNames();
  return new Response(`${SERVER_ID}: Hello world!`);
};

console.log("Launching server on port 7777");
serve(handleRequest, { port: 7777 });
Now, when checking the application again (and restarting it if `--watch` does not work), we see that the responses indeed come from separate web servers.
curl localhost:7777
190cd656-68e4-4621-be61-ed4dbef412d4: Hello world!%
curl localhost:7777
190cd656-68e4-4621-be61-ed4dbef412d4: Hello world!%
curl localhost:7778
d1485cf2-78e5-49b0-a440-c51002e9c395: Hello world!%
curl localhost:7778
d1485cf2-78e5-49b0-a440-c51002e9c395: Hello world!%
curl localhost:7778
d1485cf2-78e5-49b0-a440-c51002e9c395: Hello world!%
Swarm mode for Docker
As the number of replicas increases, additional servers are needed to run them. Docker has a Swarm mode that allows managing a cluster of Docker Engines (i.e. a swarm, in Docker parlance). We'll look into this and related concepts when discussing container orchestration.
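As a rough preview, and assuming Docker is installed on the machine, deploying a compose file to a swarm could be sketched as follows; the stack name `wsd` is hypothetical, and the commands are shown for orientation rather than as a complete recipe.

```shell
# Turn the current Docker Engine into a single-node swarm manager
docker swarm init

# Deploy the compose file as a stack; the deploy.replicas setting
# now controls how many tasks the swarm scheduler runs
docker stack deploy --compose-file docker-compose.yml wsd

# List the services in the stack and their replica counts
docker service ls

# Scale the app service without editing the compose file
docker service scale wsd_app=4
```

Note that `docker stack deploy` does not build images, so the `app` image would need to be built beforehand, for example with `docker compose build`.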