Docker basics
Old notes about docker basic commands
Why use Docker?
It makes it really easy to run software without the trouble of setup and installation.
What is Docker?
Docker is an ecosystem of tools like:
- Docker client (docker cli)
- Docker server (docker daemon)
- Docker machine
- Docker images
- Docker compose
- Docker hub
The objective is to create and run containers.
When you run a program with Docker, what happens is:
- The Docker CLI reaches out to Docker Hub and downloads a single file called an Image
- An Image is a file containing all the dependencies and configs required to run a program
- This image gets stored on your hard drive
- You can use this image to run a container (an instance of an image). A container is like a running program
docker run hello-world
This means we want to start a new container using an image called hello-world. First, Docker searches for the hello-world image on our computer, inside the image cache. Since the cache is empty, it goes to Docker Hub, a free repository with images you can download. Docker then downloads the image to our computer and stores it in the image cache.
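To see the image cache in action (a small sketch, assuming hello-world has never been pulled on this machine):
docker run hello-world    # first run: the image is downloaded from Docker Hub, then a container runs it
docker images             # hello-world now shows up in the local image cache
docker run hello-world    # second run: no download happens, the cached image is reused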
But really, what is a container???
First, an overview of the OS in our computer:
- Kernel: a running software process that governs access between all running programs and all physical hardware. It connects these two parts (programs and hardware).
- The running programs (like Node.js, Chrome, Spotify…) interact with the Kernel through system calls, which are like functions the Kernel exposes: "if you want to write a file to the hard drive, call this function". The Kernel exposes these different "endpoints", or system calls, so your running processes can call them and do whatever they need.
Let's imagine we have two programs running on our machine: one needs Python 2 and the other needs Python 3, but our hard drive can only have Python 2 installed. Only one of the programs will run.
We can solve this problem with a feature called namespacing, which segments specific resources on our hard drive. On one side we can have Python 2 and on the other Python 3. The kernel will then redirect each system call to the right portion of the hard drive. We can isolate resources based on what the program is asking for.
Another feature related to namespacing is control groups (cgroups), which limit the amount of resources a process can use: memory, CPU, hard drive I/O, network…
So a container is a block containing a running process and the resources assigned to it: the program, the kernel, and the things needed to run your program, like RAM, network, CPU and hard drive. A container is not a physical thing, it is more abstract.
An Image is basically a filesystem snapshot, a copy of a very specific set of directories and files. We can have an image with Chrome and Python. An Image also has a start-up command. When we run the image, the kernel reserves a portion of the hard drive for the image; in this case, Chrome and Python would be there.
How Docker is running in your computer
Namespacing and control groups belong to the Linux OS. So how do macOS and Windows run Docker?
When you installed Docker, a Linux virtual machine was installed along with it. It is inside that VM that containers are created: a Linux kernel hosts the running processes inside containers and is responsible for limiting access to the hard drive, memory, and so on.
Docker run in detail
docker run <image name>
uses an image to create a container.
When we run docker run hello-world, there is an image file on our hard disk with a filesystem snapshot containing only one program: hello-world. When we run it, that snapshot is placed in the segment of the hard drive set aside for the container. Then, after the container is created, the startup command inside the image starts the running process, which calls the kernel, and the kernel talks to the needed resources.
Overwriting default commands
We can do: docker run <image name> <your command>
<your command> will override the default command inside the image.
In docker run busybox echo hi there, the command is echo hi there, and it will override the start command.
If you do docker run busybox ls
, you will see:
bin
dev
etc
home
proc
root
sys
tmp
usr
var
These folders are not your computer's folders. They come from the snapshot of the image and were placed on your hard drive, for your container.
These override commands (or startup commands) only work because the image we are using has a filesystem snapshot that contains the command's executable. So if you try docker run hello-world echo hi there, it will not work, because you tried to use echo inside the hello-world image's filesystem, and the only thing that exists inside the hello-world image is one single file that prints a hello world.
In docker run busybox, the ls command worked because the filesystem of the busybox image has the ls executable somewhere inside bin (or another directory).
Listing running containers
docker ps
Lists all the containers currently running on your machine.
docker ps --all
Lists all the containers we have run before, including stopped ones.
Container lifecycle
docker run
can be split into two different commands:
docker create
and docker start
.
docker create
is just about taking the filesystem snapshot from the image and setting it up in the new container.
docker start
is to execute the start command.
➜ docker create hello-world
f370e82d1c4980d68126ae708781ce0fc2d2a1a28e8809c97f62e3d6235c403b
➜ docker start -a f370e82d1c4980d68126ae708781ce0fc2d2a1a28e8809c97f62e3d6235c403b
The -a
will attach to the container and print that container's output.
Restarting Stopped Containers
When we do docker ps --all, we see the exited containers, and we can still use them again.
docker start -a <containerID>
We cannot replace the default command when restarting a stopped container this way; it will run its default command again.
Removing Stopped Containers
docker system prune
It will delete all stopped containers plus other things, like your build cache, which includes the images you've downloaded from Docker Hub.
Retrieving Log Outputs
If you forget to attach with -a and want to look at a container's output, you can use docker logs <containerID> to retrieve all the output from it.
docker logs
don’t restart the container, just retrieve the outputs.
Stopping containers
docker stop <containerID>
With stop, we send a SIGTERM (terminate) signal to the process inside the container, which then has some time to do cleanup.
docker kill <containerID>
Sends a SIGKILL signal: the process has to shut down right now, without any cleanup.
If you run a docker stop and the container don’t stop in 10 seconds, docker will fallback to docker kill.
Multi-Command Containers
For example, when we run docker run redis, we start the redis server inside Docker.
To interact with it, we have to use redis-cli, but it has to run inside that container.
We can do it with docker exec, which executes an additional command inside a running container.
docker exec -it <containerID> <command>
it
allows us to type input directly into the container.
<command>
is the command we want to execute inside the container.
➜ docker exec -it 8934b3cb6923 redis-cli
The purpose of the it flag
When you are running docker on your machine, every container is running inside a Linux VM.
Every process we create in a Linux environment has 3 different communication channels:
- STDIN
- STDOUT
- STDERR
These channels are used to communicate information into the process (STDIN) or out of the process (STDOUT).
STDIN is stuff you type. STDOUT is stuff that shows on your screen. STDERR is stuff that shows on your screen as an error.
The it
is two separate flags, -i
and -t
.
docker exec -i -t
is same as docker exec -it
-i
means we want to attach to the STDIN channel of this new exec'd process, so we can type things into it.
-t
is what displays the text nicely formatted (it allocates a pseudo-terminal).
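A small sketch of the difference (the container ID is a placeholder):
docker exec -i <containerID> sh     # input still works, but there is no shell prompt and the output isn't formatted
docker exec -it <containerID> sh    # input works and you get a proper, nicely formatted terminal prompt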
Getting a command Prompt in a container
We will often want to open a shell (a terminal) in our Docker container, so we can interact with it without having to type docker exec all the time.
docker exec -it <containerID> sh
The sh
is the magic. Now we're inside a Docker container and can execute any Unix command. Full terminal access, excellent for debugging.
sh
is the name of a program, a shell. There are several command processors:
- bash
- powershell
- zsh
- sh
So sh
is basically another program, one used to execute commands.
Starting with a shell
We could start a shell right after we start a docker container: docker run -it <imageName> sh
.
docker run -it busybox sh
The downside is that you won't be able to run any other process (the shell is the primary process). It is more common to start a container with a primary process and then attach a running shell with docker exec -it ...
Container isolation
It has to be clear that two containers do NOT share the same filesystem.
Creating Docker Images
To do so, we will use:
- A Dockerfile: a plain text file of configs that defines how our container behaves, what programs it contains, and what it does when it starts.
- The Docker client: the Docker CLI, which takes the file and passes it to the Docker server.
- The Docker server: it does the heavy lifting and builds a usable image that can be used to start a new container.
To create a dockerfile:
- Specify a base image
- Run some commands to add dependencies or other programs
- Specify a startup command, to boot up a container.
Building a dockerfile
The goal now is to create an image that runs redis-server.
So let’s write the code and explain after:
- Create a directory like “redis-image”
- Create a Dockerfile file, with no extension and the first letter uppercase
- Inside the file:
# Use an existing docker image as base
FROM alpine
# Download and install a dependency
RUN apk add --update redis
# Tell the image what to do when starts as a container
CMD ["redis-server"]
- Go to terminal, inside the directory and:
docker build .
// "docker build ." will generate an ID
docker run <generatedID>
That’s it, we are running a redis-server with our own custom image
Dockerfile teardown
Every line in the Dockerfile starts with an instruction, telling the Docker server a specific preparation step for the image.
FROM
specifies the Docker image we want to use as a base.
RUN
executes a command while we are building our custom image.
CMD
specifies what should be executed when a container starts from our image.
These instructions are the most used and important.
What is a Base Image
Let's start with an analogy: imagine you have an empty computer without an OS and you have to install Chrome. You would first install an OS, then open a browser, go to the Chrome page and download it, etc.
The FROM alpine
is just like installing an OS onto that empty computer: our Dockerfile starts from nothing, so we need a starting point.
We are using alpine
because it has a set of programs we need to start building our image, just like Windows or macOS have their own sets of programs.
In this case, alpine helps us with the second command we ran: RUN apk add --update redis.
apk is the Alpine package manager that comes with alpine, and we used it to install Redis. So apk add --update redis was only possible because we started from alpine.
The Build Process in Detail
When we execute docker build ., we take our Dockerfile and generate an image out of it. We hand the Dockerfile over to the Docker CLI.
The . is the build context: the set of files and folders we want to make available for building our image. We will see more on this later.
When we do docker build .
:
- With FROM alpine, Docker downloads the alpine base image.
- For RUN apk add --update redis, Docker creates a temporary container out of the alpine image and runs the command inside it.
- This command installs Redis inside that temporary container.
- Docker takes a snapshot of the container's filesystem, with Redis installed.
- The temporary container is removed.
- With CMD ["redis-server"], Docker looks at the image snapshot built in the previous step (the one with Redis) and builds a container out of it.
- CMD ["redis-server"] sets redis-server as the primary startup command for that container; a snapshot is taken and this container is also shut down.
- The end result is a final image with a filesystem snapshot containing Redis and the startup command (redis-server).
Rebuilds with Cache
If we run docker build .
again, it will use the cache, because Docker sees that the Dockerfile commands didn't change and retrieves the last results from the cache on my local machine.
But the cache follows the order of the instructions, so if you change the order it will not be used.
Try to always put new commands near the end of your Dockerfile, because Docker will only re-run the changed parts and will take everything above from the cache.
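For example, if we later add another dependency (gcc here is just an illustrative package) below the Redis line, the Redis step is still taken from the cache; putting the new line above it would bust the cache for everything that follows:
# Use an existing docker image as base
FROM alpine
# Download and install a dependency (cached from the previous build)
RUN apk add --update redis
# New instruction added last, so only this step runs
RUN apk add --update gcc
# Tell the image what to do when starts as a container
CMD ["redis-server"]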
Tagging an Image
When we do docker build . it will generate an image ID, so we can run the container by doing docker run <imageID>.
We can give the image a name, so we don't need to reference its ID every time we want to run the container.
We do so by docker build -t <tag the image> .
By convention, <tag the image> is:
- your docker ID/project name:version
So it would be docker build -t stephengrider/redis:latest .
Now we can start the container by doing
docker run stephengrider/redis
We don't need to specify the version; latest is assumed by default.
Manual Image Generation with Docker Commit
As we saw before, we can create an image from a container by taking snapshots of the container, just like the temporary containers in the build process do.
So we can both create a container from an image and vice versa.
Let's see how to manually create an image from a container, just for the sake of curiosity.
docker run -it alpine sh
The command above starts a container from alpine, with sh starting a shell. Now, inside the container shell:
apk add --update redis
Now we installed redis in that running container.
In another terminal, we will create a snapshot of that container and assign a startup command to it:
docker ps
docker commit -c 'CMD ["redis-server"]' <containerID>
The output is the ID of our new image we just created.
We can do docker run <imageID>
and it will start redis server.
As you can see, there is a nice relationship between images and containers!
04 - Making real projects with Docker
Project Outline
- We will create a simple NodeJS web app
- Create a Dockerfile, to create an image
- Build image from the dockerfile
- Run the image as a container on our local machine (to run our web app inside a container)
- Connect from the browser to the web app
DISCLAIMER
The following lessons will contain some intentional mistakes, for teaching purposes.
NodeJS Server setup
A very simple NodeJS app
const express = require("express");
const app = express();
app.get("/", (req, res) => {
res.send("Hi There");
});
app.listen(8080, () => {
console.log("listening on PORT 8080");
});
A Few planned errors
The steps to start a NodeJS app are npm install and npm start, assuming we have npm installed.
To build our Dockerfile, the steps are similar to the previous time:
- Specify a base image
- Run commands to install additional programs
- Specify a command to run on a container start up
So let’s create a Dockerfile
and:
# Specify a base image
FROM alpine
# Install dependencies
RUN npm install
# Default commands
CMD ["npm", "start"]
Let's build that with docker build .
It will give us an error: npm: not found
Base image issues
We saw the previous error because we used the Alpine base image, which is very light and doesn't include much. npm is not available in our Alpine base image.
We could either search for another base image or install npm in Alpine.
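We'll go with the first option. Just for reference, the second would look roughly like this (assuming the nodejs and npm packages available in the Alpine package repositories):
FROM alpine
# install Node and npm into the bare alpine image
RUN apk add --update nodejs npm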
Let's use another base image. Go to Docker Hub (hub.docker.com) and search for the official node repository.
This repository is a docker image with NodeJS pre-installed on it. We could build our image based on this Node repository.
On Docker Hub you can see the versions of Node.js (in this case) we could use to build our image. In our Dockerfile, we would specify the version with FROM node:{{version}}. Let's use the alpine variant of this node image, which is the most compact and lightweight.
FROM node:alpine
Now if we run docker build .
again, it will pass the first step but crash at the second step, when running npm install: ENOENT: no such file or directory, open '/package.json'
A few missing files
We have the package.json file in our directory, but the error says we don't.
That's because when RUN npm install runs, we take the node:alpine image's filesystem snapshot and put it in a container. There is no package.json inside this container, only the files from the node:alpine snapshot.
When you are building an image, the files in your local directory are not available inside the container; you have to explicitly put them in the container.
Copying build files
We will use the COPY instruction, which moves local files from our directory into the Docker container during the build process. The syntax is:
COPY {{ path to the folder/file to copy on your local machine, relative to the build context }} {{ place to copy it to inside the container }}
The build context is the dot in docker build . , for example.
# Specify a base image
FROM node:alpine
# Install dependencies
COPY ./ ./
RUN npm install
# Default commands
CMD ["npm", "start"]
Now if you build the image it will give some warnings about package-lock.json, but it will generate your image ID successfully.
Let's build the image with a tag, instead of just using a generated ID:
docker build -t ricardohan/first-nodejs-app .
(ricardohan is my Docker ID from Docker Hub)
The last thing to do is start up the image and get the server running.
docker run ricardohan/first-nodejs-app
Container PORT Mapping
We can see that our Node server is running on port 8080, but if we try to access it in the browser, we don't see it. That's because we are trying to access port 8080 on our local machine, but the container running the Node server has its own set of ports. So we have to map a local port to the container port.
We don't do this in the Dockerfile; we do it with the run command, at runtime, because it's a runtime constraint: we only set it when we start the container.
We will use docker run -p <port for incoming requests on the local machine>:<port inside the container> <image id / name>.
So: docker run -p 8080:8080 ricardohan/first-nodejs-app
Now it works!!
You can change the PORT mapping:
docker run -p 5000:8080 ricardohan/first-nodejs-app
Now the container still runs the Node server on 8080, but on your local machine it is exposed on port 5000.
Specifying a working directory
Let’s start a shell inside the container for debugging:
docker run -it ricardohan/first-nodejs-app sh
We will get a command prompt in the root directory of the container.
Inside the prompt, if you type ls
we see all of our files and folders (package.json, index.js, node_modules, etc.) mixed in with the container's base folders such as var, lib, etc.
This is not good practice, so we will create a specific directory to hold all of our project's folders and files.
Let’s change the Dockerfile for that.
WORKDIR <folder path>
All instructions after the WORKDIR
will be relative to the folder path we specified above.
Let's put WORKDIR /usr/app above the COPY instruction, so the working directory will be /usr/app.
If the directory doesn't exist, Docker will create it.
/usr is a good place to put our files; /home or /var are also okay.
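The Dockerfile at this point looks roughly like this:
# Specify a base image
FROM node:alpine
# All following instructions are relative to /usr/app
WORKDIR /usr/app
# Install dependencies
COPY ./ ./
RUN npm install
# Default command
CMD ["npm", "start"]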
Now let's build our image again:
docker build -t ricardohan/first-nodejs-app .
And run it:
docker run -p 8080:8080 ricardohan/first-nodejs-app
In another terminal, let’s do a docker exec
to start the shell in the running container:
docker ps
docker exec -it <container id> sh
We entered directly inside /usr/app
directory.
Unnecessary Rebuilds
If we change a file in our local project, we need to rebuild the image for the change to appear. We will see more on this later, but for now let's try to optimize the build.
If we make a change to any file, Docker will see it and COPY all the files again. So everything after the COPY instruction will run again, including npm install. This is bad: we shouldn't have to install all the dependencies again, right?
So let’s fix this.
Minimizing Cache Busting and Rebuilds
We will divide the COPY instruction into two steps:
# Install dependencies
COPY ./package.json ./
RUN npm install
COPY ./ ./
So first we copy only the package.json file into the working directory. npm install will then only re-run if something changes in the package.json file.
After that, we can COPY all the other files to the working directory.
Now, if we change a project file, the only instruction invalidated is the COPY ./ ./ in our Dockerfile. That's why we put it after the RUN npm install command.
The order of the instructions in your Dockerfile does make a difference.
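Putting the WORKDIR and the split COPY together, the optimized Dockerfile looks roughly like this:
# Specify a base image
FROM node:alpine
WORKDIR /usr/app
# Install dependencies (this layer is only rebuilt when package.json changes)
COPY ./package.json ./
RUN npm install
# Copy the rest of the project (changes here don't trigger npm install again)
COPY ./ ./
# Default command
CMD ["npm", "start"]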
5 - Docker Compose with Multiple Local Containers
App Overview
Make a container that has a web app which displays the number of visits the web app has received.
We need a web server and Redis to store the number of visits. We could store it inside the server, but let's make it more complex.
Later we will scale the Node servers, while still keeping only one Redis server storing the number of visits.
App server code
Code is inside /visits/index.js
Assembling a Dockerfile
This Dockerfile will handle the server side of the application, it will have nothing to do with Redis.
FROM node:alpine
WORKDIR '/app'
COPY package.json .
RUN npm install
COPY . .
CMD ["npm", "start"]
Basically same code from previous app.
Build this image with docker build .
Actually, let's tag it: docker build -t ricardohan/visits:latest . Now, instead of the ID, we can reference this image by its tag.
Introducing docker-compose
Now that we've built the image, we can run the container with docker run ricardohan/visits.
We will get an ERROR message because there is no Redis server for our server to connect to.
To run a Redis server we can simply do docker run redis. It will get the image from Docker Hub and start a copy of Redis on our local machine.
In another terminal, if we try to run our web server, we get the same error saying we cannot connect to the Redis server.
This happens because there are two separate containers running on our machine: the Node server and Redis. They do not communicate with each other automatically, so we have to set up networking between them.
There are two options: use the Docker CLI's network features in the terminal, or use Docker Compose.
The Docker CLI approach is a pain: we would need to re-run the same commands every time.
So let's use docker-compose, a separate CLI tool. It exists so we don't have to type lots of repetitive Docker CLI commands and options.
It makes it very easy to start multiple Docker containers and connect them.
Docker compose files
We will essentially take the same commands we ran before (docker build…, docker run…) and write them in a docker-compose.yml file. We feed this file to the docker-compose CLI and it handles the rest.
In docker-compose.yml we specify the containers we want to build, the images used to create them, and the port mappings from the containers to our local machine.
version: "3"
services:
  redis-server:
    image: "redis"
  node-app:
    build: .
    ports:
      - "4001:8081"
We specified the version of the docker-compose file format to use.
Then, inside services, we specify all the containers to run and the images to build them from.
Since node-app is built locally using the Dockerfile, instead of specifying an image we specify how to build it. build: . means: look for a Dockerfile in the current directory and build it.
Then we mapped port 8081 of the container to port 4001 on our local machine.
Networking with docker-compose
By defining the two containers in docker-compose, they are automatically connected on the same network, so we don't have to connect them manually.
But how to access the redis server from our nodejs app?
const client = redis.createClient({
host: 'redis-server',
port: 6379
});
Normally, if we were not using Docker, we would put a URL in host, something like https://myredisurl.com.
But since we are using Docker, we can use the name of the Redis service from docker-compose.
When redis.createClient tries to connect to 'redis-server', Docker sees it and redirects the connection to the Redis container. The default Redis port is 6379.
Docker-compose commands
Before, we typed docker run <image name>, but now it's docker-compose up. We don't have to specify any image name because it's already in the docker-compose file.
Before, we also had to build our image and then run it. Now we can do both in a single command:
docker-compose up --build
Stopping docker-compose containers
Before, with the Docker CLI, we could start a container in the background with docker run -d redis.
Then we could do docker ps to see all running containers and stop them with docker stop <containerID>.
In docker-compose we can do docker-compose up -d
to launch all containers in the background and docker-compose down
to stop all of them.
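A typical session, run from the directory containing docker-compose.yml:
docker-compose up -d      # build (if needed) and start all services in the background
docker-compose ps         # check the status of the running services
docker-compose down       # stop and remove all the containers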
Containers Maintenance with Docker-compose
How to deal with services that crash or have an error.
If our Node.js app calls process.exit(0) and exits, its container will stop.
Automatic container restart
process.exit(0)
, where 0 is a status code meaning we exited the app but everything is OK. This status code affects how Docker decides whether to restart our containers.
We will specify restart policies in our docker-compose file.
We have 4 restart policies:
- "no": the default for all containers; if a container crashes, never attempt to restart it.
- always: if a container stops for any reason, always try to restart it. Good for web servers, which always have to stay up and running.
- on-failure: only restart the container if it stops with an error code (other than 0). Good for worker processes.
- unless-stopped: always restart unless we, the developers, forcibly stop it.
version: "3"
services:
  redis-server:
    image: "redis"
  node-app:
    restart: always
    build: .
    ports:
      - "4001:8081"
Container status with docker-compose
docker-compose ps
Creating a Production-Grade Workflow
We will learn how to develop, test, deploy, and then do it all over again.
Flow specifics
We will work in a feature branch on GitHub, then open a PR to master and run Travis CI; if all tests pass, it will automatically deploy to AWS.
Dockers purpose
Docker makes some of those tasks really easy, but it's not necessary for the flow.
Project generation
We will make a React project and run in a container
Create a react project with create-react-app
Creating the Dev Dockerfile
We will wrap our create-react-app inside a Docker container for development purposes.
We will have 2 Dockerfiles: one for development and one for production
Let’s create the development Dockerfile. It’s a Dockerfile.dev
file.
FROM node:alpine
WORKDIR '/app'
COPY package.json .
RUN npm install
COPY . .
CMD ["npm", "run", "start"]
Go to the terminal and let’s build our Dockerfile.
Since we are using a Dockerfile.dev, we can't just run docker build . because it will look for a file named Dockerfile. To build from a custom file, we do docker build -f Dockerfile.dev .
The -f flag specifies the file we want to build from.
Duplicating dependencies
You will notice a message the first time we build it: Sending build context to Docker daemon 155.2MB. That's because create-react-app already installed the dependencies (node_modules) in our working directory, so Docker is copying all of that into the container again. We don't need the dependencies locally, since we are installing everything inside Docker, so we can delete the local node_modules folder.
If we build again, it will go much faster.
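An alternative to deleting the local node_modules folder is a .dockerignore file next to the Dockerfile (a sketch; this is not what these notes use):
# .dockerignore — paths listed here are left out of the build context
node_modules
.git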
Starting the container
Remember we have to expose the docker container PORT to our localhost.
docker run -p 3000:3000 <containerID>
Docker Volumes
Now that our React app is running in a container, let's get hot reload working: every time we change something in the code, we want to see the change reflected in the running container.
Remember that in our Dockerfile.dev we use COPY . ., copying everything in our directory into the container.
We have to change this approach and change the way we run our container, using a feature called Docker Volumes.
With Docker volumes we don't copy our files (/src, /public, and whatever else) into the container; instead, the container holds a reference to those local files. It's like mapping the container's ports to localhost, but this time we are mapping local files into the container.
The syntax to run the container with volumes is:
docker run -p 3000:3000 -v /app/node_modules -v $(pwd):/app <containerID>
Starting with the -v $(pwd):/app part:
- pwd is the path of the present working directory. It's a shortcut for (in my case) /home/ricardo/Documentos/docker/frontend.
- It basically says: take everything in my present working directory and map it to the /app folder in the container.
Bookmarking Volumes
Since we are mapping everything in our pwd (present working directory) to the /app folder in the container, we would be replacing the container's node_modules too. But we don't have a node_modules folder in our local working directory (we deleted it). So:
The -v /app/node_modules argument says:
- Don't map anything over /app/node_modules, just use what's already there in the container.
- We are basically putting a bookmark on the container's node_modules (that's what /app/node_modules does).
In these -v arguments, the : means: map the path on your local machine (before the colon) to the path inside the container (after the colon). With no colon, as in /app/node_modules, the path is simply bookmarked and left alone.
Now, if we run the container with all of these arguments, we can access localhost:3000, make changes to the files, and see the changes in the browser.
Note: hot reload is not a feature of Docker volumes, it's a feature of create-react-app. But it works because of the Docker volume, since we are running the app in a Docker container.
Shorthand with Docker Compose
The Docker volume is cool, but it's a pain to type the whole docker run -p 3000:3000 -v /app/node_modules -v $(pwd):/app <containerID> command every time.
So let’s use docker-compose.yml
for that:
version: "3"
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - "3000:3000"
    volumes:
      - /app/node_modules
      - .:/app
About the build section: in the previous docker-compose file we just did build: . . But since that looks for a file named Dockerfile and we are working with a Dockerfile.dev, we had to change things here.
We specified context: ., which is the directory we want to use as the build context, and dockerfile: Dockerfile.dev, which is the file to use for building the image.
The - .:/app line says: map my current working directory (.) to /app inside the container.
Now just by doing docker-compose up
we can run our container with the volumes working.
Do We Need Copy
We could delete the COPY . . from our Dockerfile.dev, because with Docker volumes the container just references all the files in our local working directory. But it's good to leave it there: we may stop using docker-compose in the future, or we might use this Dockerfile.dev as the starting point for the production Dockerfile, so it's better to keep it.
Executing tests
First we are going to run the tests locally, then run them in Travis CI as part of the deployment flow.
First build your image: docker build -f Dockerfile.dev .
Then we append a specific command to start the container with, at the end of docker run:
docker run -it <containerID> npm run test
The npm run test command will override the default startup command.
The -it flag makes the logs display nicely and also lets you type input into the test runner.
Live updating tests
We have the same problem as before: when we update our test files, the tests don't update inside the Docker container.
We have two approaches for that:
- create a new service with volumes in the docker-compose.yml file
- use the same service we already have in docker-compose: start the container, attach to it, and run our tests from a separate terminal.
Let’s do the second approach now:
Start your docker-compose containers with docker-compose up.
In another terminal:
- docker ps to get the container ID
- docker exec -it <containerID> npm run test
Now you are attached to the running container; when you modify your local tests, the running tests update.
Docker Compose for running tests
Now let’s do the first approach to run our tests in the container. Let’s add the test service in docker-compose.
version: "3"
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - "3000:3000"
    volumes:
      - /app/node_modules
      - .:/app
  tests:
    build:
      context: .
      dockerfile: Dockerfile.dev
    volumes:
      - /app/node_modules
      - .:/app
    command: ["npm", "run", "test"]
We added a second service to the docker-compose, so when we start it, it will build two services.
The command
line indicates which startup command we want to use; it overrides the default startup command.
Let's start docker-compose with docker-compose up --build so the images are rebuilt, since we added a new service.
The downside now is that we can't interact with the test service: it logs to our terminal but we are not attached to it.
Shortcomings on testing
We cannot interact with the test service because there is no stdin (standard input) connection from our local terminal to the container. Let's open a second terminal and attach to the test container. Remember, attaching connects stdin, but we still won't be able to type the test inputs.
If we inspect the running processes of the container with docker exec -it <containerID> sh
, and run the ps command inside it, we will see:
PID USER TIME COMMAND
1 root 0:00 npm
24 root 0:00 node /app/node_modules/.bin/react-scripts start
31 root 0:12 node /app/node_modules/react-scripts/scripts/start.js
71 root 0:00 sh
76 root 0:00 ps
When we attach, we always attach to the main process, PID 1.
When we run npm run test, what actually runs the tests is the npm process above: it sees the extra args we passed (run test), decides what to do, and starts another process to run our test suite.
The problem is that we always attach to the main process, while the process that understands test-suite commands such as p, q and other test inputs is a different process. Unfortunately, we cannot attach to it.
Need for Nginx
Let's start thinking about our production environment, where we run npm run build.
In development mode, we have a development server inside our container, serving our HTML and JavaScript files to the frontend when we access localhost:3000. This development server watches for changes and does other nice things for development purposes.
In production this dev server disappears, because it's not needed. When we run npm run build, we generate static HTML and JavaScript files, and we have to serve them to the client through a web server.
So we will use Nginx, a very popular web server whose sole purpose is routing and serving files to clients.
We will create a separate Dockerfile
for our production environment.
Multi Step Docker Builds
Our flow will be basically:
- Use node:alpine
- Copy package.json file (in order to do npm run build)
- Install dependencies
- Run npm run build
- Start Nginx
The step of installing dependencies is only necessary to run npm run build. Once we generate the build directory, we won't need the dependencies anymore.
The last step of starting Nginx, we want to use another base image besides node:alpine, which is nginx.
We will use two base images then, with something called Multi step build process
In the Dockerfile, we will have two blocks of configuration:
- One is the build phase, using node:alpine as base image. We will install dependencies and generate the build files.
- Another is the run phase, using nginx as base image. We will copy only the result of build phase (build files) and then start nginx, serving the build files.
Implementing Multi-Step Builds
Create a Dockerfile
:
FROM node:alpine as builder
WORKDIR '/app'
COPY package.json .
RUN npm install
COPY . .
RUN npm run build
FROM nginx
COPY --from=builder /app/build /usr/share/nginx/html
We have two blocks of configs, and each block has a FROM statement. A new FROM automatically ends one phase and starts another.
We tagged the node:alpine phase to be known as the builder phase.
So in the nginx phase we reference the builder with --from=builder, because we are essentially saying:
take /app/build from the builder phase and put it in the /usr/share/nginx/html directory of the nginx container.
This /usr/share/nginx/html path comes from the nginx documentation.
Running Nginx
In the terminal, build the image with docker build .
Nice!
We built the production-ready image; now let's run it, mapping the correct ports:
docker run -p 8080:80 <imageID>
- 80 is the default port of Nginx
Nice, we see the content on port 8080. Now let's think about how to deploy this to the outside world.
Continuous Integration and Deployment with AWS
Services Overview
We have containers for running npm run start and npm run test in development, and npm run build for production.
Now we have to implement the continuous integration flow:
- A feature branch where we push code to GitHub
- A master branch that we open a PR against
- Integration with Travis CI to run tests and deploy to AWS hosting
Github setup
Just create a public repository and push your local code there.
Travis CI Setup
Travis CI watches the new code we push to GitHub and does some work. We can do anything with the code: test it, deploy it, anything.
We are going to use Travis to test the code and automatically deploy it to AWS if it passes the tests.
Log in to Travis with GitHub and add your docker-react repository to Travis.
Travis YML File Config
We have to be explicit about what Travis should do.
Let’s create a .travis.yml
file in the root of project directory, to say:
- tell Travis to run a copy of docker, build an image and create a container
- use Dockerfile.dev to build an image
- run a test suite
- how to deploy our code to AWS
In .travis.yml
:
sudo: required
services:
  - docker
before_install:
  - docker build -t ricardohan/docker-react -f Dockerfile.dev .
First, we need sudo permissions to manipulate Docker.
Then we make it clear that we need a copy of Docker to run this build.
Then we define a before_install section, with steps that need to run before our tests.
There we build and tag the image.
A Touch More Travis Setup
In the .travis.yml
, let’s tell travis how run our tests
sudo: required
services:
  - docker
before_install:
  - docker build -t ricardohan/docker-react -f Dockerfile.dev .
script:
  - docker run -e CI=true ricardohan/docker-react npm run test
script contains all the commands needed to run our test suite. We are running our container with the npm run test command. The -e CI=true environment variable is there because npm run test's default behavior is to run and then wait for our input; with CI=true the test runner runs once, prints the results and exits. Travis CI doesn't actually care about the test output itself, it just cares that the return code is 0.
Automatic Build Creation
Let's commit these changes to GitHub and let Travis work.
In the Travis dashboard we should see the logs and a successful build.
AWS Elastic Beanstalk
Let’s setup an automatic deploy to AWS, using ElasticBeanstalk.
Go to AWS dashboard, ElasticBeanstalk, create a new application and a new environment, select docker as environment and create it.
Important: after we finish the course, we have to shut down the application in AWS, otherwise we will be billed.
More on Elastic Beanstalk
How EBS works:
- It has a load balancer in front of a VM running our Docker container
- The VMs are scaled up if needed
- So every time a user enters our application, the load balancer chooses the best virtual machine to send the user to.
That’s the benefit of EBS: it scales automatically.
Travis Config for deployment
In .travis.yml:
sudo: required
services:
  - docker
before_install:
  - docker build -t ricardohan/docker-react -f Dockerfile.dev .
script:
  - docker run -e CI=true ricardohan/docker-react npm run test
deploy:
  provider: elasticbeanstalk
  region: "us-east-2"
  app: "docker-react"
  env: "DockerReact-env"
  bucket_name: "elasticbeanstalk-us-east-2-709603168636"
  bucket_path: "docker-react"
  on:
    branch: master
We specified a new block called deploy.
- provider tells Travis which service we are using to deploy.
- region depends on where we created our EBS instance; you can see it in the URL EBS generated.
- app is the name of your EBS application.
- env is the environment name of the app, shown in your EBS dashboard.
- bucket_name works like this: when the EBS app is created, AWS automatically creates an S3 bucket in our account, which is basically a hard drive to store things. When Travis wants to deploy our app, it zips all of our files into one file and puts it in that generated S3 bucket. So this is the name of that bucket. To find it, go to S3 in your AWS dashboard; the name will be something like elasticbeanstalk.<aws-region>.27e7r2
- bucket_path: this automatically created S3 bucket is reused for every EBS app we have. Inside the bucket there are folders named after each EBS app, so bucket_path is the name of your EBS app. The folder inside the S3 bucket is only created the first time you deploy.
- Finally, the on: branch: master part says to only deploy the app to EBS when code changes on the master branch.
Automated Deployments
We have to configure API keys so that Travis CI has access to our AWS account and can deploy things.
Let's search for IAM in the AWS dashboard. IAM is the service for managing API keys used by outside services.
Go to Users (to generate a new user for Travis CI) and add a new user.
Let's give it a descriptive name like docker-react-travis-ci and choose Programmatic access as the access type.
Then we have to give permissions to this user; permissions are granted as policies. We want to give access to EBS, so search for elasticbeanstalk and give this user full access to EBS.
Then you create the user. This generates API keys to be used in Travis CI. The secret access key is shown only once, so write it down somewhere.
This example keys are:
ID key: AKIA2KN5FCF6PR5R7GFX Secret access key: LtxR2OcKGZNPNgfG4QQSn66D4AdenpkGD799EYEW
Grab your access key ID and the secret access key. Let's not put them directly in the yml file; instead, we'll use the environment variables feature Travis CI has. Go to the Travis dashboard, and in your project go to More options -> Settings -> Environment variables.
Choose a name and paste your key.
Then in your yml file:
deploy:
  provider: elasticbeanstalk
  region: "us-east-2"
  app: "docker-react"
  env: "DockerReact-env"
  bucket_name: "elasticbeanstalk-us-east-2-709603168636"
  bucket_path: "docker-react"
  on:
    branch: master
  access_key_id: $AWS_ACCESS_KEY
  secret_access_key:
    secure: "$AWS_SECRET_KEY"
The variable names $AWS_ACCESS_KEY and $AWS_SECRET_KEY are the names we gave to the API keys in the Travis CI dashboard.
Let's commit and push to the master branch, and hope it gets deployed.
Exposing Ports Through the Dockerfile
After pushing our code, Travis CI runs and deploys our app, but in the EBS dashboard, if we try to access the app, it won't load. That's because we didn't expose the port of our app. To do so with EBS:
In our Dockerfile:
FROM nginx
EXPOSE 80
COPY --from=builder /app/build /usr/share/nginx/html
We added EXPOSE 80 so Elastic Beanstalk understands which port to expose for us.
For us developers, this EXPOSE makes no difference locally; it's only for EBS.
Workflow with Github
Create a feature branch, make changes, push the feature branch, and then merge it into master.
Redeploying on pull request
When we merge that PR into master, Travis CI will redeploy the app to EBS.
Deployment Wrapup
Docker is not needed for this flow, but it makes it much easier! We set this pipeline up once and we are done!
Building a Multi-Container Application (IMPORTANT STUFF)
Single Container Deployments Issues
The previous app was a single React app, with no outside services like databases. Also, we built the image multiple times: once in Travis CI and again on our server at AWS ElasticBeanstalk.
So now we are going to tackle these issues.
Application Overview
It's a Fibonacci calculator, with a server calculating the value for a given index.
Application Architecture
The user will access a URL route, which triggers a request to our Nginx server, whose sole purpose is to do some routing. If the request is trying to access a file like HTML, Nginx routes it to our React server. If the user is trying to access our backend API to submit numbers, it routes to our Express server.
The Express server communicates with Redis (to display the previously calculated values) and Postgres (to save the values the user has submitted before).
We are only doing this to use both Redis and Postgres, just for learning purposes.
Postgres stores the indexes (numbers) submitted before, and Redis stores the numbers and the calculated Fibonacci values, using a worker. The worker is a separate Node.js process that watches Redis, gets the numbers stored, does the Fibonacci calculation and puts the result back into Redis.
Worker Process setup
The setup is in the /complex/worker folder: we create a Redis client and subscribe to insert events. Every time there is an insert, the sub.on callback runs the fib function and inserts the value into the values hash with hset.
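A minimal sketch of that worker, assuming the callback-style redis npm client and a hypothetical keys.js module holding the connection values:
const keys = require("./keys"); // hypothetical module exporting redisHost and redisPort
const redis = require("redis");

const redisClient = redis.createClient({
  host: keys.redisHost,
  port: keys.redisPort,
  retry_strategy: () => 1000, // if the connection drops, try again every second
});
const sub = redisClient.duplicate(); // a separate connection used only for subscribing

// deliberately slow recursive implementation, so the worker has real work to do
function fib(index) {
  if (index < 2) return 1;
  return fib(index - 1) + fib(index - 2);
}

// every time a new index is inserted, calculate its Fibonacci value and store it in the "values" hash
sub.on("message", (channel, message) => {
  redisClient.hset("values", message, fib(parseInt(message)));
});
sub.subscribe("insert");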
Express API Setup
Setup is in /complex/server
Connecting to Postgres
Setup is in /complex/server/index.js
More Express API Setup
Make sure Express has a connection to the Redis instance we created before.
Fetching Data in the react app
Calling the routes we defined, to get all indexes and the fibonacci values
Everything is in /complex/client/Fib.js
Rendering Logic in App
Everything is in /complex/client/Fib.js
Routing the react app
Linking the other page to the Fibonacci page.
We have to make sure Nginx handles the HTML5 routing correctly. We'll talk about it later.
Dockerizing Multiple Services
Dockerizing a React App - Again
Let's start adding Docker containers to each service (server, worker and client) so we can run them in a development environment (creating a dev Dockerfile for each one).
We are focusing on the development versions first, to make sure our workflow is smooth. This means that every time a piece of code changes in client, server or worker, we don't want to rebuild the image to see the changes.
We will do, in each project:
- Copy over package.json
- Run npm install
- Copy over everything else
- Create a docker-compose file to set up volumes for each of these projects, so we share the source code into each container. This makes sure we don't rebuild the image entirely on every little change.
Let’s start with the React project, inside the /client directory.
- Create a Dockerfile.dev
- Fill the content:
FROM node:alpine
WORKDIR '/app'
COPY ./package.json ./
RUN npm install
COPY . .
CMD ["npm", "run", "start"]
We are creating an image, just the way we created the frontend project before.
- In the terminal, build the image created:
docker build -f Dockerfile.dev .
- After the successful build, grab the generated image ID and:
docker run {imageID}
See the project running =)
Dockerizing a Generic Node App
Now let’s dockerize the server directory.
Define a Dockerfile.dev into /server directory and:
FROM node:alpine
WORKDIR "/app"
COPY ./package.json ./
RUN npm install
COPY . .
CMD ["npm", "run", "dev"]
Define the same Dockerfile.dev in the /worker directory as well, with the same content!
Then build the image (in the /server or /worker dir) with docker build -f Dockerfile.dev . to see if everything is working. It will give a connection error because we don't have Postgres yet.
Adding Postgres as a Service
Now that we have a Dockerfile.dev for each project, let's put together a docker-compose file. The purpose of docker-compose is to make it easier to start the containers with the right arguments and configuration (exposing ports, connecting the server and worker to services like Redis and Postgres, defining environment variables for the projects, etc…)
First, let’s put together the Express server, the Postgres and the Redis.
What we need to think about before adding things to docker-compose:
- What images to use?
- What Dockerfile to use (depending on environment)
- Specify the volumes so the container picks up code changes
- Specify env variables
Inside the root project dir (/complex), create a docker-compose.yml file:
version: "3"
services:
  postgres:
    image: "postgres:latest"
    environment:
      - POSTGRES_PASSWORD=postgres_password
We specified the first service, called postgres (a name we chose). Inside that service we specified the image to use from Docker Hub, which is postgres:latest. Always go to Docker Hub and check the service you want to use; there is nice documentation there as well.
We explain environment later.
If we run docker-compose up
, we would have a copy of postgres available for us.
Docker-compose Config
Now let’s add the Redis service and then move to the server and specify config for that in the compose file:
version: "3"
services:
  postgres:
    image: "postgres:latest"
    environment:
      - POSTGRES_PASSWORD=postgres_password
  redis:
    image: "redis:latest"
  api:
    build:
      dockerfile: Dockerfile.dev
      context: ./server
    volumes:
      - /app/node_modules
      - ./server:/app
We specified the api service with some config we already know. build uses a Dockerfile.dev to build the service, but which one? That's why we specified context, so docker-compose knows which Dockerfile.dev to use (in this case, the one inside ./server).
We also specified the volumes, which map everything in the ./server directory to the /app directory inside the container (we defined /app in the Dockerfile.dev as the place that receives all the project files). The only exception is /app/node_modules, which stays the way it is inside the container.
That's why every time we change some code in ./server, it maps to /app inside the container.
Environment variables with docker compose
The server uses a lot of environment variables to connect to postgres and redis. There are two ways of defining env variables in docker-compose.
Both of them set the variable in the container at run time. This means the variables are not builded with the container image, but only defined when we run the container.
variableName=value: defined like this, the container uses the value written here.
variableName (no value): defined like this, the container takes the value from the machine running docker-compose.
We will use variableName=value most of the time.
api:
  build:
    dockerfile: Dockerfile.dev
    context: ./server
  volumes:
    - /app/node_modules
    - ./server:/app
  environment:
    - REDIS_HOST=redis
    - REDIS_PORT=6379
    - PGUSER=postgres
    - PGHOST=postgres
    - PGDATABASE=postgres
    - PGPASSWORD=postgres_password
    - PGPORT=5432
We define environment and a key=value pair for each variable. The value for REDIS_HOST is just redis, because that is the name of the service we used to start Redis. The port is 6379 because it's the default Redis port (it's in the docs). All the PG variables are also in the Docker Hub documentation; these are the default values.
If we do docker-compose up
we should see everything running nicely.
The Worker and Client Services
Now we need to add services for both the worker and client projects. In docker-compose:
client:
build:
dockerfile: Dockerfile.dev
context: ./client
volumes:
- /app/node_modules
- ./client:/app
worker:
build:
dockerfile: Dockerfile.dev
context: ./worker
volumes:
- /app/node_modules
- ./worker:/app
Now we are almost done, but note that we didn't define any port mapping in these services. Let's talk about Nginx first.
Nginx Path Routing
In the previous project, we used Nginx to serve our React project in production mode. Now we will also use Nginx in development mode, to route requests.
The browser makes requests (to get HTML and JavaScript, or API requests), and Nginx routes them either to the React dev server or to the Express server.
We will add a new container for Nginx by adding a new service to the docker-compose file. It will intercept requests from the browser and, if the path starts with /api, send them to the Express server. If it only begins with '/', send them to the React server.
Why not use ports to distinguish the React server and the Express server? Because ports are not reliable, and thinking about the production environment, it's much better to rely on the request path.
Note that our route handlers in Express don't have the /api prefix. That's because when the request goes through Nginx, Nginx cuts off the /api part and forwards the request to Express. That's optional; we could easily modify all routes in Express to begin with /api, but in this case we are not doing it.
Routing with Nginx
We will define a default.conf
file, which add config rules to Nginx. It will then be added to our Nginx image.
In this default.conf file, we will:
- Tell Nginx there are upstream servers at client:3000 and api:5000.
- An upstream server is a server that sits "behind" Nginx.
- client:3000 is the create-react-app dev server; client is the service name we defined in docker-compose.
- api:5000 is the Express server; api is the service name in the compose file.
- 3000 is the default port for create-react-app.
- 5000 is the port we defined in Express.
- Tell Nginx to listen on port 80 (inside the container).
- Send the requests to the right place.
Now let’s create a folder nginx in the root folder, inside the folder create a default.conf
file:
upstream client {
  server client:3000;
}

upstream api {
  server api:5000;
}

server {
  listen 80;

  location / {
    proxy_pass http://client;
  }

  location /api {
    rewrite /api/(.*) /$1 break;
    proxy_pass http://api;
  }
}
Above we defined a client upstream, whose server is at client:3000 (client is the name of the service that will be running).
Then we defined the api upstream, which references the api service on port 5000.
Then we defined the server block, which is the Nginx server itself, and passed some config: first, Nginx should listen on port 80; then we intercept requests for "/" and proxy them to http://client, our client upstream. In the same way, we listen for "/api" and proxy to http://api, but first we rewrite the path.
/api/(.*) is a regex that matches everything beginning with /api. Then, with /$1, we take whatever was matched by the regex group (.*) and write it as the new path.
So basically we are erasing the /api part of the path; for example, /api/values/all becomes /values/all. The break statement guarantees that Nginx applies no further rewrite rules and goes straight on to the proxy_pass.
Building a custom Nginx Image
Now let's create a Dockerfile to build our custom Nginx image and copy our Nginx config into it.
Go to Docker Hub and look at the nginx image; there is a lot of useful info there.
All configuration of Nginx is done through config files inside its filesystem. We just have to copy our config file into the Nginx image.
Create a Dockerfile.dev inside the /nginx folder and:
FROM nginx
COPY ./default.conf /etc/nginx/conf.d/default.conf
The base image is nginx, and we are copying our default.conf over the Nginx default.conf; in other words, we overwrite the default file with our own.
Now let’s add the nginx service to docker-compose:
nginx:
  restart: always
  build:
    dockerfile: Dockerfile.dev
    context: ./nginx
  depends_on:
    - api
    - client
  ports:
    - "3050:80"
We use restart: always because we need Nginx running all the time, since it's responsible for routing our application's requests to the right servers. It's the only service with a port mapping, because all traffic to the other services goes through the nginx service.
Troubleshooting Startup bugs
If any bug happens, read the error message.
But you can always run docker-compose down
and docker-compose up
again.
And if you make any changes to docker files, run docker-compose up --build
to rebuild the images.
Opening websocket connections
We see a websocket error in the console if we open localhost:3050. This happens because the React app in the browser wants to open a websocket connection back to the React dev server, so that every time the React code changes the browser knows and updates. We didn't set this up in Nginx, so let's do it now.
In the default.conf file, add one more route proxy:
server {
  listen 80;

  location / {
    proxy_pass http://client;
  }

  location /sockjs-node {
    proxy_pass http://client;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
  }

  location /api {
    rewrite /api/(.*) /$1 break;
    proxy_pass http://api;
  }
}
Now rebuild and restart the containers with docker-compose up --build
A Continuous Integration Workflow for Multiple Images
Let's review how we deployed our single-container app previously:
- Pushed code to GitHub
- Set up Travis CI to pull our repo
- Travis built the image (docker build) and tested the code (npm run test)
- Travis pushed the code to ElasticBeanstalk (using the .travis.yml file)
- ElasticBeanstalk took the code, built the image again and deployed it to the web server.
This flow is not the best for multi-container apps: ElasticBeanstalk has to build the image, pull all the dependencies and so on, so the server has to do much more than just serve the code.
For the multi container setup, we will:
- Push the code to Github
- Travis pulls the repo
- Travis builds a test image (the react image), just to test the code
- After the tests pass, Travis builds a production image for each one of the services (with a production build script), using each service's Dockerfile
- Travis then pushes the built production images (the images are just single files) to the DockerHub service
- we can host our personal or organization projects on DockerHub as well
- AWS and Google Cloud are natively integrated with DockerHub to pull images from there, so we can easily make ElasticBeanstalk get our built images from there
- Travis sends a message to ElasticBeanstalk, telling it there is a new image to pull from DockerHub
- ElasticBeanstalk pulls the images from DockerHub (the images are already built) and uses them as the base for a new deployment
We will no longer depend on ElasticBeanstalk to build our images; everything is built by Travis and pushed to DockerHub, and then we can deploy anywhere we want. DockerHub is like the central repository of the Docker world.
Production Dockerfiles
Let’s create production versions of the Dockerfile for each one of our services. Right now we only have development versions of the Dockerfiles.
For the server, worker and nginx services, we basically just modify the `CMD`, using a more production-appropriate command, like `npm run start`.
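A minimal sketch of what the production Dockerfile for /server could look like (assuming the dev version already uses this node:alpine pattern; only the CMD really changes):
FROM node:alpine
WORKDIR '/app'
COPY ./package.json ./
RUN npm install
COPY . .
CMD ["npm", "run", "start"]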
Multiple Nginx Instances
Now let’s create a production Dockerfile for our /client service, our React project.
In the single-container Dockerfile we did previously for the React application, our Nginx server was at ElasticBeanstalk, on PORT 80. Every time a user entered the website, a request was made to PORT 80 and the React production files were served by Nginx.
Now, the architecture is different.
The browser makes a request to the Nginx server responsible for routing (on PORT 80), which sends the request to the right server (React server or Express server). The React server will be on PORT 3000; it’s an Nginx instance serving the production React files.
So we will have two copies of Nginx, one for routing on PORT 80 and one for serving the production React files on PORT 3000. We could have two different servers here, not just Nginx: it could be Nginx for routing and some other server to serve the React files.
Altering Nginx Listen Port
In the /client folder, create a /nginx folder and inside a default.conf file:
server {
listen 3000;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
try_files $uri $uri/ /index.html;
}
}
Every time a user goes to the "/" path, serve the files inside /usr/share/nginx/html (we will put the React production files there). The index will be index.html.
The last line, `try_files`, is there so Nginx works along with react-router.
Also inside the /client, create a Dockerfile and:
FROM node:alpine as builder
WORKDIR '/app'
COPY ./package.json ./
RUN npm install
COPY . .
RUN npm run build
FROM nginx
EXPOSE 3000
COPY ./nginx/default.conf /etc/nginx/conf.d/default.conf
COPY --from=builder /app/build /usr/share/nginx/html
Github and Travis CI Setup
Inside /complex, initialize a git repo, create a github repo, set the remote and push to github. Then activate the github repo on Travis CI (on Travis CI website)
Travis Configuration Setup
Let’s create our .travis.yml file, to handle our build when we push code to Github.
We will:
- Specify docker as a dependency
- Build a test version of React project (using the development dockerfile)
- Run tests
- Build production versions of all projects
- Push the images to Dockerhub
- Tell ElasticBeanstalk to update
We are only going to run test on the React project. We could very easily build test versions for other services like /server or /worker, and run tests along with the React project.
Create a .travis.yml inside the /complex:
sudo: required
services:
- docker
before_install:
- docker build -t ricardohan93/react-test -f ./client/Dockerfile.dev ./client
script:
- docker run -e CI=true ricardohan93/react-test npm test
after_success:
- docker build -t ricardohan93/multi-client ./client
- docker build -t ricardohan93/multi-nginx ./nginx
- docker build -t ricardohan93/multi-server ./server
- docker build -t ricardohan93/multi-worker ./worker
In before_install we create a test version of the React project (./client) by building Dockerfile.dev. Then we run the tests inside the React project (we tagged the image as ricardohan93/react-test). After the tests pass, we build the production versions of the images.
Now we are ready to push these images to DockerHub
Pushing Images to Docker Hub
Before pushing to Docker Hub, we need to login to Docker CLI, using our username and password.
docker login
This will associate our docker client with our dockerhub account.
So before pushing an image to DockerHub, in the Travis script, we need to log in to the Docker CLI.
We put the Docker ID and password as Travis environment variables, in the Travis dashboard for the repo. Then, in the .travis.yml:
after_success:
- docker build -t ricardohan/multi-client ./client
- docker build -t ricardohan/multi-nginx ./nginx
- docker build -t ricardohan/multi-server ./server
- docker build -t ricardohan/multi-worker ./worker
- echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_ID" --password-stdin
- docker push ricardohan/multi-client
- docker push ricardohan/multi-nginx
- docker push ricardohan/multi-server
- docker push ricardohan/multi-worker
We log in to the Docker CLI and then push all the built images to DockerHub. Explaining the login part:
- by echoing "$DOCKER_PASSWORD", we pass it to the other side of the pipe.
- so we run docker login with "$DOCKER_ID" as the user and read the password from standard input (stdin)
- "$DOCKER_ID" and "$DOCKER_PASSWORD" are set as environment variables in Travis.
That’s it. If we add, commit and push this .travis.yml file to github, Travis will wake up and start building our project with all the steps we defined.
Multi-Container Deployments to AWS
Multi-Container definition files
Now we have the pre-deploy setup ready. Every time we push code to github, Travis CI builds new images and pushes them to Docker Hub. Now we need to use these images and deploy them in production.
We will use Amazon ElasticBeanstalk. Before, with a single container, Amazon EB built the image automatically. Now we have many services and many Dockerfiles, so we have to tell Amazon EB how to treat the project.
We will create a config file, `Dockerrun.aws.json`, to tell EB where to pull the images from, what resources to allocate to each, set up PORT mappings, etc. It reminds a lot of docker-compose.yml.
Basically docker-compose is for development and Dockerrun.aws.json is for production on AWS. In docker-compose we have separate services; in Dockerrun they are called "Container Definitions".
The biggest difference is that docker-compose has directions to build the images. Dockerrun is used to pull the images from DockerHub (they are already built) so that Amazon EB can use them.
Finding Docs on Container Definitions
So, in reality ElasticBeanstalk doesn’t know how to handle containers. It delegates the job to another service called Amazon Elastic Container Service (ECS). Inside Amazon ECS you can create task definitions, which are instructions on how to run a single container. These task definitions are very, very similar to the Dockerrun "Container Definitions". So in order to work with Container Definitions, we can look at the documentation for ECS task definitions.
Adding container definitions to Dockerrun
In the root folder /complex, create a `Dockerrun.aws.json` file:
{
"AWSEBDockerrunVersion": 2,
"containerDefinitions": [
{
"name": "client",
"image": "ricardohan/multi-client",
"hostname": "client",
"essential": false,
}
]
}
Inside `containerDefinitions`, each entry (or object) is a different container that will be deployed to AWS.
It has a `name` (it can be any name, but we keep the pattern and name it client as well). It also has a target `image`; we can specify just our DockerHub username and the image name because deployment services like AWS are integrated with DockerHub. That’s why we can reference ricardohan/multi-client directly.
It also has a `hostname`. Remember the "services" names in docker-compose, where we specified each container we had? All those names under "services" acted like hostnames, so every container managed by docker-compose could reference the other containers just by their hostname. This `hostname` here is the same thing, but in the AWS context.
`essential` is false because it indicates this "client" container is not essential. If a container marked essential: true has a problem, all the other containers are shut down together with it. At least one container here has to be essential: true.
More Container Definitions
{
"AWSEBDockerrunVersion": 2,
"containerDefinitions": [
{
"name": "client",
"image": "ricardohan/multi-client",
"hostname": "client",
"essential": false
},
{
"name": "server",
"image": "ricardohan/multi-server",
"hostname": "api",
"essential": false
},
{
"name": "worker",
"image": "ricardohan/multi-worker",
"hostname": "worker",
"essential": false
}
]
}
Forming Container Links
Now let’s create the container definition for the Nginx container:
{
"name": "nginx",
"image": "ricardohan/multi-nginx",
"essential": true,
"portMappings": [
{
"hostPort": 80,
"containerPort": 80
}
],
"links": ["client", "server"]
}
Nginx doesn’t need a “hostname” because no other container is referencing Nginx by name.
We have a `portMappings` entry here, which just maps the host port to the container port. So when accessing PORT 80 on the host (the server), we reach PORT 80 inside the container, which is the default port for nginx.
We also defined a `links` property; it makes an explicit connection between the nginx container and the containers it depends on, in this case client and server. It’s like nginx has to know about client and server (these names map to the `name` defined in each container definition).
Creating the EB Environment
Go to the AWS console, search for Elastic Beanstalk.
Create a new project, then create a new environment, a web server environment; the platform should be Multi-container Docker, and choose the sample application option in the form. Then create your environment.
Note that in all discussion about the containers we never talked about Postgres and Redis! Let’s talk about running databases in containers.
Managed Data Service Providers
In our Dockerrun.aws.json file, we have nothing about Postgres or Redis, as opposed to our docker-compose file, in which we declared the Postgres and Redis instances. Let’s talk about that.
Remember that docker-compose is used for the development environment and Dockerrun.aws.json is for production.
In the development environment, we have our Redis and Postgres running as containers, along with the server, worker, client…
In the production environment, we will change this architecture:
Inside the ElasticBeanstalk instance, we will run the Nginx router, the Express server, the Nginx with the client files and the worker. We already set up these services to run with Amazon EB, inside the Dockerrun.aws.json.
Redis and Postgres will live in external services: Postgres in RDS (Amazon Relational Database Service) and Redis in AWS ElastiCache. These are general services that can be used with any application.
But why will we use them to host Redis and Postgres instead of defining our own containers for them?
AWS ElastiCache:
- Automatically creates and maintains an instance of Redis for us, with all the production-ready features.
- Super easy to scale
- Built-in logging and maintenance
- Security
- Easier to migrate off ElasticBeanstalk if we ever want to, because ElastiCache is completely separate from it.
AWS RDS:
- Same points as above.
- Automated backups and rollbacks
- Good database backups are hard to do on your own
In this project we will use these external AWS products, and it’s recommended: although they cost money, it’s a small amount and it saves a lot of our time.
In the next project we will learn how to set up Postgres and Redis from scratch, though.
Overview of AWS VPC’s and Security Groups
We will have our ElasticBeanstalk with our containers inside. Then we have to connect some of these containers with RDS (Postgres) and with EC (Redis). By default these services cannot talk to each other, but we can create a connection between them using a set of configurations in AWS. This is completely unrelated to Docker.
AWS works with regions, you can see your region in the console. Each region has a default VPC (Virtual Private Cloud) set just for your account. VPC is like a private network for your services in that specific region. So our ElasticBeanstalk, RDS and EC will be inside a VPC in us-east-2 (Ohio) for example.
VPC is also used to implement a lot of security rules and connect services.
To make the services inside a VPC connect to each other, we have to create a Security Group. A Security Group is basically a firewall rule: it allows or disallows different services to connect and talk to the services in your VPC.
For example, in our ElasticBeanstalk, we allow any IP in the world to connect, as incoming traffic, to PORT 80 of our EBS service.
To see the security groups, in the VPC console, click the Security Groups section; you can see your services and the inbound and outbound rules for each.
To make our services talk to each other, we create a security group and allow other AWS services in the same security group to talk to our service. That’s how we make them communicate.
RDS Database Creation
Make sure you are in the same region as your EBS instance. Search for RDS in the console.
Click "Create Database" and select your database type (use the free tier to avoid paying).
In the settings, choose an identifier name and your credentials. Remember the master username and password, because you will need them to connect your EBS to Postgres. Fill in some other additional configs and create your database.
ElastiCache Redis Creation
Search for ElastiCache in the console. Find Redis in the left column and create a new cluster. Just make sure that in "Node type" you have selected the t2.micro instance, otherwise you’ll pay a lot of money. Also change the number of replicas to 0; we don’t need any replicas.
Make sure your VPC ID is the default VPC, so we can communicate to EBS and other services in our VPC. Then create your Redis instance.
Creating a custom security group
Now that we’ve created Redis and Postgres, let’s create the security group so these services can communicate with each other and with EBS.
Search VPC, Security Groups, create new security group. Choose the name and a description, and select the default VPC.
Add an inbound rule (a CLI sketch of the same thing follows this list):
- Choose the type as Custom TCP rule
- Protocol is TCP
- Port Range: we could open all Ports, but since we only need to open the Postgres PORT and Redis PORT, let’s do the range of 5432-6379
- Source: put your security group id here, so the services can connect to each other within the same security group.
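For reference, roughly the same thing could be done with the AWS CLI (a sketch; the group name, description and the vpc/sg IDs below are placeholders, not values from this project):
aws ec2 create-security-group --group-name multi-docker --description "Traffic between multi-docker services" --vpc-id vpc-xxxxxxxx
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 5432-6379 --source-group sg-xxxxxxxx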
Applying Security Groups to Resources
Now that we’ve created the security group rule, we need to apply the group to the services (RDS, EBS, EC), so they actually belong to it.
First EC (Redis). Search for it in the console, go to your created instance, click the Actions button and then Modify. In the VPC security group, edit it and choose your created security group (you can leave the default selected as well).
Now RDS: click on your database, then the Modify button, and select your newly created security group. Then apply the changes and that’s it.
Now EBS: select your instance and click the Configuration section on the left side. Click "Instances" and select your new security group. Apply and confirm the changes.
Setting environment variables
Now we need to guarantee our containers inside EBS know how to reach out to RDS (Postgres) and EC (Redis). For this, we need to set env variables, just like we did inside docker-compose.
In docker-compose the “api” and the “worker” have environment variables defined. So we have to define them in AWS as well.
In the EBS page, go to configurations and modify software. You can see a key-value input to place your env variables.
For REDIS_HOST, we cannot simply put "redis" as we did in docker-compose. We have to put the ElastiCache endpoint: go to ElastiCache -> Redis -> primary endpoint and copy the URL (not the PORT). Do the same for env variables like PGHOST: go to RDS, find the host endpoint and paste it into the env variable section.
All the env variables defined in the EBS are applicable to the containers inside EBS.
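As a sketch, the values end up being roughly the same variables the api and worker read in docker-compose (the exact names depend on your compose file, so treat these as assumptions):
REDIS_HOST=<ElastiCache primary endpoint>
REDIS_PORT=6379
PGHOST=<RDS endpoint>
PGPORT=5432
PGUSER=<RDS master username>
PGPASSWORD=<RDS master password>
PGDATABASE=<RDS database name>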
IAM Keys for Deployment
In the travis.yml file, we are building the production images and pushing them to DockerHub. Now we have to send a message to EBS and say: “It’s time to pull these images and deploy them!”
We will send the entire project to EBS, but EBS only needs to look at the Dockerrun.aws.json file. It will read this file and pull the images from DockerHub.
Before writing things down to travis.yml file, let’s generate a new set of keys in AWS IAM console.
Search for IAM -> Users (to create a new user with deploy credentials to EBS)
Add a user, set a name and give it programmatic access. Then set the permissions, attaching existing policies directly: search for beanstalk and add all the policies.
This will generate an Access Key ID and a Secret Access Key.
Open your Travis dashboard, go to your multi-docker project, options, settings and add the aws_secret_key and aws_access_key env variables.
Now we just have to write the deploy script in the travis.yml file.
Travis Deploy Script
deploy:
edge: true
provider: elasticbeanstalk
region: us-east-2
app: multi-docker
env: MultiDocker-env
bucket_name: elasticbeanstalk-us-east-2-709603168636
bucket_path: docker-multi
on:
branch: master
access_key_id: $AWS_ACCESS_KEY
secret_access_key: $AWS_SECRET_KEY
“edge” is just to make sure Travis uses v2 of the dpl (deploy) script, which doesn’t have a certain bug. region, app and env you can find in the EBS console. The bucket_name and bucket_path refer to the S3 bucket Travis uses to zip all the project files: go to S3, make sure there is a bucket you can use, and copy its name. The bucket_path is a name you can choose.
We also tell Travis to deploy to EBS only when the master branch is built. And finally we put the AWS keys we defined in the Travis console.
Container Memory Allocations
In our Dockerrun.aws.json file, we need to specify the amount of memory we want to allocate to each service. It’s recommended to search for posts about best practices on memory allocation for containers, but in this project we will set 128mb for each one:
"memory": 128
Verify Deployment
You can see the logs of the deployment in the EBS console. If there is an error, go through the logs and search for it.
Cleaning up AWS Resources
We have to shut down the EBS (ElasticBeanstalk), RDS and EC (ElastiCache).
We don’t have to delete the security group we created because there is no billing. But we will clean up anyway.
Go to EBS, find your application, Actions, Delete application. It takes a while to terminate. In RDS, go to your databases, click the checkbox, Instance actions and delete it (don’t create snapshots). In EC, go to the Redis tab, select your checkbox and hit the delete button. To clean up the security group, go to VPC, Security Groups, select the multi-docker checkboxes and delete them via the Actions button. We can also delete our IAM user, the multi-docker-deployer.
Onwards to Kubernetes
Imagine we have to scale up the previous project. We had an EBS instance with Nginx, server, client and worker containers. To scale the app, we need the worker to scale up, because it’s the worker that does the heavy job of calculating fibonacci values. But EBS only scales the whole instance; we cannot control the scaling of a single container, so we cannot scale only the worker container up to 10 worker containers. That’s where Kubernetes comes in.
With Kubernetes we have a Kubernetes cluster, which is formed by a Master + Nodes. Each Node is a virtual machine or a physical machine containing one or more containers, which can be of different types. One Node can have 10 worker containers, or 1 worker and 2 client containers, or only 1 server container, whatever. The Master is responsible for managing and controlling what each Node does.
External services access the Kubernetes cluster through a Load Balancer, which manages the incoming requests and sends them to the Nodes of the cluster.
So Kubernetes works for that case, when we need to scale different amounts of containers and have to control how and which containers to scale up.
Kubernetes in Development and Production
It’s very different using kubernetes in development mode and production.
In development, we use a CLI tool called `minikube`, whose sole purpose is to run a Kubernetes cluster on our local computer.
In production, we use managed solutions from third-party cloud providers like AWS (EKS) and Google Cloud (GKE). We could also set up an environment on our own, but the managed solutions are more secure and generally better.
How does Minikube work?
It sets up a virtual machine (a single Node) that runs multiple containers for you, on your local machine.
To interact with this single Node, we use a tool called `kubectl` to manage and access the containers.
Minikube is used only in DEVELOPMENT. kubectl is used in production too.
We have to install a virtual machine driver to run the VM on our local machine. It will be VirtualBox.
Installing Kubectl, Minikube and VirtualBox
Just followed installation guides
Mapping existing Knowledge
After all the installing, run `minikube start`, and then `minikube status`; you can see the server running in our VM.
With `kubectl cluster-info` you can see Kubernetes running.
Our first task is to run the multi-client image (the React app from the multi project) in our local Kubernetes cluster, as a container.
To get our bearings, remember how things work in the docker-compose world:
- Each service can, optionally, tell docker-compose how to build an image
- Each service represents a container
- Every service defines some networking requirements, like PORT mappings.
In the Kubernetes world, these points become:
- Kubernetes expects all images to be already built. We need to do this outside of Kubernetes.
- We have multiple configuration files, each one creating a different object (which is not necessarily a container)
- We have to manually set up all networking. There is no single file to manage PORT mappings, sadly =|
To make sure we can do these points in local Kubernetes:
- Make sure our image (multi-client) is built and hosted on docker hub
- Create one config file to create our container
- Make one config file to set up networking between our container and the outside world
Adding Configuration Files
Let’s create a new project directory and create a new config file. The project is simplek8s
Create `client-pod.yaml`. The goal of this file is to create our container.
The other file is `client-node-port.yaml`, to handle our networking.
Object types and API Versions
Now explaining what those files do, in detail:
First, let’s see what the property `kind` means in these .yaml files.
With Docker Compose, Elastic Beanstalk and Kubernetes, we always have a config file describing the containers we want. But with Kubernetes, we are not creating containers in the .yaml config file: we are creating an Object.
We take the Kubernetes config files we created and feed them to kubectl. kubectl interprets them and creates Objects from these files. We are creating specific types of Objects: it could be "StatefulSet", "ReplicaController", "Pod", "Service"…
So the `kind` in the config files means the type of Object we want to create in Kubernetes.
What is an Object? Objects are things that will be created inside our Kubernetes cluster. Some Objects are used to run a container, like the `Pod` type. Others will monitor a container; a `Service` sets up networking for our containers…
Now about `apiVersion`: it just scopes the Object types we have available in the config file.
We specified apiVersion: v1, which means we have access to a certain set of Objects.
If we specified apiVersion: apps/v1, we would have access to other types of Objects.
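A few concrete pairings, as a rough (not exhaustive) guide:
# apiVersion: v1       -> Pod, Service, ConfigMap, ...
# apiVersion: apps/v1  -> Deployment, StatefulSet, ReplicaSet, ...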
Running containers in Pods
What is a Pod?
When we ran `minikube start`, it created a VM: that is a Node. This Node will be used by Kubernetes to run some number of different Objects. The most basic Object is a Pod. So when we specify `kind: Pod`, we are creating a Pod inside our virtual machine.
A Pod is basically a group of containers with a very common purpose. In Kubernetes we cannot create a container standing alone in a Node; we need to create a Pod. It is the smallest thing we can deploy to run a container in Kubernetes.
But why do we need a Pod at all? A Pod runs one or more containers inside it, but the goal of a Pod is to group containers with a very similar purpose, a tight relationship. So it’s wrong to just throw arbitrary containers into a single Pod.
A good example of a Pod with multiple containers is a Pod with a Postgres container plus logger and backup containers. The logger and backup are tightly integrated with Postgres and could not function without it.
So looking at our file, it makes more sense:
apiVersion: v1
kind: Pod
metadata:
name: client-pod
labels:
component: web
spec:
  containers:
    - name: client
      image: ricardohan/multi-client
      ports:
        - containerPort: 3000
We are creating a Pod with one container running in it. The container is named "client"; we can use this name for debugging or to create connections with other containers in the Pod. We are also exposing PORT 3000 to the outside world. But this is not the full story for PORT mapping: this line alone will not do anything, it needs the second config file, for networking.
About the `metadata` field: "name" is the name of the Pod, and "labels" is coupled with the other config file, the kind: Service one.
Services Config files in depth
Another type of Object is "Service", which sets up networking in a Kubernetes cluster.
There are subtypes for the Service type:
- ClusterIP
- NodePort
- LoadBalancer
- Ingress
Subtypes are specified inside `spec.type`.
Our example is a NodePort service. It exposes a container to the outside world, so we can access the running container in our browser. In production we don’t use it, with a few exceptions. We will talk about the other subtypes later.
So explaining how NodePort works (remember, only development):
Imagine we have a browser request. The request enters the Kubernetes Node through kube-proxy, which comes with every Node. This proxy’s job is to redirect the request to the right Service, and then the Service can forward the request to the right Pod.
But how our files communicate to each other? How our Service file knows it has to handle request to our Pod? It uses the label selector system.
In the Pod file, we have metadata -> labels -> component: web. And in the Service file: spec -> selector -> component: web.
That’s how the relationship between these Objects works: the Service looks for Pods that have component: web.
The key-value pairs are totally arbitrary, but they have to match. It could be `tier: frontend`, for example.
Now let’s talk about the collection of PORTS we have in the Service.
ports:
- port: 3050
targetPort: 3000
nodePort: 31515
The `port` is the PORT another Pod or another container would use to communicate with our multi-client Pod.
The `targetPort` is the PORT inside the Pod; it means we want to send the traffic to this PORT.
The `nodePort` is the PORT that we, developers, will use in the browser to access the Pod (we can pick a number in the range 30000-32767).
One of the reasons we don’t use NodePort in production is these weird PORT numbers.
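Putting it together, the whole client-node-port.yaml looks roughly like this (a reconstruction from the pieces above):
apiVersion: v1
kind: Service
metadata:
  name: client-node-port
spec:
  type: NodePort
  ports:
    - port: 3050
      targetPort: 3000
      nodePort: 31515
  selector:
    component: web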
Now let’s load these Object files into kubectl tool and try to access the running container
Connecting to Running Containers
kubectl apply -f <filename path>
kubectl is what we use to manage our Kubernetes cluster; apply means we want to change the configuration of our cluster; -f means we want to specify a file.
So we will run this command twice, one for Service and one for Pod.
kubectl apply -f client-pod.yaml
kubectl apply -f client-node-port.yaml
Let’s check if they were created successfully: kubectl get <objectType>
kubectl get pods
It prints:
NAME READY STATUS RESTARTS AGE
client-pod 1/1 Running 0 109s
1/1 in the READY column means 1 container is ready out of the 1 container in this Pod.
kubectl get services
Prints:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
client-node-port NodePort 10.111.105.31 <none> 3050:31515/TCP 3m58s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 147m
The kubernetes service is there by default.
Well, now our Objects are running in our cluster, but we cannot access them in the browser yet. Our Kubernetes Node is a VM created by Minikube, so it has its own IP address; we don’t have access to it through localhost.
So in the terminal: `minikube ip`
This gives us an IP address. We won’t be using localhost when using Kubernetes with minikube.
And we have to append the PORT we set in `nodePort`. Something like 172.17.0.2:31515.
The entire deployment flow
What happens when we feed the config files to kubectl?
Let’s imagine we have a config file with the goal of running 4 instances of multi-worker image.
We run the command to deploy this file, using `kubectl apply`. Then the file goes to the "Master", which is the general manager of the Kubernetes cluster. Inside it there are 4 programs, but for simplicity, let’s imagine it has only the "kube-apiserver".
The kube-apiserver is responsible for monitoring the Nodes of the cluster and making sure they are operating okay. It will also interpret the config files we feed into it and update its internal list of responsibilities according to our command. Something like:
“I need to run multi-worker, 4 copies, and I’m currently running 0 copies of it”
So it will reach out to the running Nodes of the cluster. Remember, the Nodes are just VMs running Docker. The kube-apiserver will say: hey Node1, run 2 copies of multi-worker; Node2, run 1 copy; Node3, run 1 copy. In total we will have 4 copies, as the config file asked.
The Docker daemon running inside each VM will reach out to DockerHub, pull the image and create containers from it. Node1 will create 2 containers.
The "Master" will then check that everything is good, with 4 copies running. The "Master" is continuously watching the Nodes to check that everything’s good. If a container dies, the Master sends a new message to the Node to run another copy of multi-worker.
All commands pass through the "Master", and the Master reaches out to the Nodes.
The Master keeping a list of responsibilities to fulfill and constantly checking the Nodes to see whether the list is satisfied is one of the main concepts of Kubernetes.
Imperative and Declarative deployments
Important Notes:
- Kubernetes is a system to deploy containerized apps, to run containers
- A Node is a physical machine or VM that runs some number of Objects (or containers)
- The Master is a machine (or VM) that manages everything related to the Nodes
- Kubernetes doesn’t build the images; it just gets them from DockerHub
- The Master decides where to run each container. We can control it with the configuration file, but by default the Master decides alone
- To deploy something, we update the Master’s list with our config files (like saying: hey Master, I need you to create 4 copies of multi-worker)
- The Master constantly checks whether the list is fulfilled
There are two ways of approaching deployments:
- Imperative deployments
- Declarative deployments
Imperative gives orders like "Create this container!". Declarative just says to the Master "It would be great if you could run 4 copies of this container", just directions.
With the imperative approach you have to specify every step of the flow. With the declarative approach you just say what you want, and Kubernetes does the rest.
When we research the web, we will see lots of people telling us to take the imperative approach, issuing commands and so on. But Kubernetes is nice enough to have its own declarative way, so be skeptical when you read a blog post telling you to update a Pod imperatively: we can just update a config file and send it to the Kubernetes Master, and it handles the rest automatically.
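As a rough illustration of the two styles (the image and file names come from this project; treat the commands as a sketch, not the course’s exact steps):
# Imperative: tell the cluster exactly what to create, step by step
kubectl run client-pod --image=ricardohan/multi-client --port=3000
# Declarative: describe the desired state in a file and let Kubernetes reconcile it
kubectl apply -f client-pod.yaml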