Docker basics
Old notes about docker basic commands
Why use Docker?
It makes it really easy to run software without the trouble of setup and installation.
What is Docker?
Docker is an ecosystem of tools like:
- Docker client (docker cli)
- Docker server (docker daemon)
- Docker machine
- Docker images
- Docker compose
- Docker hub
The objective is to create and run containers.
When you run a program with Docker, what happens is:
- The Docker CLI reaches out to Docker Hub and downloads a single file called an Image
- An Image is a file containing all the dependencies and configs required to run a program
- This image gets stored on your hard drive
- You can use this image to run a container (an instance of an image). A container is like a running program
docker run hello-world
This means we want to start a new container using an image called hello-world. First, Docker searches for the hello-world image on our computer, inside the image cache. Since the cache is empty, it goes to Docker Hub, a free repository with images you can download. Docker then downloads the image to our computer and stores it in the image cache.
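To see the image cache in action (a small sketch, assuming hello-world has never been pulled on this machine):
docker run hello-world    # first run: the image is downloaded from Docker Hub, then a container runs it
docker images             # hello-world now shows up in the local image cache
docker run hello-world    # second run: no download happens, the cached image is reused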
But really, what is a container???
First, an overview of the OS in our computer:
- Kernel: a running software process that governs access between all running programs and all physical hardware. It connects these two parts (programs and hardware).
- The running programs (like Node.js, Chrome, Spotify…) interact with the Kernel through system calls, which are like functions the Kernel exposes: "if you want to write a file to the hard drive, call this function". The Kernel exposes these different "endpoints", or system calls, so your running processes can call them and do whatever they need.
Let's imagine we have two programs running on our machine: one needs Python 2 and the other needs Python 3, but our hard drive can only have Python 2 installed. Only one of the programs will run.
We can solve this problem with a feature called namespacing, which segments specific resources on our hard drive. On one side we can have Python 2 and on the other Python 3. The kernel will then redirect each system call to the right portion of the hard drive. We can isolate resources based on what the program is asking for.
Another feature related to namespacing is control groups (cgroups), which limit the amount of resources a process can use: memory, CPU, hard drive I/O, network…
So a container is a block containing a running process and the resources assigned to it: the program, the kernel, and the things needed to run your program, like RAM, network, CPU and hard drive. A container is not a physical thing, it is more abstract.
An Image is basically a filesystem snapshot, a copy of a very specific set of directories and files. We can have an image with Chrome and Python. An Image also has a start-up command. When we run the image, the kernel reserves a portion of the hard drive for the image; in this case, Chrome and Python would be there.
How Docker is running in your computer
Namespacing and control groups belong to the Linux OS. So how do macOS and Windows run Docker?
When you installed Docker, a Linux virtual machine was installed along with it. It is inside that VM that containers are created: a Linux kernel hosts the running processes inside containers and is responsible for limiting access to the hard drive, memory, and so on.
Docker run in detail
docker run <image name>
uses an image to create a container.
When we run docker run hello-world, there is an image file on our hard disk with a filesystem snapshot containing only one program: hello-world. When we run it, that snapshot is placed in the segment of the hard drive set aside for the container. Then, after the container is created, the startup command inside the image starts the running process, which calls the kernel, and the kernel talks to the needed resources.
Overwriting default commands
We can do: docker run <image name> <your command>
<your command> will override the default command inside the image.
In docker run busybox echo hi there, the command is echo hi there, and it will override the start command.
If you do docker run busybox ls
, you will see:
bin
dev
etc
home
proc
root
sys
tmp
usr
var
These folders are not your computer's folders. They come from the snapshot of the image and were placed on your hard drive, for your container.
These override commands (or startup commands) only work because the image we are using has a filesystem snapshot that contains the command's executable. So if you try docker run hello-world echo hi there, it will not work, because you tried to use echo inside the hello-world image's filesystem, and the only thing that exists inside the hello-world image is one single file that prints a hello world.
In docker run busybox, the ls command worked because the filesystem of the busybox image has the ls executable somewhere inside bin (or another directory).
Listing running containers
docker ps
Lists all the containers currently running on your machine.
docker ps --all
Lists all the containers we have run before, including stopped ones.
Container lifecycle
docker run
can be split into two different commands:
docker create
and docker start
.
docker create
is just about taking the filesystem snapshot from the image and setting it up in the new container.
docker start
is to execute the start command.
➜ docker create hello-world
f370e82d1c4980d68126ae708781ce0fc2d2a1a28e8809c97f62e3d6235c403b
➜ docker start -a f370e82d1c4980d68126ae708781ce0fc2d2a1a28e8809c97f62e3d6235c403b
The -a
will attach to the container and print that container's output.
Restarting Stopped Containers
When we do docker ps --all, we see the exited containers, and we can still use them again.
docker start -a <containerID>
We cannot replace the default command when restarting a stopped container this way; it will run its default command again.
Removing Stopped Containers
docker system prune
It will delete all stopped containers plus other things, like your build cache, which includes the images you've downloaded from Docker Hub.
Retrieving Log Outputs
If you forget to attach with -a and want to look at a container's output, you can use docker logs <containerID> to retrieve all the output from it.
docker logs
don’t restart the container, just retrieve the outputs.
Stopping containers
docker stop <containerID>
With stop, we send a SIGTERM (terminate) signal to the process inside the container, which then has some time to do cleanup.
docker kill <containerID>
Sends a SIGKILL signal: the process has to shut down right now, without any cleanup.
If you run a docker stop and the container don’t stop in 10 seconds, docker will fallback to docker kill.
Multi-Command Containers
For example, when we run docker run redis, we start the redis server inside Docker.
To interact with it, we have to use redis-cli, but it has to run inside that container.
We can do it with docker exec, which executes an additional command inside a running container.
docker exec -it <containerID> <command>
it
allows us to type input directly into the container.
<command>
is the command we want to execute inside the container.
➜ docker exec -it 8934b3cb6923 redis-cli
The purpose of the it flag
When you are running docker on your machine, every container is running inside a Linux VM.
Every process we create in a Linux environment has 3 different communication channels:
- STDIN
- STDOUT
- STDERR
These channels are used to communicate information into the process (STDIN) or out of the process (STDOUT).
STDIN is stuff you type. STDOUT is stuff that shows on your screen. STDERR is stuff that shows on your screen as an error.
The it
is two separate flags, -i
and -t
.
docker exec -i -t
is same as docker exec -it
-i
means we want to attach to the STDIN channel of this new exec'd process, so we can type things into it.
-t
is what displays the text nicely formatted (it allocates a pseudo-terminal).
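A small sketch of the difference (the container ID is a placeholder):
docker exec -i <containerID> sh     # input still works, but there is no shell prompt and the output isn't formatted
docker exec -it <containerID> sh    # input works and you get a proper, nicely formatted terminal prompt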
Getting a command Prompt in a container
We will often want to open a shell (a terminal) in our Docker container, so we can interact with it without having to type docker exec all the time.
docker exec -it <containerID> sh
The sh
is the magic. Now we're inside a Docker container and can execute any Unix command. Full terminal access, excellent for debugging.
sh
is the name of a program, a shell. There are several command processors:
- bash
- powershell
- zsh
- sh
So sh
is basically another program, one used to execute commands.
Starting with a shell
We could start a shell right after we start a docker container: docker run -it <imageName> sh
.
docker run -it busybox sh
The downside is that you won't be able to run any other process (the shell is the primary process). It is more common to start a container with a primary process and then attach a running shell with docker exec -it ...
Container isolation
It has to be clear that two containers do NOT share the same filesystem.
Creating Docker Images
To do so, we will use:
- A Dockerfile: a plain text file of configs that defines how our container behaves, what programs it contains, and what it does when it starts.
- The Docker client: the Docker CLI, which takes the file and passes it to the Docker server.
- The Docker server: it does the heavy lifting and builds a usable image that can be used to start a new container.
To create a dockerfile:
- Specify a base image
- Run some commands to add dependencies or other programs
- Specify a startup command, to boot up a container.
Building a dockerfile
The goal now is to create an image that runs redis-server.
So let’s write the code and explain after:
- Create a directory like “redis-image”
- Create a Dockerfile file, with no extension and the first letter uppercase
- Inside the file:
# Use an existing docker image as base
FROM alpine
# Download and install a dependency
RUN apk add --update redis
# Tell the image what to do when starts as a container
CMD ["redis-server"]
- Go to terminal, inside the directory and:
docker build .
// "docker build ." will generate an ID
docker run <generatedID>
That’s it, we are running a redis-server with our own custom image
Dockerfile teardown
Every line in the Dockerfile starts with an instruction, telling the Docker server a specific preparation step for the image.
FROM
specifies the Docker image we want to use as a base.
RUN
executes a command while we are building our custom image.
CMD
specifies what should be executed when a container starts from our image.
These instructions are the most used and important.
What is a Base Image
Let's start with an analogy: imagine you have an empty computer without an OS and you have to install Chrome. You would first install an OS, then open a browser, go to the Chrome page and download it, etc.
The FROM alpine
is just like installing an OS onto that empty computer: our Dockerfile starts from nothing, so we need a starting point.
We are using alpine
because it has a set of programs we need to start building our image, just like Windows or macOS have their own sets of programs.
In this case, alpine helps us with the second command we ran: RUN apk add --update redis.
apk is the Alpine package manager that comes with alpine, and we used it to install Redis. So apk add --update redis was only possible because we started from alpine.
The Build Process in Detail
When we execute docker build ., we take our Dockerfile and generate an image out of it. We hand the Dockerfile over to the Docker CLI.
The . is the build context: the set of files and folders we want to make available for building our image. We will see more on this later.
When we do docker build .
:
- With FROM alpine, Docker downloads the alpine base image.
- For RUN apk add --update redis, Docker creates a temporary container out of the alpine image and runs the command inside it.
- This command installs Redis inside that temporary container.
- Docker takes a snapshot of the container's filesystem, with Redis installed.
- The temporary container is removed.
- With CMD ["redis-server"], Docker looks at the image snapshot built in the previous step (the one with Redis) and builds a container out of it.
- CMD ["redis-server"] sets redis-server as the primary startup command for that container; a snapshot is taken and this container is also shut down.
- The end result is a final image with a filesystem snapshot containing Redis and the startup command (redis-server).
Rebuilds with Cache
If we run docker build .
again, it will use the cache, because Docker sees that the Dockerfile commands didn't change and retrieves the last results from the cache on my local machine.
But the cache follows the order of the instructions, so if you change the order it will not be used.
Try to always put new commands near the end of your Dockerfile, because Docker will only re-run the changed parts and will take everything above from the cache.
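For example, if we later add another dependency (gcc here is just an illustrative package) below the Redis line, the Redis step is still taken from the cache; putting the new line above it would bust the cache for everything that follows:
# Use an existing docker image as base
FROM alpine
# Download and install a dependency (cached from the previous build)
RUN apk add --update redis
# New instruction added last, so only this step runs
RUN apk add --update gcc
# Tell the image what to do when starts as a container
CMD ["redis-server"]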
Tagging an Image
When we do docker build . it will generate an image ID, so we can run the container by doing docker run <imageID>.
We can give the image a name, so we don't need to reference its ID every time we want to run the container.
We do so by docker build -t <tag the image> .
By convention, <tag the image> is:
- your docker ID/project name:version
So it would be docker build -t stephengrider/redis:latest .
Now we can start the container by doing
docker run stephengrider/redis
We don't need to specify the version; latest is assumed by default.
Manual Image Generation with Docker Commit
As we saw before, we can create an image from a container by taking snapshots of the container, just like the temporary containers in the build process do.
So we can both create a container from an image and vice versa.
Let's see how to manually create an image from a container, just for the sake of curiosity.
docker run -it alpine sh
The command above starts a container from alpine, with sh starting a shell. Now, inside the container shell:
apk add --update redis
Now we installed redis in that running container.
In another terminal, we will create a snapshot of that container and assign a startup command to it:
docker ps
docker commit -c 'CMD ["redis-server"]' <containerID>
The output is the ID of our new image we just created.
We can do docker run <imageID>
and it will start redis server.
As you can see, there is a nice relationship between images and containers!
04 - Making real projects with Docker
Project Outline
- We will create a simple NodeJS web app
- Create a Dockerfile, to create an image
- Build image from the dockerfile
- Run the image as a container on our local machine (to run our web app inside a container)
- Connect from the browser to the web app
DISCLAIMER
The following lessons will contain some intentional mistakes, for teaching purposes.
NodeJS Server setup
A very simple NodeJS app
const express = require("express");
const app = express();
app.get("/", (req, res) => {
res.send("Hi There");
});
app.listen(8080, () => {
console.log("listening on PORT 8080");
});
A Few planned errors
The steps to start a NodeJS app are npm install and npm start, assuming we have npm installed.
To build our Dockerfile, the steps are similar to the previous time:
- Specify a base image
- Run commands to install additional programs
- Specify a command to run on a container start up
So let’s create a Dockerfile
and:
# Specify a base image
FROM alpine
# Install dependencies
RUN npm install
# Default commands
CMD ["npm", "start"]
Let's build that with docker build .
It will give us an error: npm: not found
Base image issues
We saw the previous error because we used the Alpine base image, which is very light and doesn't include much. npm is not available in our Alpine base image.
We could either search for another base image or install npm in Alpine.
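We'll go with the first option. Just for reference, the second would look roughly like this (assuming the nodejs and npm packages available in the Alpine package repositories):
FROM alpine
# install Node and npm into the bare alpine image
RUN apk add --update nodejs npm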
Let's use another base image. Go to Docker Hub (hub.docker.com) and search for the official node repository.
This repository is a docker image with NodeJS pre-installed on it. We could build our image based on this Node repository.
On Docker Hub you can see the versions of Node.js (in this case) we could use to build our image. In our Dockerfile, we would specify the version with FROM node:{{version}}. Let's use the alpine variant of this node image, which is the most compact and lightweight.
FROM node:alpine
Now if we run docker build .
again, it will pass the first step but crash at the second step, when running npm install: ENOENT: no such file or directory, open '/package.json'
A few missing files
We have the package.json file in our directory, but the error says we don't.
That's because when RUN npm install runs, we take the node:alpine image's filesystem snapshot and put it in a container. There is no package.json inside this container, only the files from the node:alpine snapshot.
When you are building an image, the files in your local directory are not available inside the container; you have to explicitly put them in the container.
Copying build files
We will use the COPY instruction, which moves local files from our directory into the Docker container during the build process. The syntax is:
COPY {{ path to the folder/file to copy on your local machine, relative to the build context }} {{ place to copy it to inside the container }}
The build context is the dot in docker build . , for example.
# Specify a base image
FROM node:alpine
# Install dependencies
COPY ./ ./
RUN npm install
# Default commands
CMD ["npm", "start"]
Now if you build the image it will give some warnings about package-lock.json, but it will generate your image ID successfully.
Let's build the image with a tag, instead of just using a generated ID:
docker build -t ricardohan/first-nodejs-app .
(ricardohan is my Docker ID from Docker Hub)
The last thing to do is start up the image and get the server running.
docker run ricardohan/first-nodejs-app
Container PORT Mapping
We can see that our Node server is running on port 8080, but if we try to access it in the browser, we don't see it. That's because we are trying to access port 8080 on our local machine, but the container running the Node server has its own set of ports. So we have to map a local port to the container port.
We don't do this in the Dockerfile; we do it with the run command, at runtime, because it's a runtime constraint: we only set it when we start the container.
We will use docker run -p <port for incoming requests on the local machine>:<port inside the container> <image id / name>.
So: docker run -p 8080:8080 ricardohan/first-nodejs-app
Now it works!!
You can change the PORT mapping:
docker run -p 5000:8080 ricardohan/first-nodejs-app
Now the container still runs the Node server on 8080, but on your local machine it is exposed on port 5000.
Specifying a working directory
Let’s start a shell inside the container for debugging:
docker run -it ricardohan/first-nodejs-app sh
We will get a command prompt in the root directory of the container.
Inside the prompt, if you type ls
we see all of our files and folders (package.json, index.js, node_modules, etc.) mixed in with the container's base folders such as var, lib, etc.
This is not good practice, so we will create a specific directory to hold all of our project's folders and files.
Let’s change the Dockerfile for that.
WORKDIR <folder path>
All instructions after the WORKDIR
will be relative to the folder path we specified above.
Let's put WORKDIR /usr/app above the COPY instruction, so the working directory will be /usr/app.
If the directory doesn't exist, Docker will create it.
/usr is a good place to put our files; /home or /var are also okay.
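The Dockerfile at this point looks roughly like this:
# Specify a base image
FROM node:alpine
# All following instructions are relative to /usr/app
WORKDIR /usr/app
# Install dependencies
COPY ./ ./
RUN npm install
# Default command
CMD ["npm", "start"]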
Now let's build our image again:
docker build -t ricardohan/first-nodejs-app .
And run it:
docker run -p 8080:8080 ricardohan/first-nodejs-app
In another terminal, let’s do a docker exec
to start the shell in the running container:
docker ps
docker exec -it <container id> sh
We entered directly inside /usr/app
directory.
Unnecessary Rebuilds
If we change a file in our local project, we need to rebuild the image for the change to appear. We will see more on this later, but for now let's try to optimize the build.
If we make a change to any file, Docker will see it and COPY all the files again. So everything after the COPY instruction will run again, including npm install. This is bad: we shouldn't have to install all the dependencies again, right?
So let’s fix this.
Minimizing Cache Busting and Rebuilds
We will divide the COPY instruction into two steps:
# Install dependencies
COPY ./package.json ./
RUN npm install
COPY ./ ./
So first we copy only the package.json file into the working directory. npm install will then only re-run if something changes in the package.json file.
After that, we can COPY all the other files to the working directory.
Now, if we change a project file, the only instruction invalidated is the COPY ./ ./ in our Dockerfile. That's why we put it after the RUN npm install command.
The order of the instructions in your Dockerfile does make a difference.
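Putting the WORKDIR and the split COPY together, the optimized Dockerfile looks roughly like this:
# Specify a base image
FROM node:alpine
WORKDIR /usr/app
# Install dependencies (this layer is only rebuilt when package.json changes)
COPY ./package.json ./
RUN npm install
# Copy the rest of the project (changes here don't trigger npm install again)
COPY ./ ./
# Default command
CMD ["npm", "start"]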
5 - Docker Compose with Multiple Local Containers
App Overview
Make a container that has a web app which displays the number of visits the web app has received.
We need a web server and Redis to store the number of visits. We could store it inside the server, but let's make it more complex.
Later we will scale the Node servers, while still keeping only one Redis server storing the number of visits.
App server code
Code is inside /visits/index.js
Assembling a Dockerfile
This Dockerfile will handle the server side of the application, it will have nothing to do with Redis.
FROM node:alpine
WORKDIR '/app'
COPY package.json .
RUN npm install
COPY . .
CMD ["npm", "start"]
Basically same code from previous app.
Build this image with docker build .
Actually, let's tag it: docker build -t ricardohan/visits:latest . Now, instead of the ID, we can reference this image by its tag.
Introducing docker-compose
Now that we've built the image, we can run the container with docker run ricardohan/visits.
We will get an ERROR message because there is no Redis server for our server to connect to.
To run a Redis server we can simply do docker run redis. It will get the image from Docker Hub and start a copy of Redis on our local machine.
In another terminal, if we try to run our web server, we get the same error saying we cannot connect to the Redis server.
This happens because there are two separate containers running on our machine: the Node server and Redis. They do not communicate with each other automatically, so we have to set up networking between them.
There are two options: use the Docker CLI's network features in the terminal, or use Docker Compose.
The Docker CLI approach is a pain: we would need to re-run the same commands every time.
So let's use docker-compose, a separate CLI tool. It exists so we don't have to type lots of repetitive Docker CLI commands and options.
It makes it very easy to start multiple Docker containers and connect them.
Docker compose files
We will essentially take the same commands we ran before (docker build…, docker run…) and write them in a docker-compose.yml file. We feed this file to the docker-compose CLI and it handles the rest.
In docker-compose.yml we specify the containers we want to build, the images used to create them, and the port mappings from the containers to our local machine.
version: "3"
services:
  redis-server:
    image: "redis"
  node-app:
    build: .
    ports:
      - "4001:8081"
We specified the version of the docker-compose file format to use.
Then, inside services, we specify all the containers to run and the images to build them from.
Since node-app is built locally using the Dockerfile, instead of specifying an image we specify how to build it. build: . means: look for a Dockerfile in the current directory and build it.
Then we mapped port 8081 of the container to port 4001 on our local machine.
Networking with docker-compose
By defining the two containers in docker-compose, they are automatically connected on the same network, so we don't have to connect them manually.
But how to access the redis server from our nodejs app?
const client = redis.createClient({
host: 'redis-server',
port: 6379
});
Normally, if we were not using Docker, we would put a URL in host, something like https://myredisurl.com.
But since we are using Docker, we can use the name of the Redis service from docker-compose.
When redis.createClient tries to connect to 'redis-server', Docker sees it and redirects the connection to the Redis container. The default Redis port is 6379.
Docker-compose commands
Before, we typed docker run <image name>, but now it's docker-compose up. We don't have to specify any image name because it's already in the docker-compose file.
Before, we also had to build our image and then run it. Now we can do both in a single command:
docker-compose up --build
Stopping docker-compose containers
Before, with the Docker CLI, we could start a container in the background with docker run -d redis.
Then we could do docker ps to see all running containers and stop them with docker stop <containerID>.
In docker-compose we can do docker-compose up -d
to launch all containers in the background and docker-compose down
to stop all of them.
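A typical session, run from the directory containing docker-compose.yml:
docker-compose up -d      # build (if needed) and start all services in the background
docker-compose ps         # check the status of the running services
docker-compose down       # stop and remove all the containers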
Containers Maintenance with Docker-compose
How to deal with services that crash or have an error.
If our Node.js app calls process.exit(0) and exits, its container will stop.
Automatic container restart
process.exit(0)
, where 0 is a status code meaning we exited the app but everything is OK. This status code affects how Docker decides whether to restart our containers.
We will specify restart policies in our docker-compose file.
We have 4 restart policies:
- "no": the default for all containers; if a container crashes, never attempt to restart it.
- always: if a container stops for any reason, always try to restart it. Good for web servers, which always have to stay up and running.
- on-failure: only restart the container if it stops with an error code (other than 0). Good for worker processes.
- unless-stopped: always restart unless we, the developers, forcibly stop it.
version: "3"
services:
  redis-server:
    image: "redis"
  node-app:
    restart: always
    build: .
    ports:
      - "4001:8081"
Container status with docker-compose
docker-compose ps
Creating a Production-Grade Workflow
We will learn how to develop, test, deploy, and then do it all over again.
Flow specifics
We will work in a feature branch on GitHub, then open a PR to master and run Travis CI; if all tests pass, it will automatically deploy to AWS.
Dockers purpose
Docker makes some of those tasks really easy, but it's not necessary for the flow.
Project generation
We will make a React project and run in a container
Create a react project with create-react-app
Creating the Dev Dockerfile
We will wrap our create-react-app inside a Docker container for development purposes.
We will have 2 Dockerfiles: one for development and one for production
Let’s create the development Dockerfile. It’s a Dockerfile.dev
file.
FROM node:alpine
WORKDIR '/app'
COPY package.json .
RUN npm install
COPY . .
CMD ["npm", "run", "start"]
Go to the terminal and let’s build our Dockerfile.
Since we are using a Dockerfile.dev, we can't just run docker build . because it will look for a file named Dockerfile. To build from a custom file, we do docker build -f Dockerfile.dev .
The -f flag specifies the file we want to build from.
Duplicating dependencies
You will notice a message the first time we build it: Sending build context to Docker daemon 155.2MB. That's because create-react-app already installed the dependencies (node_modules) in our working directory, so Docker is copying all of that into the container again. We don't need the dependencies locally, since we are installing everything inside Docker, so we can delete the local node_modules folder.
If we build again, it will go much faster.
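An alternative to deleting the local node_modules folder is a .dockerignore file next to the Dockerfile (a sketch; this is not what these notes use):
# .dockerignore — paths listed here are left out of the build context
node_modules
.git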
Starting the container
Remember we have to expose the docker container PORT to our localhost.
docker run -p 3000:3000 <containerID>
Docker Volumes
Now that our React app is running in a container, let's get hot reload working: every time we change something in the code, we want to see the change reflected in the running container.
Remember that in our Dockerfile.dev we use COPY . ., copying everything in our directory into the container.
We have to change this approach and change the way we run our container, using a feature called Docker Volumes.
With Docker volumes we don't copy our files (/src, /public, and whatever else) into the container; instead, the container holds a reference to those local files. It's like mapping the container's ports to localhost, but this time we are mapping local files into the container.
The syntax to run the container with volumes is:
docker run -p 3000:3000 -v /app/node_modules -v $(pwd):/app <containerID>
Starting with the -v $(pwd):/app part:
- pwd is the path of the present working directory. It's a shortcut for (in my case) /home/ricardo/Documentos/docker/frontend.
- It basically says: take everything in my present working directory and map it to the /app folder in the container.
Bookmarking Volumes
Since we are mapping everything in our pwd (present working directory) to the /app folder in the container, we would be replacing the container's node_modules too. But we don't have a node_modules folder in our local working directory (we deleted it). So:
The -v /app/node_modules argument says:
- Don't map anything over /app/node_modules, just use what's already there in the container.
- We are basically putting a bookmark on the container's node_modules (that's what /app/node_modules does).
In these -v arguments, the : means: map the path on your local machine (before the colon) to the path inside the container (after the colon). With no colon, as in /app/node_modules, the path is simply bookmarked and left alone.
Now, if we run the container with all of these arguments, we can access localhost:3000, make changes to the files, and see the changes in the browser.
Note: hot reload is not a feature of Docker volumes, it's a feature of create-react-app. But it works because of the Docker volume, since we are running the app in a Docker container.
Shorthand with Docker Compose
The Docker volume is cool, but it's a pain to type the whole docker run -p 3000:3000 -v /app/node_modules -v $(pwd):/app <containerID> command every time.
So let’s use docker-compose.yml
for that:
version: "3"
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - "3000:3000"
    volumes:
      - /app/node_modules
      - .:/app
About the build section: in the previous docker-compose file we just did build: . . But since that looks for a file named Dockerfile and we are working with a Dockerfile.dev, we had to change things here.
We specified context: ., which is the directory we want to use as the build context, and dockerfile: Dockerfile.dev, which is the file to use for building the image.
The - .:/app line says: map my current working directory (.) to /app inside the container.
Now just by doing docker-compose up
we can run our container with the volumes working.
Do We Need Copy
We could delete the COPY . . from our Dockerfile.dev, because with Docker volumes the container just references all the files in our local working directory. But it's good to leave it there: we may stop using docker-compose in the future, or we might use this Dockerfile.dev as the starting point for the production Dockerfile, so it's better to keep it.
Executing tests
First we are going to run the tests locally, then run them in Travis CI as part of the deployment flow.
First build your image: docker build -f Dockerfile.dev .
Then we append a specific command to start the container with, at the end of docker run:
docker run -it <containerID> npm run test
The npm run test command will override the default startup command.
The -it flag makes the logs display nicely and also lets you type input into the test runner.
Live updating tests
We have the same problem as before: when we update our test files, the tests don't update inside the Docker container.
We have two approaches for that:
- create a new service with volumes in the docker-compose.yml file
- use the same service we already have in docker-compose: start the container, attach to it, and run our tests from a separate terminal.
Let’s do the second approach now:
Start your docker-compose containers with docker-compose up.
In another terminal:
- docker ps to get the container ID
- docker exec -it <containerID> npm run test
Now you are attached to the running container; when you modify your local tests, the running tests update.
Docker Compose for running tests
Now let’s do the first approach to run our tests in the container. Let’s add the test service in docker-compose.
version: "3"
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - "3000:3000"
    volumes:
      - /app/node_modules
      - .:/app
  tests:
    build:
      context: .
      dockerfile: Dockerfile.dev
    volumes:
      - /app/node_modules
      - .:/app
    command: ["npm", "run", "test"]
We added a second service to the docker-compose, so when we start it, it will build two services.
The command
line indicates which startup command we want to use; it overrides the default startup command.
Let's start docker-compose with docker-compose up --build so the images are rebuilt, since we added a new service.
The downside now is that we can't interact with the test service: it logs to our terminal but we are not attached to it.
Shortcomings on testing
We cannot interact with the test service because there is no stdin (standard input) connection from our local terminal to the container. Let's open a second terminal and attach to the test container. Remember, attaching connects stdin, but we still won't be able to type the test inputs.
If we inspect the running processes of the container with docker exec -it <containerID> sh
, and run the ps command inside it, we will see:
PID USER TIME COMMAND
1 root 0:00 npm
24 root 0:00 node /app/node_modules/.bin/react-scripts start
31 root 0:12 node /app/node_modules/react-scripts/scripts/start.js
71 root 0:00 sh
76 root 0:00 ps
When we attach, we always attach to the main process, PID 1.
When we run npm run test, what actually runs the tests is the npm process above: it sees the extra args we passed (run test), decides what to do, and starts another process to run our test suite.
The problem is that we always attach to the main process, while the process that understands test-suite commands such as p, q and other test inputs is a different process. Unfortunately, we cannot attach to it.
Need for Nginx
Let's start thinking about our production environment, where we run npm run build.
In development mode, we have a development server inside our container, serving our HTML and JavaScript files to the frontend when we access localhost:3000. This development server watches for changes and does other nice things for development purposes.
In production this dev server disappears, because it's not needed. When we run npm run build, we generate static HTML and JavaScript files, and we have to serve them to the client through a web server.
So we will use Nginx, a very popular web server whose sole purpose is routing and serving files to clients.
We will create a separate Dockerfile
for our production environment.
Multi Step Docker Builds
Our flow will be basically:
- Use node:alpine
- Copy package.json file (in order to do npm run build)
- Install dependencies
- Run npm run build
- Start Nginx
The step of installing dependencies is only necessary to run npm run build. Once we generate the build directory, we won't need the dependencies anymore.
The last step of starting Nginx, we want to use another base image besides node:alpine, which is nginx.
We will use two base images then, with something called Multi step build process
In the Dockerfile, we will have two blocks of configuration:
- One is the build phase, using node:alpine as base image. We will install dependencies and generate the build files.
- Another is the run phase, using nginx as base image. We will copy only the result of build phase (build files) and then start nginx, serving the build files.
Implementing Multi-Step Builds
Create a Dockerfile
:
FROM node:alpine as builder
WORKDIR '/app'
COPY package.json .
RUN npm install
COPY . .
RUN npm run build
FROM nginx
COPY --from=builder /app/build /usr/share/nginx/html
We have two blocks of configs, and each block has a FROM statement. A new FROM automatically ends one phase and starts another.
We tagged the node:alpine phase to be known as the builder phase.
So in the nginx phase we reference the builder with --from=builder, because we are essentially saying:
take /app/build from the builder phase and put it in the /usr/share/nginx/html directory of the nginx container.
This /usr/share/nginx/html path comes from the nginx documentation.
Running Nginx
In the terminal, build the image with docker build .
Nice!
We built the production-ready image; now let's run it, mapping the correct ports:
docker run -p 8080:80 <imageID>
- 80 is the default port of Nginx
Nice, we see the content on port 8080. Now let's think about how to deploy this to the outside world.
Continuous Integration and Deployment with AWS
Services Overview
We have containers for running npm run start and npm run test in development, and npm run build for production.
Now we have to implement the continuous integration flow:
- A feature branch where we push code to GitHub
- A master branch that we open a PR against
- Integration with Travis CI to run tests and deploy to AWS hosting
Github setup
Just create a public repository and push your local code there.
Travis CI Setup
Travis CI watches the new code we push to GitHub and does some work. We can do anything with the code: test it, deploy it, anything.
We are going to use Travis to test the code and automatically deploy it to AWS if it passes the tests.
Log in to Travis with GitHub and add your docker-react repository to Travis.
Travis YML File Config
We have to be explicit about what Travis should do.
Let’s create a .travis.yml
file in the root of project directory, to say:
- tell Travis to run a copy of docker, build an image and create a container
- use Dockerfile.dev to build an image
- run a test suite
- how to deploy our code to AWS
In .travis.yml
:
sudo: required
services:
  - docker
before_install:
  - docker build -t ricardohan/docker-react -f Dockerfile.dev .
First, we need sudo permissions to manipulate Docker.
Then we make it clear that we need a copy of Docker to run this build.
Then we define a before_install section, with steps that need to run before our tests.
There we build and tag the image.
A Touch More Travis Setup
In the .travis.yml
, let’s tell travis how run our tests
sudo: required
services:
  - docker
before_install:
  - docker build -t ricardohan/docker-react -f Dockerfile.dev .
script:
  - docker run -e CI=true ricardohan/docker-react npm run test
script contains all the commands needed to run our test suite. We are running our container with the npm run test command. The -e CI=true environment variable is there because npm run test's default behavior is to run and then wait for our input; with CI=true the test runner runs once, prints the results and exits. Travis CI doesn't actually care about the test output itself, it just cares that the return code is 0.
Automatic Build Creation
Let's commit these changes to GitHub and let Travis work.
In the Travis dashboard we should see the logs and a successful build.
AWS Elastic Beanstalk
Let’s setup an automatic deploy to AWS, using ElasticBeanstalk.
Go to AWS dashboard, ElasticBeanstalk, create a new application and a new environment, select docker as environment and create it.
Important: after we finish the course, we have to shut down the application in AWS, otherwise we will be billed.
More on Elastic Beanstalk
How EBS works:
- It has a load balancer in front of a VM running our Docker container
- The VMs are scaled up if needed
- So every time a user enters our application, the load balancer chooses the best virtual machine to send the user to.
That’s the benefit of EBS: it scales automatically.
Travis Config for deployment
In .travis.yml:
sudo: required
services:
  - docker
before_install:
  - docker build -t ricardohan/docker-react -f Dockerfile.dev .
script:
  - docker run -e CI=true ricardohan/docker-react npm run test
deploy:
  provider: elasticbeanstalk
  region: "us-east-2"
  app: "docker-react"
  env: "DockerReact-env"
  bucket_name: "elasticbeanstalk-us-east-2-709603168636"
  bucket_path: "docker-react"
  on:
    branch: master
We specified a new block called deploy.
- provider tells Travis which service we are using to deploy.
- region depends on where we created our EBS instance; you can see it in the URL EBS generated.
- app is the name of your EBS application.
- env is the environment name of the app, shown in your EBS dashboard.
- bucket_name works like this: when the EBS app is created, AWS automatically creates an S3 bucket in our account, which is basically a hard drive to store things. When Travis wants to deploy our app, it zips all of our files into one file and puts it in that generated S3 bucket. So this is the name of that bucket. To find it, go to S3 in your AWS dashboard; the name will be something like elasticbeanstalk.<aws-region>.27e7r2
- bucket_path: this automatically created S3 bucket is reused for every EBS app we have. Inside the bucket there are folders named after each EBS app, so bucket_path is the name of your EBS app. The folder inside the S3 bucket is only created the first time you deploy.
- Finally, the on: branch: master part says to only deploy the app to EBS when code changes on the master branch.
Automated Deployments
We have to configure API keys so that Travis CI has access to our AWS account and can deploy things.
Let's search for IAM in the AWS dashboard. IAM is the service for managing API keys used by outside services.
Go to Users (to generate a new user for Travis CI) and add a new user.
Let's give it a descriptive name like docker-react-travis-ci and choose Programmatic access as the access type.
Then we have to give permissions to this user; permissions are granted as policies. We want to give access to EBS, so search for elasticbeanstalk and give this user full access to EBS.
Then you create the user. This generates API keys to be used in Travis CI. The secret access key is shown only once, so write it down somewhere.
This example keys are:
ID key: AKIA2KN5FCF6PR5R7GFX Secret access key: LtxR2OcKGZNPNgfG4QQSn66D4AdenpkGD799EYEW
Grab your access key ID and the secret access key. Let's not put them directly in the yml file; instead, we'll use the environment variables feature Travis CI has. Go to the Travis dashboard, and in your project go to More options -> Settings -> Environment variables.
Choose a name and paste your key.
Then in your yml file:
deploy:
  provider: elasticbeanstalk
  region: "us-east-2"
  app: "docker-react"
  env: "DockerReact-env"
  bucket_name: "elasticbeanstalk-us-east-2-709603168636"
  bucket_path: "docker-react"
  on:
    branch: master
  access_key_id: $AWS_ACCESS_KEY
  secret_access_key:
    secure: "$AWS_SECRET_KEY"
The variable names $AWS_ACCESS_KEY and $AWS_SECRET_KEY are the names we gave to the API keys in the Travis CI dashboard.
Let's commit and push to the master branch, and hope it gets deployed.
Exposing Ports Through the Dockerfile
After pushing our code, Travis CI runs and deploys our app, but in the EBS dashboard, if we try to access the app, it won't load. That's because we didn't expose the port of our app. To do so with EBS:
In our Dockerfile:
FROM nginx
EXPOSE 80
COPY --from=builder /app/build /usr/share/nginx/html
We added EXPOSE 80 so Elastic Beanstalk understands which port to expose for us.
For us developers, this EXPOSE makes no difference locally; it's only for EBS.
Workflow with Github
Create a feature branch, make changes, push the feature branch, and then merge it into master.
Redeploying on pull request
When we merge that PR into master, Travis CI will redeploy the app to EBS.
Deployment Wrapup
Docker is not needed for this flow, but it makes it much easier! We set this pipeline up once and we are done!
Building a Multi-Container Application (IMPORTANT STUFF)
Single Container Deployments Issues
The previous app was a single React app, with no outside services like databases. Also, we built the image multiple times: once in Travis CI and again on our server at AWS ElasticBeanstalk.
So now we are going to tackle these issues.
Application Overview
It's a Fibonacci calculator, with a server calculating the value for a given index.
Application Architecture
The user will access a URL route, which triggers a request to our Nginx server, whose sole purpose is to do some routing. If the request is trying to access a file like HTML, Nginx routes it to our React server. If the user is trying to access our backend API to submit numbers, it routes to our Express server.
The Express server communicates with Redis (to display the previously calculated values) and Postgres (to save the values the user has submitted before).
We are only doing this to use both Redis and Postgres, just for learning purposes.
Postgres stores the indexes (numbers) submitted before, and Redis stores the numbers and the calculated Fibonacci values, using a worker. The worker is a separate Node.js process that watches Redis, gets the numbers stored, does the Fibonacci calculation and puts the result back into Redis.
Worker Process setup
The setup is in the /complex/worker folder: we create a Redis client and subscribe to insert events. Every time there is an insert, the sub.on callback runs the fib function and inserts the value into the values hash with hset.
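A minimal sketch of that worker, assuming the callback-style redis npm client and a hypothetical keys.js module holding the connection values:
const keys = require("./keys"); // hypothetical module exporting redisHost and redisPort
const redis = require("redis");

const redisClient = redis.createClient({
  host: keys.redisHost,
  port: keys.redisPort,
  retry_strategy: () => 1000, // if the connection drops, try again every second
});
const sub = redisClient.duplicate(); // a separate connection used only for subscribing

// deliberately slow recursive implementation, so the worker has real work to do
function fib(index) {
  if (index < 2) return 1;
  return fib(index - 1) + fib(index - 2);
}

// every time a new index is inserted, calculate its Fibonacci value and store it in the "values" hash
sub.on("message", (channel, message) => {
  redisClient.hset("values", message, fib(parseInt(message)));
});
sub.subscribe("insert");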
Express API Setup
Setup is in /complex/server
Connecting to Postgres
Setup is in /complex/server/index.js
More Express API Setup
Make sure Express has a connection to the Redis instance we created before.
Fetching Data in the react app
Calling the routes we defined, to get all indexes and the fibonacci values
Everything is in /complex/client/Fib.js
Rendering Logic in App
Everything is in /complex/client/Fib.js
Routing the react app
Linking the other page to the Fibonacci page.
We have to make sure Nginx handles the HTML5 routing correctly. We'll talk about it later.
Dockerizing Multiple Services
Dockerizing a React App - Again
Let's start adding Docker containers to each service (server, worker and client) so we can run them in a development environment (creating a dev Dockerfile for each one).
We are focusing on the development versions first, to make sure our workflow is smooth. This means that every time a piece of code changes in client, server or worker, we don't want to rebuild the image to see the changes.
We will do, in each project:
- Copy over package.json
- Run npm install
- Copy over everything else
- Create a docker-compose file to set up volumes for each of these projects, so we share the source code into each container. This makes sure we don't rebuild the image entirely on every little change.
Let’s start with the React project, inside the /client directory.
- Create a Dockerfile.dev
- Fill the content:
FROM node:alpine
WORKDIR '/app'
COPY ./package.json ./
RUN npm install
COPY . .
CMD ["npm", "run", "start"]
We are creating an image, just the way we created the frontend project before.
- In the terminal, build the image created:
docker build -f Dockerfile.dev .
- After the successful build, grab the generated image ID and:
docker run {imageID}
See the project running =)
Dockerizing a Generic Node App
Now let’s dockerize the server directory.
Define a Dockerfile.dev into /server directory and:
FROM node:alpine
WORKDIR "/app"
COPY ./package.json ./
RUN npm install
COPY . .
CMD ["npm", "run", "dev"]
Define the same Dockerfile.dev in the /worker directory as well, with the same content!
Then build the image (in the /server or /worker dir) with docker build -f Dockerfile.dev . to see if everything is working. It will give a connection error because we don't have Postgres yet.
Adding Postgres as a Service
Now that we have a Dockerfile.dev for each project, let's put together a docker-compose file. The purpose of docker-compose is to make it easier to start the containers with the right arguments and configuration (exposing ports, connecting the server and worker to services like Redis and Postgres, defining environment variables for the projects, etc…)
First, let’s put together the Express server, the Postgres and the Redis.
What we need to think about before adding things to docker-compose:
- What images to use?
- What Dockerfile to use (depending on environment)
- Specify the volumes so the container picks up code changes
- Specify env variables
Inside the root project dir (/complex), create a docker-compose.yml file:
version: "3"
services:
  postgres:
    image: "postgres:latest"
    environment:
      - POSTGRES_PASSWORD=postgres_password
We specified the first service, called postgres (a name we chose). Inside that service we specified the image to use from Docker Hub, which is postgres:latest. Always go to Docker Hub and check the service you want to use; there is nice documentation there as well.
We explain environment later.
If we run docker-compose up
, we would have a copy of postgres available for us.
Docker-compose Config
Now let’s add the Redis service and then move to the server and specify config for that in the compose file:
version: "3"
services:
  postgres:
    image: "postgres:latest"
    environment:
      - POSTGRES_PASSWORD=postgres_password
  redis:
    image: "redis:latest"
  api:
    build:
      dockerfile: Dockerfile.dev
      context: ./server
    volumes:
      - /app/node_modules
      - ./server:/app
We specified the api service with some config we already know. build uses a Dockerfile.dev to build the service, but which one? That's why we specified context, so docker-compose knows which Dockerfile.dev to use (in this case, the one inside ./server).
We also specified the volumes, which map everything in the ./server directory to the /app directory inside the container (we defined /app in the Dockerfile.dev as the place that receives all the project files). The only exception is /app/node_modules, which stays the way it is inside the container.
That's why every time we change some code in ./server, it maps to /app inside the container.
Environment variables with docker compose
The server uses a lot of environment variables to connect to postgres and redis. There are two ways of defining env variables in docker-compose.
Both of them set the variable in the container at run time. This means the variables are not builded with the container image, but only defined when we run the container.
variableName=value: defined like this, the container uses the value written here.
variableName (no value): defined like this, the container takes the value from the machine running docker-compose.
We will use variableName=value most of the time.
api:
  build:
    dockerfile: Dockerfile.dev
    context: ./server
  volumes:
    - /app/node_modules
    - ./server:/app
  environment:
    - REDIS_HOST=redis
    - REDIS_PORT=6379
    - PGUSER=postgres
    - PGHOST=postgres
    - PGDATABASE=postgres
    - PGPASSWORD=postgres_password
    - PGPORT=5432
We define environment and a key=value pair for each variable. The value for REDIS_HOST is just redis, because that is the name of the service we used to start Redis. The port is 6379 because it's the default Redis port (it's in the docs). All the PG variables are also in the Docker Hub documentation; these are the default values.
If we do docker-compose up
we should see everything running nicely.
The Worker and Client Services
Now we need to add services for both the worker and client projects. In docker-compose:
client:
build:
dockerfile: Dockerfile.dev
context: ./client
volumes:
- /app/node_modules
- ./client:/app
worker:
build:
dockerfile: Dockerfile.dev
context: ./worker
volumes:
- /app/node_modules
- ./worker:/app
Now we are almost done, but note that we didn't define any port mapping in these services. Let's talk about Nginx first.
Nginx Path Routing
In the previous project, we used Nginx to serve our React project in production mode. Now we will also use Nginx in development mode, to route requests.
The browser makes requests (to get HTML and JavaScript, or API requests), and Nginx routes them either to the React dev server or to the Express server.
We will add a new container for Nginx by adding a new service to the docker-compose file. It will intercept requests from the browser and, if the path starts with /api, send them to the Express server. If it only begins with '/', send them to the React server.
Why not use ports to distinguish the React server and the Express server? Because ports are not reliable, and thinking about the production environment, it's much better to rely on the request path.
Note that our route handlers in Express don't have the /api prefix. That's because when the request goes through Nginx, Nginx cuts off the /api part and forwards the request to Express. That's optional; we could easily modify all routes in Express to begin with /api, but in this case we are not doing it.
Routing with Nginx
We will define a default.conf
file, which add config rules to Nginx. It will then be added to our Nginx image.
In this default.conf file, we will:
- Tell Nginx there are upstream servers at client:3000 and api:5000.
- An upstream server is a server that sits "behind" Nginx.
- client:3000 is the create-react-app dev server; client is the service name we defined in docker-compose.
- api:5000 is the Express server; api is the service name in the compose file.
- 3000 is the default port for create-react-app.
- 5000 is the port we defined in Express.
- Tell Nginx to listen on port 80 (inside the container).
- Send the requests to the right place.
Now let’s create a folder nginx in the root folder, inside the folder create a default.conf
file:
upstream client {
  server client:3000;
}

upstream api {
  server api:5000;
}

server {
  listen 80;

  location / {
    proxy_pass http://client;
  }

  location /api {
    rewrite /api/(.*) /$1 break;
    proxy_pass http://api;
  }
}
Above we defined a client upstream, whose server is at client:3000 (client is the name of the service that will be running).
Then we defined the api upstream, which references the api service on port 5000.
Then we defined the server block, which is the Nginx server itself, and passed some config: first, Nginx should listen on port 80; then we intercept requests for "/" and proxy them to http://client, our client upstream. In the same way, we listen for "/api" and proxy to http://api, but first we rewrite the path.
/api/(.*) is a regex that matches everything beginning with /api. Then, with /$1, we take whatever was matched by the regex group (.*) and write it as the new path.
So basically we are erasing the /api part of the path; for example, /api/values/all becomes /values/all. The break statement guarantees that Nginx applies no further rewrite rules and goes straight on to the proxy_pass.
Building a custom Nginx Image
Now let's create a Dockerfile to build our custom Nginx image and copy our Nginx config into it.
Go to Docker Hub and look at the nginx image; there is a lot of useful info there.
All configuration of Nginx is done through config files inside its filesystem. We just have to copy our config file into the Nginx image.
Create a Dockerfile.dev inside the /nginx folder and:
FROM nginx
COPY ./default.conf /etc/nginx/conf.d/default.conf
The base image is nginx, and we are copying our default.conf over the Nginx default.conf; in other words, we overwrite the default file with our own.
Now let’s add the nginx service to docker-compose:
nginx:
  restart: always
  build:
    dockerfile: Dockerfile.dev
    context: ./nginx
  depends_on:
    - api
    - client
  ports:
    - "3050:80"
We use restart: always because we need Nginx running all the time, since it's responsible for routing our application's requests to the right servers. It's the only service with a port mapping, because all traffic to the other services goes through the nginx service.
Troubleshooting Startup bugs
If any bug happens, read the error message.
But you can always run docker-compose down
and docker-compose up
again.
And if you make any changes to docker files, run docker-compose up --build
to rebuild the images.
Opening websocket connections
We see a websocket error in the console if we open localhost:3050. This happens because the React app in the browser wants to open a websocket connection back to the React dev server, so that every time the React code changes the browser knows and updates. We didn't set this up in Nginx, so let's do it now.
In the default.conf file, add one more route proxy:
server {
  listen 80;

  location / {
    proxy_pass http://client;
  }

  location /sockjs-node {
    proxy_pass http://client;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
  }

  location /api {
    rewrite /api/(.*) /$1 break;
    proxy_pass http://api;
  }
}
Now rebuild and restart the containers with docker-compose up --build
A Continuous Integration Workflow for Multiple Images
Let's review how we deployed our single-container app previously:
- Pushed code to GitHub
- Set up Travis CI to pull our repo
- Travis built the image (docker build) and tested the code (npm run test)
- Travis pushed the code to ElasticBeanstalk (using the .travis.yml file)
- ElasticBeanstalk took the code, built the image again and deployed it to the web server.
This flow is not the best for multi-container apps: ElasticBeanstalk has to build the image, pull all the dependencies and so on, so the server has to do much more than just serve the code.
For the multi container setup, we will:
- Push the code to Github
- Travis pulls the repo
- Travis builds a test image (the react image), just to test the code
- After the tests pass, Travis builds a production image for each one of the services (with a production build script), using each service's Dockerfile
- Travis then pushes the built production images (the images are just single files) to the DockerHub service
- we can host our personal or organization projects on DockerHub as well
- AWS and Google Cloud are natively integrated with DockerHub to pull images from there, so we can easily make ElasticBeanstalk get our built images from there
- Travis sends a message to ElasticBeanstalk, telling it there is a new image to pull from DockerHub
- ElasticBeanstalk pulls the images from DockerHub (the images are already built) and uses them as the base for a new deployment
We will no longer depend on ElasticBeanstalk to build our images; everything is built by Travis and pushed to DockerHub, and then we can deploy anywhere we want. DockerHub is like the central repository of the Docker world.
Production Dockerfiles
Let’s create production versions of the Dockerfile for each one of our services. Right now we only have development versions of the Dockerfiles.
For the server, worker and nginx services, we basically just modify the `CMD`, using a more production-appropriate command, like `npm run start`.
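A minimal sketch of what the production Dockerfile for /server could look like (assuming the dev version already uses this node:alpine pattern; only the CMD really changes):
FROM node:alpine
WORKDIR '/app'
COPY ./package.json ./
RUN npm install
COPY . .
CMD ["npm", "run", "start"]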
Multiple Nginx Instances
Now let’s create a production Dockerfile for our /client service, our React project.
In the single-container Dockerfile we did previously for the React application, our Nginx server was at ElasticBeanstalk, on PORT 80. Every time a user entered the website, a request was made to PORT 80 and the React production files were served by Nginx.
Now, the architecture is different.
The browser makes a request to the Nginx server responsible for routing (on PORT 80), which sends the request to the right server (React server or Express server). The React server will be on PORT 3000; it’s an Nginx instance serving the production React files.
So we will have two copies of Nginx, one for routing on PORT 80 and one for serving the production React files on PORT 3000. We could have two different servers here, not just Nginx: it could be Nginx for routing and some other server to serve the React files.
Altering Nginx Listen Port
In the /client folder, create a /nginx folder and inside a default.conf file:
server {
listen 3000;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
try_files $uri $uri/ /index.html;
}
}
Every time a user goes to the "/" path, serve the files inside /usr/share/nginx/html (we will put the React production files there). The index will be index.html.
The last line, `try_files`, is there so Nginx works along with react-router.
Also inside the /client, create a Dockerfile and:
FROM node:alpine as builder
WORKDIR '/app'
COPY ./package.json ./
RUN npm install
COPY . .
RUN npm run build
FROM nginx
EXPOSE 3000
COPY ./nginx/default.conf /etc/nginx/conf.d/default.conf
COPY --from=builder /app/build /usr/share/nginx/html
Github and Travis CI Setup
Inside /complex, initialize a git repo, create a github repo, set the remote and push to github. Then activate the github repo on Travis CI (on Travis CI website)
Travis Configuration Setup
Let’s create our .travis.yml file, to handle our build when we push code to Github.
We will:
- Specify docker as a dependency
- Build a test version of React project (using the development dockerfile)
- Run tests
- Build production versions of all projects
- Push the images to Dockerhub
- Tell ElasticBeanstalk to update
We are only going to run test on the React project. We could very easily build test versions for other services like /server or /worker, and run tests along with the React project.
Create a .travis.yml inside the /complex:
sudo: required
services:
- docker
before_install:
- docker build -t ricardohan93/react-test -f ./client/Dockerfile.dev ./client
script:
- docker run -e CI=true ricardohan93/react-test npm test
after_success:
- docker build -t ricardohan93/multi-client ./client
- docker build -t ricardohan93/multi-nginx ./nginx
- docker build -t ricardohan93/multi-server ./server
- docker build -t ricardohan93/multi-worker ./worker
In before_install we create a test version of the React project (./client) by building Dockerfile.dev. Then we run the tests inside the React project (we tagged the image as ricardohan93/react-test). After the tests pass, we build the production versions of the images.
Now we are ready to push these images to DockerHub
Pushing Images to Docker Hub
Before pushing to Docker Hub, we need to login to Docker CLI, using our username and password.
docker login
This will associate our docker client with our dockerhub account.
So before pushing an image to DockerHub, in the Travis script, we need to log in to the Docker CLI.
We put the Docker ID and password as Travis environment variables, in the Travis dashboard for the repo. Then, in the .travis.yml:
after_success:
- docker build -t ricardohan/multi-client ./client
- docker build -t ricardohan/multi-nginx ./nginx
- docker build -t ricardohan/multi-server ./server
- docker build -t ricardohan/multi-worker ./worker
- echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_ID" --password-stdin
- docker push ricardohan/multi-client
- docker push ricardohan/multi-nginx
- docker push ricardohan/multi-server
- docker push ricardohan/multi-worker
We log in to the Docker CLI and then push all the built images to DockerHub. Explaining the login part:
- by echoing "$DOCKER_PASSWORD", we pass it to the other side of the pipe.
- so we run docker login with "$DOCKER_ID" as the user and read the password from standard input (stdin)
- "$DOCKER_ID" and "$DOCKER_PASSWORD" are set as environment variables in Travis.
That’s it. If we add, commit and push this .travis.yml file to github, Travis will wake up and start building our project with all the steps we defined.
Multi-Container Deployments to AWS
Multi-Container definition files
Now we have the pre-deploy setup ready. Every time we push code to github, Travis CI builds new images and pushes them to Docker Hub. Now we need to use these images and deploy them in production.
We will use Amazon ElasticBeanstalk. Before, with a single container, Amazon EB built the image automatically. Now we have many services and many Dockerfiles, so we have to tell Amazon EB how to treat the project.
We will create a config file, `Dockerrun.aws.json`, to tell EB where to pull the images from, what resources to allocate to each, set up PORT mappings, etc. It reminds a lot of docker-compose.yml.
Basically docker-compose is for development and Dockerrun.aws.json is for production on AWS. In docker-compose we have separate services; in Dockerrun they are called "Container Definitions".
The biggest difference is that docker-compose has directions to build the images. Dockerrun is used to pull the images from DockerHub (they are already built) so that Amazon EB can use them.
Finding Docs on Container Definitions
So, in reality ElasticBeanstalk doesn’t know how to handle containers. It delegates the job to another service called Amazon Elastic Container Service (ECS). Inside Amazon ECS you can create task definitions, which are instructions on how to run a single container. These task definitions are very, very similar to the Dockerrun "Container Definitions". So in order to work with Container Definitions, we can look at the documentation for ECS task definitions.
Adding container definitions to Dockerrun
In the root folder /complex, create a `Dockerrun.aws.json` file:
{
"AWSEBDockerrunVersion": 2,
"containerDefinitions": [
{
"name": "client",
"image": "ricardohan/multi-client",
"hostname": "client",
"essential": false,
}
]
}
Inside `containerDefinitions`, each entry (or object) is a different container that will be deployed to AWS.
It has a `name` (it can be any name, but we keep the pattern and name it client as well). It also has a target `image`; we can specify just our DockerHub username and the image name because deployment services like AWS are integrated with DockerHub. That’s why we can reference ricardohan/multi-client directly.
It also has a `hostname`. Remember the "services" names in docker-compose, where we specified each container we had? All those names under "services" acted like hostnames, so every container managed by docker-compose could reference the other containers just by their hostname. This `hostname` here is the same thing, but in the AWS context.
`essential` is false because it indicates this "client" container is not essential. If a container marked essential: true has a problem, all the other containers are shut down together with it. At least one container here has to be essential: true.
More Container Definitions
{
"AWSEBDockerrunVersion": 2,
"containerDefinitions": [
{
"name": "client",
"image": "ricardohan/multi-client",
"hostname": "client",
"essential": false
},
{
"name": "server",
"image": "ricardohan/multi-server",
"hostname": "api",
"essential": false
},
{
"name": "worker",
"image": "ricardohan/multi-worker",
"hostname": "worker",
"essential": false
}
]
}
Forming Container Links
Now let’s create the container definition for the Nginx container:
{
"name": "nginx",
"image": "ricardohan/multi-nginx",
"essential": true,
"portMappings": [
{
"hostPort": 80,
"containerPort": 80
}
],
"links": ["client", "server"]
}
Nginx doesn’t need a “hostname” because no other container is referencing Nginx by name.
We have a `portMappings` entry here, which just maps the host port to the container port. So when accessing PORT 80 on the host (the server), we reach PORT 80 inside the container, which is the default port for nginx.
We also defined a `links` property; it makes an explicit connection between the nginx container and the containers it depends on, in this case client and server. It’s like nginx has to know about client and server (these names map to the `name` defined in each container definition).
Creating the EB Environment
Go to the AWS console, search for Elastic Beanstalk.
Create a new project, then create a new environment, a web server environment; the platform should be Multi-container Docker, and choose the sample application option in the form. Then create your environment.
Note that in all discussion about the containers we never talked about Postgres and Redis! Let’s talk about running databases in containers.
Managed Data Service Providers
In our Dockerrun.aws.json file, we have nothing about Postgres or Redis, as opposed to our docker-compose file, in which we declared the Postgres and Redis instances. Let’s talk about that.
Remember that docker-compose is used for the development environment and Dockerrun.aws.json is for production.
In the development environment, we have our Redis and Postgres running as containers, along with the server, worker, client…
In the production environment, we will change this architecture:
Inside the ElasticBeanstalk instance, we will run the Nginx router, the Express server, the Nginx with the client files and the worker. We already set up these services to run with Amazon EB, inside the Dockerrun.aws.json.
Redis and Postgres will live in external services: Postgres in RDS (Amazon Relational Database Service) and Redis in AWS ElastiCache. These are general services that can be used with any application.
But why will we use them to host Redis and Postgres instead of defining our own containers for them?
AWS ElastiCache:
- Automatically creates and maintains an instance of Redis for us, with all the production-ready features.
- Super easy to scale
- Built-in logging and maintenance
- Security
- Easier to migrate off ElasticBeanstalk if we ever want to, because ElastiCache is completely separate from it.
AWS RDS:
- Same points as above.
- Automated backups and rollbacks
- Good database backups are hard to do on your own
In this project we will use these external AWS products, and it’s recommended: although they cost money, it’s a small amount and it saves a lot of our time.
In the next project we will learn how to set up Postgres and Redis from scratch, though.
Overview of AWS VPC’s and Security Groups
We will have our ElasticBeanstalk with our containers inside. Then we have to connect some of these containers with RDS (Postgres) and with EC (Redis). By default these services cannot talk to each other, but we can create a connection between them using a set of configurations in AWS. This is completely unrelated to Docker.
AWS works with regions, you can see your region in the console. Each region has a default VPC (Virtual Private Cloud) set just for your account. VPC is like a private network for your services in that specific region. So our ElasticBeanstalk, RDS and EC will be inside a VPC in us-east-2 (Ohio) for example.
VPC is also used to implement a lot of security rules and connect services.
To make the services inside a VPC connect to each other, we have to create a Security Group. A Security Group is basically a firewall rule: it allows or disallows different services to connect and talk to the services in your VPC.
For example, in our ElasticBeanstalk, we allow any IP in the world to connect, as incoming traffic, to PORT 80 of our EBS service.
To see the security groups, in the VPC console, click the Security Groups section; you can see your services and the inbound and outbound rules for each.
To make our services talk to each other, we create a security group and allow other AWS services in the same security group to talk to our service. That’s how we make them communicate.
RDS Database Creation
Make sure you are in the same region as your EBS instance. Search for RDS in the console.
Click "Create Database" and select your database type (use the free tier to avoid paying).
In the settings, choose an identifier name and your credentials. Remember the master username and password, because you will need them to connect your EBS to Postgres. Fill in some other additional configs and create your database.
ElastiCache Redis Creation
Search for ElastiCache in the console. Find Redis in the left column and create a new cluster. Just make sure that in "Node type" you have selected the t2.micro instance, otherwise you’ll pay a lot of money. Also change the number of replicas to 0; we don’t need any replicas.
Make sure your VPC ID is the default VPC, so we can communicate to EBS and other services in our VPC. Then create your Redis instance.
Creating a custom security group
Now that we’ve created Redis and Postgres, let’s create the security group so these services can communicate with each other and with EBS.
Search VPC, Security Groups, create new security group. Choose the name and a description, and select the default VPC.
Add an inbound rule (a CLI sketch of the same thing follows this list):
- Choose the type as Custom TCP rule
- Protocol is TCP
- Port Range: we could open all Ports, but since we only need to open the Postgres PORT and Redis PORT, let’s do the range of 5432-6379
- Source: put your security group id here, so the services can connect to each other within the same security group.
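For reference, roughly the same thing could be done with the AWS CLI (a sketch; the group name, description and the vpc/sg IDs below are placeholders, not values from this project):
aws ec2 create-security-group --group-name multi-docker --description "Traffic between multi-docker services" --vpc-id vpc-xxxxxxxx
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 5432-6379 --source-group sg-xxxxxxxx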
Applying Security Groups to Resources
Now that we’ve created the security group rule, we need to apply the group to the services (RDS, EBS, EC), so they actually belong to it.
First EC (Redis). Search for it in the console, go to your created instance, click the Actions button and then Modify. In the VPC security group, edit it and choose your created security group (you can leave the default selected as well).
Now RDS: click on your database, then the Modify button, and select your newly created security group. Then apply the changes and that’s it.
Now EBS: select your instance and click the Configuration section on the left side. Click "Instances" and select your new security group. Apply and confirm the changes.
Setting environment variables
Now we need to guarantee our containers inside EBS know how to reach out to RDS (Postgres) and EC (Redis). For this, we need to set env variables, just like we did inside docker-compose.
In docker-compose the “api” and the “worker” have environment variables defined. So we have to define them in AWS as well.
In the EBS page, go to configurations and modify software. You can see a key-value input to place your env variables.
For REDIS_HOST, we cannot simply put "redis" as we did in docker-compose. We have to put the ElastiCache endpoint: go to ElastiCache -> Redis -> primary endpoint and copy the URL (not the PORT). Do the same for env variables like PGHOST: go to RDS, find the host endpoint and paste it into the env variable section.
All the env variables defined in the EBS are applicable to the containers inside EBS.
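As a sketch, the values end up being roughly the same variables the api and worker read in docker-compose (the exact names depend on your compose file, so treat these as assumptions):
REDIS_HOST=<ElastiCache primary endpoint>
REDIS_PORT=6379
PGHOST=<RDS endpoint>
PGPORT=5432
PGUSER=<RDS master username>
PGPASSWORD=<RDS master password>
PGDATABASE=<RDS database name>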
IAM Keys for Deployment
In the travis.yml file, we are building the production images and pushing them to DockerHub. Now we have to send a message to EBS and say: “It’s time to pull these images and deploy them!”
We will send the entire project to EBS, but EBS only needs to look at the Dockerrun.aws.json file. It will read this file and pull the images from DockerHub.
Before writing things down to travis.yml file, let’s generate a new set of keys in AWS IAM console.
Search for IAM -> Users (to create a new user with deploy credentials to EBS)
Add a user, set a name and give it programmatic access. Then set the permissions, attaching existing policies directly: search for beanstalk and add all the policies.
This will generate an Access Key ID and a Secret Access Key.
Open your Travis dashboard, go to your multi-docker project, options, settings and add the aws_secret_key and aws_access_key env variables.
Now we just have to write the deploy script in the travis.yml file.
Travis Deploy Script
deploy:
edge: true
provider: elasticbeanstalk
region: us-east-2
app: multi-docker
env: MultiDocker-env
bucket_name: elasticbeanstalk-us-east-2-709603168636
bucket_path: docker-multi
on:
branch: master
access_key_id: $AWS_ACCESS_KEY
secret_access_key: $AWS_SECRET_KEY
“edge” is just to make sure Travis uses v2 of the dpl (deploy) script, which doesn’t have a certain bug. region, app and env you can find in the EBS console. The bucket_name and bucket_path refer to the S3 bucket Travis uses to zip all the project files: go to S3, make sure there is a bucket you can use, and copy its name. The bucket_path is a name you can choose.
We also tell Travis to deploy to EBS only when the master branch is built. And finally we put the AWS keys we defined in the Travis console.
Container Memory Allocations
In our Dockerrun.aws.json file, we need to specify the amount of memory we want to allocate to each service. It’s recommended to search for posts about best practices on memory allocation for containers, but in this project we will set 128mb for each one:
"memory": 128
Verify Deployment
You can see the logs of the deployment in the EBS console. If there is an error, go through the logs and search for it.
Cleaning up AWS Resources
We have to shut down the EBS (ElasticBeanstalk), RDS and EC (ElastiCache).
We don’t have to delete the security group we created because there is no billing. But we will clean up anyway.
Go to EBS, find your application, Actions, Delete application. It takes a while to terminate. In RDS, go to your databases, click the checkbox, Instance actions and delete it (don’t create snapshots). In EC, go to the Redis tab, select your checkbox and hit the delete button. To clean up the security group, go to VPC, Security Groups, select the multi-docker checkboxes and delete them via the Actions button. We can also delete our IAM user, the multi-docker-deployer.
Onwards to Kubernetes
Imagine we have to scale up the previous project. We had an EBS instance with Nginx, server, client and worker containers. To scale the app, we need the worker to scale up, because it’s the worker that does the heavy job of calculating fibonacci values. But EBS only scales the whole instance; we cannot control the scaling of a single container, so we cannot scale only the worker container up to 10 worker containers. That’s where Kubernetes comes in.
With Kubernetes we have a Kubernetes cluster, which is formed by a Master + Nodes. Each Node is a virtual machine or a physical machine containing one or more containers, which can be of different types. One Node can have 10 worker containers, or 1 worker and 2 client containers, or only 1 server container, whatever. The Master is responsible for managing and controlling what each Node does.
External services access the Kubernetes cluster through a Load Balancer, which manages the incoming requests and sends them to the Nodes of the cluster.
So Kubernetes works for that case, when we need to scale different amounts of containers and have to control how and which containers to scale up.
Kubernetes in Development and Production
It’s very different using kubernetes in development mode and production.
In development, we use a CLI tool called `minikube`, whose sole purpose is to run a Kubernetes cluster on our local computer.
In production, we use managed solutions from third-party cloud providers like AWS (EKS) and Google Cloud (GKE). We could also set up an environment on our own, but the managed solutions are more secure and generally better.
How does Minikube work?
It sets up a virtual machine (a single Node) that runs multiple containers for you, on your local machine.
To interact with this single Node, we use a tool called `kubectl` to manage and access the containers.
Minikube is used only in DEVELOPMENT. kubectl is used in production too.
We have to install a virtual machine driver to run the VM on our local machine. It will be VirtualBox.
Installing Kubectl, Minikube and VirtualBox
Just followed installation guides
Mapping existing Knowledge
After all the installing, run `minikube start`, and then `minikube status`; you can see the server running in our VM.
With `kubectl cluster-info` you can see Kubernetes running.
Our first task is to run the multi-client image (the React app from the multi project) in our local Kubernetes cluster, as a container.
To get our bearings, remember how things work in the docker-compose world:
- Each service can, optionally, tell docker-compose how to build an image
- Each service represents a container
- Every service defines some networking requirements, like PORT mappings.
In the Kubernetes world, these points become:
- Kubernetes expects all images to be already built. We need to do this outside of Kubernetes.
- We have multiple configuration files, each one creating a different object (which is not necessarily a container)
- We have to manually set up all networking. There is no single file to manage PORT mappings, sadly =|
To make sure we can do these points in local Kubernetes:
- Make sure our image (multi-client) is built and hosted on docker hub
- Create one config file to create our container
- Make one config file to set up networking between our container and the outside world
Adding Configuration Files
Let’s create a new project directory and create a new config file. The project is simplek8s
Create `client-pod.yaml`. The goal of this file is to create our container.
The other file is `client-node-port.yaml`, to handle our networking.
Object types and API Versions
Now explaining what those files do, in detail:
First, let’s see what the property `kind` means in these .yaml files.
With Docker Compose, Elastic Beanstalk and Kubernetes, we always have a config file describing the containers we want. But with Kubernetes, we are not creating containers in the .yaml config file: we are creating an Object.
We take the Kubernetes config files we created and feed them to kubectl. kubectl interprets them and creates Objects from these files. We are creating specific types of Objects: it could be "StatefulSet", "ReplicaController", "Pod", "Service"…
So the `kind` in the config files means the type of Object we want to create in Kubernetes.
What is an Object? Objects are things that will be created inside our Kubernetes cluster. Some Objects are used to run a container, like the `Pod` type. Others will monitor a container; a `Service` sets up networking for our containers…
Now about `apiVersion`: it just scopes the Object types we have available in the config file.
We specified apiVersion: v1, which means we have access to a certain set of Objects.
If we specified apiVersion: apps/v1, we would have access to other types of Objects.
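A few concrete pairings, as a rough (not exhaustive) guide:
# apiVersion: v1       -> Pod, Service, ConfigMap, ...
# apiVersion: apps/v1  -> Deployment, StatefulSet, ReplicaSet, ...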
Running containers in Pods
What is a Pod?
When we ran `minikube start`, it created a VM: that is a Node. This Node will be used by Kubernetes to run some number of different Objects. The most basic Object is a Pod. So when we specify `kind: Pod`, we are creating a Pod inside our virtual machine.
A Pod is basically a group of containers with a very common purpose. In Kubernetes we cannot create a container standing alone in a Node; we need to create a Pod. It is the smallest thing we can deploy to run a container in Kubernetes.
But why do we need a Pod at all? A Pod runs one or more containers inside it, but the goal of a Pod is to group containers with a very similar purpose, a tight relationship. So it’s wrong to just throw arbitrary containers into a single Pod.
A good example of a Pod with multiple containers is a Pod with a Postgres container plus logger and backup containers. The logger and backup are tightly integrated with Postgres and could not function without it.
So looking at our file, it makes more sense:
apiVersion: v1
kind: Pod
metadata:
name: client-pod
labels:
component: web
spec:
  containers:
    - name: client
      image: ricardohan/multi-client
      ports:
        - containerPort: 3000
We are creating a Pod with one container running in it. The container is named "client"; we can use this name for debugging or to create connections with other containers in the Pod. We are also exposing PORT 3000 to the outside world. But this is not the full story for PORT mapping: this line alone will not do anything, it needs the second config file, for networking.
About the `metadata` field: "name" is the name of the Pod, and "labels" is coupled with the other config file, the kind: Service one.
Services Config files in depth
Another type of Object is "Service", which sets up networking in a Kubernetes cluster.
There are subtypes for the Service type:
- ClusterIP
- NodePort
- LoadBalancer
- Ingress
Subtypes are specified inside `spec.type`.
Our example is a NodePort service. It exposes a container to the outside world, so we can access the running container in our browser. In production we don’t use it, with a few exceptions. We will talk about the other subtypes later.
So explaining how NodePort works (remember, only development):
Imagine we have a browser request. The request enters the Kubernetes Node through kube-proxy, which comes with every Node. This proxy’s job is to redirect the request to the right Service, and then the Service can forward the request to the right Pod.
But how our files communicate to each other? How our Service file knows it has to handle request to our Pod? It uses the label selector system.
In the Pod file, we have metadata -> labels -> component: web. And in the Service file: spec -> selector -> component: web.
That’s how the relationship between these Objects works: the Service looks for Pods that have component: web.
The key-value pairs are totally arbitrary, but they have to match. It could be `tier: frontend`, for example.
Now let’s talk about the collection of PORTS we have in the Service.
ports:
- port: 3050
targetPort: 3000
nodePort: 31515
The `port` is the PORT another Pod or another container would use to communicate with our multi-client Pod.
The `targetPort` is the PORT inside the Pod; it means we want to send the traffic to this PORT.
The `nodePort` is the PORT that we, developers, will use in the browser to access the Pod (we can pick a number in the range 30000-32767).
One of the reasons we don’t use NodePort in production is these weird PORT numbers.
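Putting it together, the whole client-node-port.yaml looks roughly like this (a reconstruction from the pieces above):
apiVersion: v1
kind: Service
metadata:
  name: client-node-port
spec:
  type: NodePort
  ports:
    - port: 3050
      targetPort: 3000
      nodePort: 31515
  selector:
    component: web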
Now let’s load these Object files into kubectl tool and try to access the running container
Connecting to Running Containers
kubectl apply -f <filename path>
kubectl is what we use to manage our Kubernetes cluster; apply means we want to change the configuration of our cluster; -f means we want to specify a file.
So we will run this command twice, one for Service and one for Pod.
kubectl apply -f client-pod.yaml
kubectl apply -f client-node-port.yaml
Let’s check if they were created successfully: kubectl get <objectType>
kubectl get pods
It prints:
NAME READY STATUS RESTARTS AGE
client-pod 1/1 Running 0 109s
1/1 in the READY column means 1 container is ready out of the 1 container in this Pod.
kubectl get services
Prints:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
client-node-port NodePort 10.111.105.31 <none> 3050:31515/TCP 3m58s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 147m
The kubernetes service is there by default.
Well, now our Objects are running in our cluster, but we cannot access them in the browser yet. Our Kubernetes Node is a VM created by Minikube, so it has its own IP address; we don’t have access to it through localhost.
So in the terminal: `minikube ip`
This gives us an IP address. We won’t be using localhost when using Kubernetes with minikube.
And we have to append the PORT we set in `nodePort`. Something like 172.17.0.2:31515.
The entire deployment flow
What happens when we feed the config files to kubectl?
Let’s imagine we have a config file with the goal of running 4 instances of multi-worker image.
We run the command to deploy this file, using `kubectl apply`. Then the file goes to the "Master", which is the general manager of the Kubernetes cluster. Inside it there are 4 programs, but for simplicity, let’s imagine it has only the "kube-apiserver".
The kube-apiserver is responsible for monitoring the Nodes of the cluster and making sure they are operating okay. It will also interpret the config files we feed into it and update its internal list of responsibilities according to our command. Something like:
“I need to run multi-worker, 4 copies, and I’m currently running 0 copies of it”
So it will reach out to the running Nodes of the cluster. Remember, the Nodes are just VMs running Docker. The kube-apiserver will say: hey Node1, run 2 copies of multi-worker; Node2, run 1 copy; Node3, run 1 copy. In total we will have 4 copies, as the config file asked.
The Docker daemon running inside each VM will reach out to DockerHub, pull the image and create containers from it. Node1 will create 2 containers.
The "Master" will then check that everything is good, with 4 copies running. The "Master" is continuously watching the Nodes to check that everything’s good. If a container dies, the Master sends a new message to the Node to run another copy of multi-worker.
All commands pass through the "Master", and the Master reaches out to the Nodes.
The Master keeping a list of responsibilities to fulfill and constantly checking the Nodes to see whether the list is satisfied is one of the main concepts of Kubernetes.
Imperative and Declarative deployments
Important Notes:
- Kubernetes is a system to deploy containerized apps, to run containers
- A Node is a physical machine or VM that runs some number of Objects (or containers)
- The Master is a machine (or VM) that manages everything related to the Nodes
- Kubernetes doesn’t build the images; it just gets them from DockerHub
- The Master decides where to run each container. We can control it with the configuration file, but by default the Master decides alone
- To deploy something, we update the Master’s list with our config files (like saying: hey Master, I need you to create 4 copies of multi-worker)
- The Master constantly checks whether the list is fulfilled
There are two ways of approaching deployments:
- Imperative deployments
- Declarative deployments
Imperative gives orders like "Create this container!". Declarative just says to the Master "It would be great if you could run 4 copies of this container", just directions.
With the imperative approach you have to specify every step of the flow. With the declarative approach you just say what you want, and Kubernetes does the rest.
When we research the web, we will see lots of people telling us to take the imperative approach, issuing commands and so on. But Kubernetes is nice enough to have its own declarative way, so be skeptical when you read a blog post telling you to update a Pod imperatively: we can just update a config file and send it to the Kubernetes Master, and it handles the rest automatically.
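As a rough illustration of the two styles (the image and file names come from this project; treat the commands as a sketch, not the course’s exact steps):
# Imperative: tell the cluster exactly what to create, step by step
kubectl run client-pod --image=ricardohan/multi-client --port=3000
# Declarative: describe the desired state in a file and let Kubernetes reconcile it
kubectl apply -f client-pod.yaml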