Advanced Docker

In the last post, we learned how to assemble a multi-container stack to serve a single web application (WordPress). This post will complete our Docker-specific exploration, and future posts will build upon the skills we’ve learned here.

We will:

  • Organize Docker application stacks on the file system
  • Isolate application stacks with dedicated networks
  • Discuss best practices for image versioning
  • Discuss running a stack in the console vs. detaching
  • Discuss update mechanisms (automatic vs. manual) and their pros/cons
  • Keep our system clean with prune commands

Organizing Stack Files

In the previous post I took a two-container stack straight from the WordPress Docker Hub page. It used the stack.yml filename, which I left as-is for the sake of consistency. I prefer to use the default filename docker-compose.yml for the simple reason that the -f option isn’t needed.

docker-compose, executed in a directory with a docker-compose.yml file, will automatically use that file. Less typing, more results.
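To illustrate: the first command below needs the -f option because of the custom filename, while the second finds docker-compose.yml on its own:

linodevm:/docker/bowtieddevil# docker-compose -f stack.yml up
linodevm:/docker/bowtieddevil# docker-compose up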

I create a special /docker directory on my host, with a sub-directory for each application stack. It’s common to see docker-compose.yml files online that define tens or hundreds of containers in a single file, but I prefer to divide them up a bit.

This blog is hosted on a VPS (more on this later), and here is the structure of my /docker directory.

linodevm:/docker# ls -l
total 36
drwxr-xr-x    2 root     root          4096 Jul  5 06:05 bowtieddevil
drwxr-xr-x    2 root     root          4096 Jun 27 05:31 ipfs
drwxr-xr-x    2 root     root          4096 Apr 18 03:06 traefik

Inside each of the bowtieddevil, ipfs, and traefik directories is a single docker-compose.yml file.

With this approach, I can interact with the “bowtieddevil” application as a whole instead of the individual containers that make it up. Very useful: an all-in-one file would require me to carefully turn individual containers on and off to avoid disrupting other services.
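For example, restarting this blog’s stack (and only this stack) takes two commands, while ipfs and traefik keep running untouched:

linodevm:~# cd /docker/bowtieddevil
linodevm:/docker/bowtieddevil# docker-compose restart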

Stack-specific Naming

When you bring up a docker-compose stack, it will avoid nasty naming conflicts by automatically prefixing the name of each container, network, and volume with a project name. That name defaults to the directory containing the docker-compose.yml file.

For example: a service named app defined in bowtieddevil/docker-compose.yml becomes a container named bowtieddevil_app_1 (the trailing number distinguishes replicas of a scaled service). Networks and volumes get the same prefix.
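A minimal sketch of the naming behavior, assuming the file lives at /docker/bowtieddevil/docker-compose.yml:

version: "2"

services:
  app:                # container: bowtieddevil_app_1
    image: nginx:alpine
    volumes:
      - data:/srv     # volume: bowtieddevil_data

volumes:
  data:

(The default network, covered below, becomes bowtieddevil_default.)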

For this reason, it’s best to place everything in suitable directories and give your containers very short (but descriptive) names. Here is the stack for this blog:

version: "2"

services:
  web:
    image: nginx:alpine
    restart: unless-stopped
    volumes:
      - /blog/bowtieddevil/public:/usr/share/nginx/html/
    labels:
      - traefik.enable=true
      - traefik.http.routers.bowtieddevil-web.entrypoints=websecure
      - traefik.http.routers.bowtieddevil-web.rule=Host(`bowtieddevil.com`) || Host(`www.bowtieddevil.com`)
      - traefik.http.services.bowtieddevil-web.loadbalancer.server.port=80

  web-staging:
    image: nginx:alpine
    restart: unless-stopped
    volumes:
      - /blog/staging/public:/usr/share/nginx/html/
    labels:
      - traefik.enable=true
      - traefik.http.routers.bowtieddevil-web-staging.entrypoints=websecure
      - traefik.http.routers.bowtieddevil-web-staging.rule=Host(`staging.bowtieddevil.com`)
      - traefik.http.services.bowtieddevil-web-staging.loadbalancer.server.port=80

  stats:
    image: matomo:4
    restart: unless-stopped
    volumes:
      - stats:/var/www/html
    env_file:
      - stats.env
    depends_on:
      - stats-db
    labels:
      - traefik.enable=true
      - traefik.http.routers.bowtieddevil-stats.entrypoints=websecure
      - traefik.http.routers.bowtieddevil-stats.rule=Host(`stats.bowtieddevil.com`)
      - traefik.http.services.bowtieddevil-stats.loadbalancer.server.port=80

  stats-db:
    image: mariadb:10
    command: --max-allowed-packet=64MB
    restart: unless-stopped
    volumes:
      - stats-db:/var/lib/mysql
    env_file:
      - stats-db.env

volumes:
  stats:
  stats-db:

networks:
  default:
    name: bowtieddevil

There’s a lot going on, but here are the parts and pieces:

  • A container named web runs an nginx web server. It reads from the /blog/bowtieddevil/public directory that contains the HTML.
  • A similar web-staging container is used for testing themes and publishing drafts.
  • A container named stats runs an instance of Matomo (self-hosted analytics).
  • A container named stats-db runs MariaDB (a MySQL-compatible database) for use by the stats container.

Environment and Labels

You’ll also see a few more options that we’ve not covered. The first is env_file, which is simply a file with environment variables defined one per line. This is functionally identical to writing them out in an environment: block, but separating them keeps the docker-compose.yml file small and allows me to exclude secrets from a repository when I use version control (a topic for a future post).
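For example, the stats-db.env file might look like the following. The variable names come from the official mariadb image; the values are placeholders:

MYSQL_ROOT_PASSWORD=changeme
MYSQL_DATABASE=matomo
MYSQL_USER=matomo
MYSQL_PASSWORD=changeme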

Labels are a nice feature of docker-compose.yml that behaves exactly how you’d expect. You can add any label to any container for any reason, and some containers have been developed that will change behavior based on the labels of other containers. Traefik is one of them. We will cover Traefik in detail later, but for now please know that it acts as a reverse proxy for my server. It manages SSL certificate renewal and directs traffic to and from containers without needing to explicitly expose ports to the Internet. All HTTP/HTTPS traffic comes through Traefik, which means my containers can simply serve their content without managing anything else. Great stuff that we’ll cover soon.

Docker Networks

Docker has the concept of a network, which is simply a subnet with containers assigned to it. If that doesn’t mean anything to you, that’s OK. You largely don’t have to understand networking to get the job done, but it does help to understand the big picture.

A subnet is a range of IP addresses that can communicate with each other. Docker hides each subnet behind the daemon using NAT (Network Address Translation). The end result is that a container can talk to any other container in its network, but that’s it: a container is not reachable from outside its network unless a port is explicitly published (forwarded) for it.
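For reference, publishing (forwarding) a port looks like this in docker-compose, with the host port on the left and the container port on the right:

services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"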

Since we are security-conscious people, we recognize that jamming a bunch of containers together and allowing them to talk freely might lead to trouble. The solution is to use Docker networks to isolate containers as much as possible from one another.

Hence the networks: block at the bottom of my docker-compose.yml file. If a network is not specified, docker-compose creates one named default. That’s OK! The only reason I’ve added the special name: bowtieddevil option is that my Traefik instance needs to join a specific network to pass HTTP/HTTPS along to the appropriate container. I found it more readable to join the network bowtieddevil compared to the network bowtieddevil_default (refer to the Stack-specific Naming section above).
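To make that concrete, here is a minimal sketch of how a container in another stack (Traefik, in this case) joins the bowtieddevil network. The external: true option tells docker-compose that the network already exists, so it joins it rather than creating a new prefixed one. The Traefik image tag here is illustrative:

version: "2"

services:
  traefik:
    image: traefik:2.4
    networks:
      - bowtieddevil

networks:
  bowtieddevil:
    external: true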

Image Versioning

I glossed over images in the last post, but I will explore them more fully here. In general, a well-constructed Docker image will be published with several different versions. In the image: line of a docker-compose file, you will see something of the form repo/project:tag. There is a special set of images on Docker Hub known as “official”, and they are published without the repo portion of the name. nginx is one.

For the tag, you have many options. If the tag is omitted, it will default to latest, which by convention points at the newest published image (nothing enforces this, and it can change underneath you). I do not recommend using latest (either by default or by choice) unless you have to. The wonderful thing about Docker is that you can manage the versions of application containers without affecting any other part of your system, so there’s no need to be aggressive about it. Especially if you’re running high-availability or mission-critical software, stability is your friend!

You can see that I have “pinned” the matomo image to version 4, and the MariaDB image to version 10. Some applications do not handle version upgrades in a clean way, so I prefer to handle major version changes by hand.
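Using nginx as an example, here is the range of pinning options (exact tags change over time, so check the Docker Hub page before copying these):

image: nginx                # no tag, equivalent to nginx:latest
image: nginx:latest         # follows every new release
image: nginx:alpine         # newest Alpine-based build
image: nginx:1.21-alpine    # pinned to a minor version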

Attach vs. Detach

In the previous WordPress post, we ran our application stack entirely in the foreground using docker-compose up. This allows you to watch output from the containers, as well as kill the stack easily using CTRL + c. But what if you want to close your terminal and keep the application running? What if you get disconnected? Your application will die when the shell session dies.

The answer is the -d option, short for --detach. When this is used with the up command, the containers will be detached from the running shell into the background. They will continue running without displaying any more output, and will live forever in the background.

When you are running a stack for the first time, I recommend using docker-compose up to see the output from the containers. Once it works the way you expect, kill it and restart with docker-compose up -d. Easy!

The downside of using -d is that logs are slightly more cumbersome to view. You can use docker-compose logs -f to see output from the whole stack, or docker-compose logs -f [service name] if you care about a particular one.
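The full workflow for a new stack might look like this:

linodevm:/docker/bowtieddevil# docker-compose up
(watch the output, kill with CTRL + c once satisfied)
linodevm:/docker/bowtieddevil# docker-compose up -d
(the stack now runs in the background)
linodevm:/docker/bowtieddevil# docker-compose logs -f web
(follow logs for the web service; omit the name to follow everything)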

Updates

By default, Docker has no automatic update mechanism. That is a good thing! A container should not be subject to any start/stop/restart/update behavior unless you are ready for it.

Updating a container is a two-part process:

  • First, use the docker-compose pull command to check for image updates (based on the image tags from your docker-compose.yml file).
  • Second, use the docker-compose up -d command to recreate any container whose image has changed. Old containers will be stopped and removed before the new ones are created.
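In practice, from the stack’s directory:

linodevm:/docker/bowtieddevil# docker-compose pull
linodevm:/docker/bowtieddevil# docker-compose up -d

Containers whose images haven’t changed are left running as-is.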

There is a solution for automatic updates called Watchtower. It will watch for new images for all containers with a particular label (remember those?). I use this in very isolated cases, but in general do not recommend automatic updates via Watchtower.
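For completeness, a minimal sketch of a label-gated Watchtower setup. The image name, flag, and label below are from the Watchtower project as I recall them, so double-check its documentation before relying on this:

linodevm:~# docker run -d --name watchtower \
    -v /var/run/docker.sock:/var/run/docker.sock \
    containrrr/watchtower --label-enable

With --label-enable, only containers labeled com.centurylinklabs.watchtower.enable=true are updated.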

I will cover scheduled updates via cron in a future lesson.

Cleaning Up

Risks Ahead! Proceed With Caution!

As you continue to update containers, you will accumulate a big directory full of old images. Docker will not remove them on its own, so you need to clean up with a handful of commands. I periodically run docker system prune (which deletes stopped containers, unused networks, dangling images, and build cache) and docker image prune (which deletes dangling images; add -a to delete every image not used by a container). Note that neither touches volumes unless you explicitly ask with docker volume prune or the --volumes flag.

The danger here is that a container which happens to be stopped (due to error, accident, or planned downtime) will be removed by the above commands. For this reason I run them by hand, and never recommend automatic updates (which often bundle automatic cleanup) on important services.
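With that warning in mind, my periodic cleanup looks like this. Each command lists what it is about to delete and asks for confirmation:

linodevm:~# docker system prune
linodevm:~# docker image prune -a
linodevm:~# docker volume prune

The last one deletes data in unused volumes, so be especially careful with it.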
