Persisting Storage in Docker

Generally speaking, Docker containers should have everything they need access to baked into the image. There are times, however, when it may be necessary to provide additional files or directories to the container to persist information. These can include, but are not limited to:

  • Configuration files
  • Data persistence (usually only for local databases for development)
  • Application package hotswap during development
  • Saving artifacts generated by the application

Docker has two ways to provide such storage: bind mounts and volumes.

Bind Mounts

Bind mounts are used to provide access to a directory on the host machine. On a Linux host, Docker allows you to bind a user-defined directory into the root filesystem of the container, effectively doing the equivalent of mount --bind to link your directory directly into the container’s filesystem. This is ideal for providing custom configuration files or saving build artifacts to your host directory. To mount a directory into a container, execute the following on an example container:

mkdir -p testsite
echo "Hello, world!" > testsite/index.html
docker run -d --rm --name test --mount type=bind,source=$(pwd)/testsite,target=/usr/share/nginx/html -p 80:80 nginx

This creates a simple dummy site, pulls down the nginx image, and runs it with the content of our simple site. If you open your web browser to http://localhost, you will see the “Hello, world!” message that we left in our sample directory. Alternatively, instead of the --mount option, you can use the older style -v syntax:

docker run -d --rm --name test -v $(pwd)/testsite:/usr/share/nginx/html -p 80:80 nginx

It is recommended that you use the --mount option, as it is more explicit in its definition. The -v option is the older shorthand and remains supported.
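One place where the explicitness of --mount helps is read-only mounts. The sketch below is only an illustration: the helper names bind_spec and run_site_readonly, the container name test-ro, and host port 8080 are assumptions made for this example, not part of the tutorial's setup.

```shell
# Build a bind-mount spec for --mount; pass "ro" as the third argument
# to request a read-only mount.
bind_spec() {
  spec="type=bind,source=$1,target=$2"
  if [ "${3:-rw}" = "ro" ]; then
    spec="${spec},readonly"
  fi
  printf '%s\n' "${spec}"
}

# Serve ./testsite read-only. The container name "test-ro" and host port
# 8080 are arbitrary choices for this sketch.
run_site_readonly() {
  docker run -d --rm --name test-ro \
    --mount "$(bind_spec "$(pwd)/testsite" /usr/share/nginx/html ro)" \
    -p 8080:80 nginx
}

# The same mount in -v form appends :ro to the mapping:
#   docker run -d --rm --name test-ro \
#     -v "$(pwd)/testsite:/usr/share/nginx/html:ro" -p 8080:80 nginx
```

After calling run_site_readonly, an attempt to write from inside the container, such as docker exec test-ro touch /usr/share/nginx/html/new.txt, should fail because the mount is read-only.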

We can inspect the container and see that the mount is defined for it by using docker inspect test:

...
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/tmp/testsite",
                "Destination": "/usr/share/nginx/html",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
...
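The full docker inspect output is long, so it can help to filter it down to the mounts. The following is a small sketch using the --format option with Go templates; the function names show_mounts and list_mounts are made up for this example, and the container name defaults to test as above.

```shell
# Print only the Mounts array of a container as JSON.
# The container name defaults to "test", matching the example above.
show_mounts() {
  docker inspect --format '{{ json .Mounts }}' "${1:-test}"
}

# Print one "type source destination" line per mount.
list_mounts() {
  docker inspect \
    --format '{{ range .Mounts }}{{ println .Type .Source .Destination }}{{ end }}' \
    "${1:-test}"
}
```

Running show_mounts test while the container is up should print the same Mounts entries shown above, without the rest of the inspect output.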

We can specify bind mounts in a compose file like this:

version: "3.2"
services:
  web:
    image: nginx:alpine
    volumes:
      - type: bind
        source: ./testsite
        target: /usr/share/nginx/html

You can also map an individual file onto a container, but it is rare to do so. If using the -v syntax, note that if the file is missing, a directory will be created with the name of the file that you specify. This can be confounding if you use this in a Compose file. More can be found on the Docker website:

https://docs.docker.com/storage/bind-mounts/
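One way to avoid the surprise directory is to check that the file exists before starting the container. This is a minimal sketch under assumptions: the helper names require_file and run_with_config, the container name test-conf, and the local nginx.conf file are hypothetical. With --mount type=bind, Docker reports an error for a missing source instead of creating a directory, so the guard matters mostly for the -v form.

```shell
# Refuse to continue unless the given path is an existing regular file,
# so -v never silently creates a directory in its place.
require_file() {
  if [ ! -f "$1" ]; then
    echo "error: $1 is not an existing file" >&2
    return 1
  fi
}

# Hypothetical example: mount a single nginx config file read-only.
run_with_config() {
  config_file="$(pwd)/nginx.conf"
  require_file "${config_file}" || return 1
  docker run -d --rm --name test-conf \
    -v "${config_file}:/etc/nginx/nginx.conf:ro" \
    -p 8080:80 nginx
}
```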

Bind mounts on Docker for Mac do not use native bind mounting; instead, Docker for Mac uses osxfs to provide a near-native experience. It is still slower than a native bind mount on Linux, but it should work seamlessly with local HFS+ filesystems. By default, it only has access to the /Users, /Volumes, /private, and /tmp directories. See the osxfs documentation on Docker’s official website for details:

https://docs.docker.com/docker-for-mac/osxfs/

Docker Volumes

Docker volumes are file system mounts that are managed completely by the Docker engine. Historically, these have been called “named volumes,” in case you see that term in literature, command line help, or error messages. When a Docker volume is created, its data is stored under the /var/lib/docker/volumes/ directory. The typical use case for a named volume is data persistence or sharing data between containers. Let’s dig out the Pastr app from the first tutorial. We’ll add the mount in the docker-compose.yml file:

  database:
    image: redis:latest
    volumes:
      - type: volume
        source: pastrdatastore
        target: /data
    ports:
      - "6379:6379"
...
volumes:
  pastrdatastore:

The top-level volumes directive (at the bottom of the snippet) denotes that a datastore shall be created via this compose file. Once you start the containers with docker-compose up -d, the Docker engine creates the pastrdatastore volume, prefixed with the Compose project name:

$ docker volume ls
DRIVER              VOLUME NAME
local               pastr_pastrdatastore
$ docker volume inspect pastr_pastrdatastore 
[
    {
        "CreatedAt": "2018-09-25T20:13:41-05:00",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "pastr",
            "com.docker.compose.version": "1.22.0",
            "com.docker.compose.volume": "pastrdatastore"
        },
        "Mountpoint": "/var/lib/docker/volumes/pastr_pastrdatastore/_data",
        "Name": "pastr_pastrdatastore",
        "Options": null,
        "Scope": "local"
    }
]

Docker creates the volume in the /var/lib/docker/volumes directory. This volume is mounted to the database container, which we see when we inspect it with docker inspect pastr_database_1:

        "Mounts": [
            {
                "Type": "volume",
                "Name": "pastr_pastrdatastore",
                "Source": "/var/lib/docker/volumes/pastr_pastrdatastore/_data",
                "Destination": "/data",
                "Driver": "local",
                "Mode": "rw",
                "RW": true,
                "Propagation": ""
            }
        ],

Note that on a Linux machine, this volume exists on the native filesystem. On a Windows or Mac system, however, it exists within the virtual machine. You can’t access it directly there, nor should you try to, even on a Linux machine. If you need to inspect the contents of the data store, you can mount it into a throwaway container with docker run -it --rm --mount source=pastr_pastrdatastore,destination=/mnt ubuntu /bin/bash.
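For a quick, non-interactive look at a volume, a throwaway container can simply list the files. The sketch below assumes the alpine image and a helper name, list_volume, chosen for this example. It mounts the volume read-only so that looking at the data cannot change it.

```shell
# List the files stored in a named volume using a throwaway Alpine container.
# The volume is mounted read-only so the inspection cannot alter the data.
list_volume() {
  docker run --rm \
    --mount "type=volume,source=$1,target=/mnt,readonly" \
    alpine ls -la /mnt
}
```

For the Pastr example this would be called as list_volume pastr_pastrdatastore.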

For more details, please see the official Docker documentation:

https://docs.docker.com/storage/volumes/#start-a-service-with-volumes
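Volumes can also be managed by hand, without a compose file. The following sketch walks through the lifecycle with the docker volume subcommands: create a volume, write to it from one container, read it back from a second container, and remove it. The function name volume_demo and the default volume name demo_data are hypothetical.

```shell
# Create a named volume, write to it from one container, read the data
# back from a second container, then remove the volume again.
volume_demo() {
  volume_name="${1:-demo_data}"   # hypothetical volume name
  docker volume create "${volume_name}"
  docker run --rm --mount "type=volume,source=${volume_name},target=/data" \
    alpine sh -c 'echo "persisted" > /data/note.txt'
  docker run --rm --mount "type=volume,source=${volume_name},target=/data" \
    alpine cat /data/note.txt
  docker volume rm "${volume_name}"
}
```

The second container is a fresh one, so the fact that it can read note.txt shows that the data lives in the volume rather than in either container.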

Native Docker Linux vs Hypervisor Docker for Mac and Windows

Before I start delving further into Docker tutorials, I feel that I should go over the differences between Docker running natively on Linux versus running Docker on virtual machines on Mac and Windows.

Docker for Linux (Native)

Natively, Docker runs on Linux, taking advantage of direct access to the host Linux kernel. You can prove this by running the following:

$ uname -a
Linux myhost 4.4.0-127-generic #153-Ubuntu SMP Sat May 19 10:58:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ docker run --rm --entrypoint="uname" alpine -a
Linux 1a00a2571242 4.4.0-127-generic #153-Ubuntu SMP Sat May 19 10:58:46 UTC 2018 x86_64 Linux

The second command looks odd because of a quirk of the --entrypoint option: arguments for the entrypoint go after the image name. But it still shows us that the kernel your application thinks it’s running on is actually the kernel of the host machine.

Let’s also take a look at the network stack.

$ docker run --detach --rm --name test dockercloud/hello-world
27ef4ca7a0dd68ed54e37bb828e978c004a28185760b20197b7aa04a96aaa2f3
$ docker inspect test
[
    {
        "Id": "27ef4ca7a0dd68ed54e37bb828e978c004a28185760b20197b7aa04a96aaa2f3",
        ...
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "adf503ce45806a338588cfdac016688007cfd95d603c8aba44c28d37e95baa46",
            "Ports": {
                "80/tcp": null
            },
            "SandboxKey": "/var/run/docker/netns/adf503ce4580",
            "Gateway": "172.17.0.1",
            "IPAddress": "172.17.0.2",
            "MacAddress": "02:42:ac:11:00:02",
            "Networks": {
                "bridge": {
                    "Gateway": "172.17.0.1",
                    "IPAddress": "172.17.0.2",
                    "MacAddress": "02:42:ac:11:00:02",
                    "DriverOpts": null
...
]
$ ip addr
...
20: vetha777d55@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether f2:b3:ec:e8:43:4a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::f0b3:ecff:fee8:434a/64 scope link 
       valid_lft forever preferred_lft forever

I stripped out a lot of extraneous information but kept in the important bits. When you look at your network interfaces, you’ll see your normal loopback and ethernet, but you’ll also notice a veth device that wasn’t there before. This device is the host end of a virtual Ethernet pair attached to the docker0 bridge; the other end is the container’s own interface, which carries the MAC and IP addresses shown in the docker inspect output. You can actually ping the container’s IP address or reach its open port (172.17.0.2:80) in your web browser without having to do any port forwarding.
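Rather than reading the address out of the full inspect output, it can be extracted with a format template. This is a small sketch; the function name container_ip is an assumption for this example, and the container name defaults to test as above.

```shell
# Print the IP address(es) of a container on the networks it is attached to.
# The container name defaults to "test", matching the example above.
container_ip() {
  docker inspect \
    --format '{{ range .NetworkSettings.Networks }}{{ .IPAddress }}{{ end }}' \
    "${1:-test}"
}
```

On a native Linux host, something like curl "http://$(container_ip test)/" should then reach the container directly, with no port forwarding involved.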

Docker for Mac

When running Docker on macOS, if we try to look up the kernel, we get the following:

>> uname -a
Darwin myhost 17.6.0 Darwin Kernel Version 17.6.0: Tue May  8 15:22:16 PDT 2018; root:xnu-4570.61.1~1/RELEASE_X86_64 x86_64
>> docker run --rm --entrypoint="uname" alpine -a
Linux 30c60b56067d 4.9.87-linuxkit-aufs #1 SMP Wed Mar 14 15:12:16 UTC 2018 x86_64 Linux

Well, that’s not what we were looking for. You can clearly see that the kernel isn’t the same. What’s actually happening is that Docker for Mac is spinning up a virtual machine. It uses the built-in macOS Hypervisor framework, allowing an application to run virtualized processes with rather little overhead. The virtual machine runs, as you can see, LinuxKit, which was created by the folks at Docker to build lightweight Linux distributions for running the Docker engine. As such, you can adjust the VM settings via the preferences in the notification indicator menu, allocating the appropriate number of cores and amount of memory.
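The kernel comparison can also be scripted to tell which situation a given machine is in. The sketch below is a rough heuristic and not an official check: it assumes the alpine image, and the function names same_kernel and check_docker_kernel are made up for this example.

```shell
# Succeed when two kernel release strings are identical.
same_kernel() {
  [ "$1" = "$2" ]
}

# Compare the host kernel release with the one a container reports.
check_docker_kernel() {
  host_kernel="$(uname -r)"
  container_kernel="$(docker run --rm alpine uname -r)"
  if same_kernel "${host_kernel}" "${container_kernel}"; then
    echo "native: containers share the host kernel (${host_kernel})"
  else
    echo "virtualized: host=${host_kernel} container=${container_kernel}"
  fi
}
```

Here uname -r is passed as the container command rather than through --entrypoint, which keeps the arguments in their usual order.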

What this means is that with Docker for Mac, you do not have direct access to the containers’ network stack from the host, nor do you have native file mounts. If you mount a local host directory to your container, you can expect your application to run about four times slower than if you baked the contents of that directory into the image or used a named volume.

The advantage that Docker for Mac has over the older Docker Toolbox method is that instead of having to pass commands via a TCP connection to a port on the VirtualBox instance, information is passed along a much speedier and more reliable Unix socket. See Docker’s official documentation on Docker for Mac for more details: https://docs.docker.com/docker-for-mac/docker-toolbox/

Docker for Windows

Docker for Windows operates much the same way as Docker for Mac. It utilizes Hyper-V to spin up a hardware virtualization layer and run LinuxKit, and it has similar limitations to the Docker for Mac installation. Additionally, you will have to enable file sharing in the Docker for Windows settings for the drives you want to use. You will also need to make sure your firewall allows connections from the Docker virtual machine to the host Windows system. See the following links for more detail:

https://docs.docker.com/docker-for-windows/install/
https://success.docker.com/article/error-a-firewall-is-blocking-file-sharing-between-windows-and-the-containers