Lab 3
Overview
By default all files created inside a container are stored on a writable container layer. That means that:
If the container no longer exists, the data is lost,
The container's writable layer is tightly coupled to the host machine, and
To manage the file system, you need a storage driver that provides a union file system, using the Linux kernel. This extra abstraction reduces performance compared to data volumes, which write directly to the filesystem.
Docker provides two options to store files on the host machine: volumes and bind mounts. If you're running Docker on Linux, you can also use a tmpfs mount, and with Docker on Windows you can also use a named pipe.

Volumes are stored in the host filesystem that is managed by Docker.
Bind mounts are stored anywhere on the host system.
tmpfs mounts are stored in the host memory only.
Originally, the --mount flag was used for Docker Swarm services and the --volume flag was used for standalone containers. Since Docker 17.06, you can also use --mount for standalone containers; it is generally more explicit and verbose than --volume.
Volumes
A data volume or volume is a directory that bypasses the Union File System of Docker.
There are three types of volumes:
anonymous volume,
named volume, and
host volume.
Anonymous Volume
Let's create an instance of a popular open source NoSQL database called CouchDB and use an anonymous volume to store the data files for the database.
To run an instance of CouchDB, use the CouchDB image from Docker Hub at https://hub.docker.com/_/couchdb. The docs say that the default for CouchDB is to write the database files to disk on the host system using its own internal volume management.
Run the following command,
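The command itself did not survive on this page. A minimal sketch of what it could look like, assuming a container named my-couchdb, the couchdb:3.1 image tag, and placeholder admin credentials (none of these are given in the text). It is wrapped in a hypothetical shell function so the step can be repeated:

```shell
# Start CouchDB detached. No -v flag is given, so Docker creates an
# anonymous volume for the data directory that the image declares.
# Container name, image tag, and credentials are assumptions.
start_couchdb() {
  docker run -d -p 5984:5984 --name my-couchdb \
    -e COUCHDB_USER=admin -e COUCHDB_PASSWORD=passw0rd1 \
    couchdb:3.1
}
```

Call start_couchdb to run the step.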
Docker will create an anonymous volume for the container and generate a hashed name for it. Check the volumes on your host system,
Set an environment variable VOLUME with the value of the generated name,
Then inspect the volume that was created, using the hash name that was generated for it,
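The three steps above (list, capture the name, inspect) could be sketched as follows. This assumes the CouchDB container is named my-couchdb, which the text does not state. Instead of copying the hash by hand, the sketch reads it from the container's mount list:

```shell
# Assumes the CouchDB container is named my-couchdb (an assumption).
find_couchdb_volume() {
  # Show every volume; an anonymous one has a long hexadecimal name.
  docker volume ls
  # Read the volume name from the container's mounts.
  VOLUME=$(docker inspect --format '{{ range .Mounts }}{{ .Name }}{{ end }}' my-couchdb)
  export VOLUME
  # Print the volume's metadata, including its Mountpoint.
  docker volume inspect "$VOLUME"
}
```

Call find_couchdb_volume in your current shell (not in a subshell) so that VOLUME stays set afterwards.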
You see that Docker has created and manages a volume in the Docker host filesystem under /var/lib/docker/volumes/$VOLUME/_data. Note that this path belongs to the Docker-managed filesystem, so you may not be able to browse it directly from your own shell, for example when Docker runs inside a virtual machine.
Create a new database mydb and insert a new document with a hello world message.
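A sketch of this step using CouchDB's HTTP API through curl. The credentials, the document id 1, and the msg field are assumptions chosen for illustration:

```shell
# Base URL with the assumed admin credentials and the published port 5984.
COUCHDB_URL="http://admin:passw0rd1@127.0.0.1:5984"

create_hello_doc() {
  # PUT /{db} creates the database.
  curl -X PUT "$COUCHDB_URL/mydb"
  # PUT /{db}/{docid} stores a JSON document under the id "1".
  curl -X PUT -H "Content-Type: application/json" \
    "$COUCHDB_URL/mydb/1" -d '{"msg": "hello world"}'
}
```

Call create_hello_doc once the CouchDB instance is reachable.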
Stop the container and start the container again,
Retrieve the document in the database to test that the data was persisted,
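The restart and the read-back could look like this. The container name, credentials, and document id are the same assumptions as before, and the five-second pause is only a guess at how long CouchDB needs to come back up:

```shell
restart_and_read() {
  # Stopping and starting keeps the container and its volume intact.
  docker stop my-couchdb
  docker start my-couchdb
  # Give CouchDB a moment to accept connections again (delay is a guess).
  sleep 5
  # GET /{db}/{docid} returns the stored document.
  curl -X GET "http://admin:passw0rd1@127.0.0.1:5984/mydb/1"
}
```

If the document comes back, the data survived the restart because it lives in the volume rather than only in a running process.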
Sharing Volumes
You can share an anonymous volume with another container by using the --volumes-from option.
Create a busybox container with an anonymous volume mounted to a directory /data in the container, and using shell commands, write a message to a log file.
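One non-interactive way to do this, as a sketch. The log file name hi.log and the message are assumptions; the container name busybox1 and the mount path /data come from the text:

```shell
create_busybox1() {
  # -v /data with no source creates an anonymous volume mounted at /data.
  # The container exits once the command finishes, which leaves it
  # stopped but not removed.
  docker run --name busybox1 -v /data busybox \
    sh -c 'echo "hello from busybox1" > /data/hi.log'
}
```

After calling create_busybox1, docker ps -a should list busybox1 with an Exited status.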
Make sure the container busybox1 is stopped but not removed.
Then create a second busybox container named busybox2 using the --volumes-from option to share the volume created by busybox1,
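A sketch of the second container, assuming busybox1 wrote a file /data/hi.log (a hypothetical file name):

```shell
read_from_busybox2() {
  # --volumes-from mounts every volume of busybox1 at the same paths.
  # --rm removes busybox2 again once cat has printed the file.
  docker run --rm --name busybox2 --volumes-from busybox1 busybox \
    cat /data/hi.log
}
```

Calling read_from_busybox2 should print the message that busybox1 wrote, which shows that both containers see the same volume.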
Docker created the anonymous volume that you were able to share using the --volumes-from option, and created a new anonymous volume.
Clean up the existing volumes and containers.
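A possible cleanup, assuming the containers are named my-couchdb and busybox1 (the first name is an assumption):

```shell
cleanup_anonymous() {
  # -f stops running containers; -v also removes the anonymous
  # volumes associated with them.
  docker rm -f -v my-couchdb busybox1
  # Confirm that the anonymous volumes are gone.
  docker volume ls
}
```

Call cleanup_anonymous before moving on, so the next section can reuse the container name.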
Named Volume
A named volume and anonymous volume are similar in that Docker manages where they are located. However, a named volume can be referenced by name when mounting it to a container directory. This is helpful if you want to share a volume across multiple containers.
First, create a named volume,
Verify the volume was created,
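Creating and verifying the volume could be sketched like this, using the volume name my-couchdb-data-volume that the lab refers to later:

```shell
create_named_volume() {
  docker volume create my-couchdb-data-volume
  # Verify: the volume should be listed by name, and inspect shows
  # its driver and mountpoint.
  docker volume ls
  docker volume inspect my-couchdb-data-volume
}
```

Call create_named_volume to run all three commands in order.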
Now create the CouchDB container using the named volume,
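A sketch of the run command with the named volume mounted at CouchDB's data directory. The container name, image tag, and credentials are assumptions:

```shell
start_couchdb_named() {
  # name:path mounts the named volume at the container path.
  docker run -d -p 5984:5984 --name my-couchdb \
    -v my-couchdb-data-volume:/opt/couchdb/data \
    -e COUCHDB_USER=admin -e COUCHDB_PASSWORD=passw0rd1 \
    couchdb:3.1
}
```

Call start_couchdb_named to start the container.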
Wait until the CouchDB container is running and the instance is available.
Create a new database mydb and insert a new document with a hello world message.
It is now easy to share the volume with another container. For instance, read the content of the volume using the busybox image, and share the my-couchdb-data-volume volume by mounting the volume to a directory in the busybox container.
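For example, the following sketch lists the volume's content from a throwaway busybox container. The mount path /myvolume is an arbitrary choice, and the :ro suffix mounts the volume read-only since the container only needs to look at it:

```shell
list_couchdb_volume() {
  # Mount the named volume read-only at /myvolume and list it.
  docker run --rm -v my-couchdb-data-volume:/myvolume:ro busybox \
    ls -al /myvolume
}
```

Call list_couchdb_volume while or after CouchDB has written to the volume.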
You can check the Docker-managed filesystem for volumes by running a busybox container in privileged mode with the PID namespace set to host, which lets you inspect the host system and browse to the Docker-managed directories.
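One way this might be done is with the nsenter applet that ships in busybox. This is a sketch under the assumption that the Docker data root is the default /var/lib/docker:

```shell
browse_docker_volumes() {
  # --pid=host shares the host PID namespace. nsenter then enters the
  # mount namespace of PID 1 (-t 1 -m), so host paths become visible.
  docker run --rm --privileged --pid=host busybox \
    nsenter -t 1 -m ls -l /var/lib/docker/volumes
}
```

Call browse_docker_volumes; the listing should include a directory named my-couchdb-data-volume.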
Cleanup,
Host Volume
When you want to access the volume directory easily from the host machine directly instead of using the Docker managed directories, you can create a host volume.
Let's use a directory in the current working directory (indicated with the command pwd) called data, or choose your own data directory on the host machine, e.g. /home/couchdb/data. We let docker create the $(pwd)/data directory if it does not exist yet. We mount the host volume inside the CouchDB container to the container directory /opt/couchdb/data, which is the default data directory for CouchDB.
Run the following command,
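A sketch of the command, with the same assumed container name, image tag, and credentials as in the earlier steps:

```shell
start_couchdb_hostdir() {
  # An absolute host path on the left side of -v makes this a host
  # volume; Docker creates the directory if it does not exist yet.
  docker run -d -p 5984:5984 --name my-couchdb \
    -v "$(pwd)/data:/opt/couchdb/data" \
    -e COUCHDB_USER=admin -e COUCHDB_PASSWORD=passw0rd1 \
    couchdb:3.1
}
```

Call start_couchdb_hostdir from the directory in which you want the data folder to appear.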
Verify that a directory data was created,
and that CouchDB has created data files here,
Also check that Docker created no managed volume this time, because you are now using a host volume,
and
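The checks above could be sketched as follows, assuming the container is named my-couchdb and that you are in the directory that holds data:

```shell
verify_host_volume() {
  # The data directory should exist and contain CouchDB's files.
  ls -al data
  # No new Docker-managed volume should be listed.
  docker volume ls
  # The mount should be reported with "Type":"bind".
  docker inspect --format '{{ json .Mounts }}' my-couchdb
}
```

Call verify_host_volume after the container has started.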
Create a new database mydb and insert a new document with a hello world message.
Note that CouchDB created a folder shards,
List the content of the shards directory,
and the first shard,
A shard is a horizontal partition of data in a database. Partitioning data into shards and distributing copies of each shard to different nodes in a cluster gives the data greater durability against node loss. CouchDB automatically shards databases and distributes the subsets of documents among nodes.
Cleanup,
Bind Mounts
The mount syntax is recommended by Docker over the volume syntax. Bind mounts have limited functionality compared to volumes. A file or directory is referenced by its full path on the host machine when mounted into a container. Bind mounts rely on the host machine’s filesystem having a specific directory structure available and you cannot use the Docker CLI to manage bind mounts. Note that bind mounts can change the host filesystem via processes running in a container.
Instead of using the -v syntax with three fields separated by colons (:), the mount syntax is more verbose and uses multiple key-value pairs:
type: bind, volume or tmpfs,
source: path to the file or directory on host machine,
destination: path in container,
readonly,
bind-propagation: rprivate, private, rshared, shared, rslave, slave,
consistency: consistent, delegated, cached,
mount.
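As an illustration, the host-directory setup from the previous section could be written with --mount as sketched below. The container name, image tag, and credentials are assumptions. Unlike -v, a bind mount given through --mount fails if the source path does not exist, so the sketch creates it first:

```shell
start_couchdb_bind_mount() {
  # --mount does not create a missing source directory.
  mkdir -p "$(pwd)/data"
  docker run -d -p 5984:5984 --name my-couchdb \
    --mount "type=bind,source=$(pwd)/data,destination=/opt/couchdb/data" \
    -e COUCHDB_USER=admin -e COUCHDB_PASSWORD=passw0rd1 \
    couchdb:3.1
}
```

Appending ,readonly to the --mount value would mount the directory read-only, which CouchDB itself could not use but which suits containers that only read shared files.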
[Optional] OverlayFS
OverlayFS is a union mount filesystem implementation for Linux. To understand what a Docker volume is, it helps to understand how layers and the filesystem work in Docker.
To start a container, Docker takes the read-only image and creates a new read-write layer on top. To view the layers as one, Docker uses a Union File System or OverlayFS (Overlay File System), specifically the overlay2 storage driver.
To see Docker host managed files, you need access to the Docker process file system. Using the --privileged and --pid=host flags you can access the host's process ID namespace from inside a container like busybox. You can then browse to Docker's /var/lib/docker/overlay2 directory to see the downloaded layers that are managed by Docker.
To view the current list of layers in Docker,
Pull down the ubuntu image and check again,
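These two steps could be sketched as follows, assuming the default data root /var/lib/docker and the overlay2 storage driver. The function names are hypothetical:

```shell
list_overlay2_layers() {
  # Enter the host mount namespace (PID 1) and list the layer directories.
  docker run --rm --privileged --pid=host busybox \
    nsenter -t 1 -m ls -l /var/lib/docker/overlay2
}

pull_and_compare() {
  list_overlay2_layers   # before the pull
  docker pull ubuntu
  list_overlay2_layers   # after the pull: new layer directories appear
}
```

Call pull_and_compare and compare the two listings.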
You see that pulling down the ubuntu image implicitly pulled down 4 new layers,
a611792b4cac502995fa88a888261dfba0b5d852e72f9db9e075050991423779
d181f1a41fc35a45c16e8bfcb8eee6f768f3b98f82210a43ea65f284a45fcd65
dac2f37f6280a076836d39b87b0ae5ebf5c0d386b6d8b991b103aadbcebaa7c6
f3e921b440c37c86d06cd9c9fb70df50edad553c36cc87f84d5eeba734aae709
The overlay2 storage driver in essence layers different directories on the host and presents them as a single directory.
base layer or lowerdir,
diff layer or upperdir,
overlay layer (user view), and
workdir.
OverlayFS refers to the lower directories as lowerdir, which contains the base image and the read-only (R/O) layers that are pulled down.
The upper directory is called upperdir and is the read-write (R/W) container layer.
The unified view or overlay layer is called merged.
Finally, a workdir is required, which is an empty directory that OverlayFS uses internally.
The overlay2 driver supports up to 128 lower OverlayFS layers. The l directory contains shortened layer identifiers as symbolic links.
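To see these directories for a concrete container, you can ask Docker for the storage driver data. A small sketch, assuming the overlay2 storage driver is in use; the function name is hypothetical and the container name is passed as an argument:

```shell
show_overlay_dirs() {
  # With overlay2, the JSON output contains the LowerDir, UpperDir,
  # MergedDir, and WorkDir paths of the given container.
  docker inspect --format '{{ json .GraphDriver.Data }}' "$1"
}
```

For example, show_overlay_dirs my-couchdb prints the paths for a container named my-couchdb.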

Cleanup,