Dockerfiles
A Dockerfile is a set of instructions that tells Docker how to build an image. It defines the base image to start from and the specific steps required to configure the image. Let's take a look at a sample Dockerfile and walk through what it is doing. Listing 1 shows the Dockerfile for the base CentOS image.
Listing 1. CentOS Dockerfile
FROM scratch
MAINTAINER The CentOS Project <cloud-ops@centos.org> - ami_creator
ADD centos-7-20150616_1752-docker.tar.xz /
# Volumes for systemd
# VOLUME ["/run", "/tmp"]
# Environment for systemd
# ENV container=docker
# For systemd usage this changes to /usr/sbin/init
# Keeping it as /bin/bash for compatibility with previous
CMD ["/bin/bash"]
You'll note that most of this Dockerfile is comments. It has four specific instructions:
- FROM scratch: All Dockerfiles are derived from a base image. In this case, the CentOS image is derived from "scratch," which is the root of all images. Essentially this is a no-op, indicating that this is one of Docker's root images.
- MAINTAINER ...: The MAINTAINER directive identifies the owner of the image. In this case the owner is the CentOS Project.
- ADD centos...tar.xz: The ADD directive tells Docker to upload the specified file to the image and, if it is compressed, to decompress it at the specified location. In this example, Docker uploads an xz-compressed TAR file of the CentOS operating system and extracts it at the root of the file system (/).
- CMD ["/bin/bash"]: Finally, the CMD directive tells Docker what command to execute when a container is started from the image. In this case, it starts a Bourne Again Shell (bash).
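To see the result of this Dockerfile in action, you don't need to build it yourself; you can pull the published CentOS image and run it, which drops you straight into the bash shell named in the CMD directive (a quick illustration, assuming the centos:7 tag that corresponds to this release):
docker pull centos:7
docker run --rm -it centos:7   # starts /bin/bash, per the CMD instruction; type exit to leave the container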
Now that you have a sense of what a Dockerfile looks like, let's explore the official Tomcat Dockerfile. Figure 2 shows this file's hierarchy.
Figure 2. Tomcat Dockerfile hierarchy
This hierarchy might not be as simple as you would have anticipated, but when we take it apart piece by piece it is actually quite logical. You already know that the root of all Dockerfiles is scratch, so seeing that as the root is no surprise. The first meaningful image is debian:jessie. The Docker official images are built from build packs, or standard images, which means Docker does not need to reinvent the wheel every time it creates a new image; it has a solid base from which to start building new images. In this case, debian:jessie is simply a Debian Linux image installed just like the CentOS image we just looked at. It includes the three lines shown in Listing 2.
Listing 2. debian:jessie Dockerfile
FROM scratch
ADD rootfs.tar.xz /
CMD ["/bin/bash"]
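If you want to poke at this base image directly, you can run a one-off command in a throwaway container; for example (debian:jessie is the same tag referenced by the Dockerfiles that follow):
docker run --rm debian:jessie cat /etc/debian_version   # prints the Debian 8 (jessie) point release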
On top of that we see two additional installations: curl and source code management (SCM) tools. The Dockerfile for buildpack-deps:jessie-curl is shown in Listing 3.
Listing 3. buildpack-deps:jessie-curl Dockerfile
FROM debian:jessie
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
curl \
wget \
&& rm -rf /var/lib/apt/lists/*
This Dockerfile uses apt-get to install curl and wget so that the image will be able to download software from other servers. The RUN directive tells Docker to execute the specified command in an intermediate container while the image is being built. In this case it refreshes the package index (apt-get update) and then runs apt-get install to download and install curl and wget.
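You can verify that these tools really are baked into the image by running them in throwaway containers, for example:
docker run --rm buildpack-deps:jessie-curl curl --version
docker run --rm buildpack-deps:jessie-curl wget --version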
The Dockerfile for buildpack-deps:jessie-scm is shown in Listing 4.
Listing 4. buildpack-deps:jessie-scm Dockerfile
FROM buildpack-deps:jessie-curl
RUN apt-get update && apt-get install -y --no-install-recommends \
bzr \
git \
mercurial \
openssh-client \
subversion \
&& rm -rf /var/lib/apt/lists/*
This Dockerfile installs source code management tools, such as Git, Mercurial, and Subversion, following the same model as the jessie-curl
Dockerfile we just looked at.
The Java Dockerfile is a little more complicated; it is shown in Listing 5.
Listing 5. Java Dockerfile
FROM buildpack-deps:jessie-scm
# A few problems with compiling Java from source:
# 1. Oracle. Licensing prevents us from redistributing the official JDK.
# 2. Compiling OpenJDK also requires the JDK to be installed, and it gets
# really hairy.
RUN apt-get update && apt-get install -y unzip && rm -rf /var/lib/apt/lists/*
RUN echo 'deb http://httpredir.debian.org/debian jessie-backports main' > /etc/apt/sources.list.d/jessie-backports.list
# Default to UTF-8 file.encoding
ENV LANG C.UTF-8
ENV JAVA_VERSION 8u66
ENV JAVA_DEBIAN_VERSION 8u66-b01-1~bpo8+1
# see https://bugs.debian.org/775775
# and https://github.com/docker-library/java/issues/19#issuecomment-70546872
ENV CA_CERTIFICATES_JAVA_VERSION 20140324
RUN set -x \
&& apt-get update \
&& apt-get install -y \
openjdk-8-jdk="$JAVA_DEBIAN_VERSION" \
ca-certificates-java="$CA_CERTIFICATES_JAVA_VERSION" \
&& rm -rf /var/lib/apt/lists/*
# see CA_CERTIFICATES_JAVA_VERSION notes above
RUN /var/lib/dpkg/info/ca-certificates-java.postinst configure
# If you're reading this and have any feedback on how this image could be
# improved, please open an issue or a pull request so we can discuss it!
Essentially, this Dockerfile runs apt-get install -y openjdk-8-jdk to download and install Java, with version pinning and CA-certificate configuration added to make the installation reproducible and secure. The ENV directives set environment variables, such as the locale and the package versions, that are needed during the installation and remain set in the resulting image.
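If you'd like to confirm what those ENV values look like inside the resulting image, you can run a quick check against the official java image (purely illustrative, not part of the build):
docker run --rm java:8 sh -c 'echo $LANG && java -version'   # prints C.UTF-8 followed by the OpenJDK 8 version banner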
Finally, Listing 6 shows the Tomcat Dockerfile.
Listing 6. Tomcat Dockerfile
FROM java:7-jre
ENV CATALINA_HOME /usr/local/tomcat
ENV PATH $CATALINA_HOME/bin:$PATH
RUN mkdir -p "$CATALINA_HOME"
WORKDIR $CATALINA_HOME
# see https://www.apache.org/dist/tomcat/tomcat-8/KEYS
RUN gpg --keyserver pool.sks-keyservers.net --recv-keys \
05AB33110949707C93A279E3D3EFE6B686867BA6 \
07E48665A34DCAFAE522E5E6266191C37C037D42 \
47309207D818FFD8DCD3F83F1931D684307A10A5 \
541FBE7D8F78B25E055DDEE13C370389288584E7 \
61B832AC2F1C5A90F0F9B00A1C506407564C17A3 \
79F7026C690BAA50B92CD8B66A3AD3F4F22C4FED \
9BA44C2621385CB966EBA586F72C284D731FABEE \
A27677289986DB50844682F8ACB77FC2E86E29AC \
A9C5DF4D22E99998D9875A5110C01C5A2F6059E7 \
DCFD35E0BF8CA7344752DE8B6FB21E8933C60243 \
F3A04C595DB5B6A5F1ECA43E3B7BBB100D811BBE \
F7DA48BB64BCB84ECBA7EE6935CD23C10D498E23
ENV TOMCAT_MAJOR 8
ENV TOMCAT_VERSION 8.0.26
ENV TOMCAT_TGZ_URL https://www.apache.org/dist/tomcat/tomcat-$TOMCAT_MAJOR/v$TOMCAT_VERSION/bin/apache-tomcat-$TOMCAT_VERSION.tar.gz
RUN set -x \
&& curl -fSL "$TOMCAT_TGZ_URL" -o tomcat.tar.gz \
&& curl -fSL "$TOMCAT_TGZ_URL.asc" -o tomcat.tar.gz.asc \
&& gpg --verify tomcat.tar.gz.asc \
&& tar -xvf tomcat.tar.gz --strip-components=1 \
&& rm bin/*.bat \
&& rm tomcat.tar.gz*
EXPOSE 8080
CMD ["catalina.sh", "run"]
Technically, Tomcat uses a Java 7 parent Dockerfile (the default or "latest" version of Java is 8, which is shown in Listing 5). This Dockerfile sets up the CATALINA_HOME
and PATH
environment variables using the ENV
directive. It then creates the CATALINA_HOME
directory by running the mkdir
command. The WORKDIR
directive changes the working directory to CATALINA_HOME. The RUN directive then chains several commands together in a single instruction:
- Download the Tomcat GZipped TAR file.
- Download the GPG signature (.asc file) for the TAR file.
- Verify the TAR file against that signature, using the keys imported by the earlier gpg --recv-keys step.
- Decompress the Tomcat TAR file.
- Remove all batch files (we're running in Linux after all).
- Remove the Tomcat file that we downloaded (since we've already decompressed it).
Defining all of these instructions in one command means that Docker sees a single instruction and caches the resulting layer as such. Docker detects when instructions need to be re-run by checking each instruction against its cache during the build, and it caches the result of each step as an optimization: if only the last step in a Dockerfile changes, Docker can start from the already-built layers and just apply that last step. The downside of chaining all of these commands into one instruction is that if any one of them changes, the entire instruction must be re-executed and its layer (along with every layer that follows) rebuilt.
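You can see the cache at work by building the same Dockerfile twice; the second build reuses cached layers for every unchanged instruction (the my-tomcat tag here is just an illustrative name):
docker build -t my-tomcat .   # first build executes every instruction
docker build -t my-tomcat .   # unchanged instructions are served from the cache and complete almost instantly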
The EXPOSE directive tells Docker which port the container listens on at runtime. As you saw when we launched this image earlier, we still need to tell Docker which host port to map to the container port (the -p argument); the EXPOSE directive is the mechanism for declaring which container ports are available to be mapped. Finally, the Dockerfile starts Tomcat by executing the catalina.sh script (which is on the PATH) with the run argument.
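As a quick illustration of the relationship between EXPOSE and -p, you can map any free host port onto the container's exposed port 8080; for example (substitute your own Docker host IP):
docker run -d -p 9090:8080 tomcat   # host port 9090 -> container port 8080
curl http://192.168.99.100:9090/    # should return Tomcat's default landing page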
Quick recap
Building the Dockerfile from inception to Tomcat was a long process, so let me summarize all the steps so far:
- Install Debian Linux.
- Add curl and wget.
- Add source code management tools in case you need them.
- Download and install Java.
- Download and install Tomcat.
- Expose port 8080 on the Docker instance.
- Run Tomcat by executing catalina.sh run.
At this point you should be a Dockerfile expert -- or at least somewhat dangerous! Next we'll try building a custom Docker image that contains our application.
Deploying a custom application to Docker
Because this tutorial is more about deploying a Java application to Docker and less about the actual Java application, I have built a very simple Hello World servlet. You can access the project on GitHub. The source code is nothing special; it's just a servlet that outputs "Hello, World!" What's more interesting is the accompanying Dockerfile, shown in Listing 7.
Listing 7. Dockerfile for the Hello World servlet
FROM tomcat
ADD deploy /usr/local/tomcat/webapps
It might not look like much, but you should already understand what the code in Listing 7 is doing:
- FROM tomcat indicates that this Dockerfile is based on the Tomcat image we explored in the previous section.
- ADD deploy /usr/local/tomcat/webapps tells Docker to copy all the files in the local deploy directory to /usr/local/tomcat/webapps, which is the home for web applications on this Tomcat image.
Clone the project locally and build it using the Maven command:
mvn clean install
This will create the file target/helloworld.war. Copy that file to the project's docker/deploy
directory (which you will need to create). Finally, you need to build the Docker image from the Dockerfile. Execute the following command from the project's docker
directory (which already contains the Dockerfile in Listing 7):
docker build -t lygado/docker-tomcat .
This command tells Docker to build a new image from the current working directory, which is indicated by the dot (.), and to tag it (-t) as lygado/docker-tomcat. In this case, lygado is my DockerHub username and docker-tomcat is the image name (you'll want to specify your own username). To see that the image has been built you can execute the docker images command:
$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
lygado/docker-tomcat latest ccb455fabad9 42 seconds ago 849.5 MB
Finally, you can launch the image with the docker run
command:
docker run -d -p 8080:8080 lygado/docker-tomcat
After the instance starts, you can access it through the following URL (be sure to substitute the IP address of the VirtualBox instance on your machine):
http://192.168.99.100:8080/helloworld/hello
Again, you can stop this Docker container instance with the docker stop INSTANCE_ID
command.
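Putting those last few steps together, a quick smoke test of the running container might look like this (the container ID will differ on your machine):
docker ps                                          # note the CONTAINER ID for lygado/docker-tomcat
curl http://192.168.99.100:8080/helloworld/hello   # should print the servlet's "Hello, World!" response
docker stop INSTANCE_ID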
Docker push
Once you have built and tested your Docker image, you can push that image to your DockerHub account with the following command:
docker push lygado/docker-tomcat
This makes your image available to the world (or your private DockerHub repository if you opt for privacy), as well as to any automated provisioning tools you will ultimately use to deploy containers to production.
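If you haven't already authenticated against DockerHub from the command line, you'll need to log in before the push is accepted; a typical session looks like this:
docker login                       # prompts for your DockerHub credentials
docker push lygado/docker-tomcat   # uploads the image layers to your repository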
Next we'll integrate Docker into our build process, so that it can produce a Docker image that includes our application.
Integrating Docker into your Maven build process
In the previous section we created a custom Dockerfile and deployed our WAR file to it. This meant copying the WAR file from our project's target directory to the docker/deploy directory and running Docker from the command line. It wasn't much work, but if you are actively developing and want to modify your code and test it immediately afterward, you might find this process tedious. Furthermore, if you want to run your build from a continuous integration (CI) server and output a runnable Docker image, then you are going to need to figure out how to integrate Docker with your CI tool.
So let's explore a more efficient process, where we build (and optionally run) a Docker image from Maven using a Maven Docker plug-in.
If you search the Internet you will find several Maven plug-ins that integrate with Docker, but for the purposes of this article I chose rhuss/docker-maven-plugin. Its author, Roland Huss, has written a pretty good comparison of several contenders; while it isn't an exhaustive comparison of all scenarios and plug-in providers, I encourage you to read it to help guide your decision about the best Maven Docker plug-in for your use case.
My use cases were:
- To be able to create a Tomcat-based Docker image that hosted my application.
- To be able to run it from my build for my own tests.
- To be able to integrate it with the pre- and post-integration-test Maven phases for automated tests.
The docker-maven-plugin accomplished all of these tasks and was easy to use and understand.
About the Maven Docker plugin
The plug-in itself is well documented, but as an overview it really consists of two main configuration components:
- Docker image build and run configurations in your pom.xml file.
- An assembly description of files to include in your image.
The build and run image configuration is defined in the plugins
section of your POM file, shown in Listing 8.
Listing 8. POM file build section for Docker Maven plug-in configuration
<build>
<finalName>helloworld</finalName>
<plugins>
<plugin>
<groupId>org.jolokia</groupId>
<artifactId>docker-maven-plugin</artifactId>
<version>0.13.4</version>
<configuration>
<dockerHost>tcp://192.168.99.100:2376</dockerHost>
<certPath>/Users/shaines/.docker/machine/machines/default</certPath>
<useColor>true</useColor>
<images>
<image>
<name>lygado/tomcat-with-my-app:0.1</name>
<alias>tomcat</alias>
<build>
<from>tomcat</from>
<assembly>
<mode>dir</mode>
<basedir>/usr/local/tomcat/webapps</basedir>
<descriptor>assembly.xml</descriptor>
</assembly>
</build>
<run>
<ports>
<port>8080:8080</port>
</ports>
</run>
</image>
</images>
</configuration>
</plugin>
</plugins>
</build>
As you can see, this configuration is pretty simple and consists of the following categories of elements:
Plug-in definition
The groupId
, artifactId
, and version
identify the plug-in to use.
Global configuration
The dockerHost and certPath elements define the location of your Docker host (the address reported when you start your Docker machine) and the location of your Docker certificates, respectively. The certificate path is also available in your DOCKER_CERT_PATH environment variable.
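If you are using Docker Machine, as this article assumes, you can print both values rather than hunting for them; for example, with the machine named default used in the certPath above:
docker-machine env default   # prints export lines for DOCKER_HOST and DOCKER_CERT_PATH, which map to dockerHost and certPath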
Image configuration
All images in your build are defined as image
child elements under your images
element. An image
element has image-specific configuration, as well as build
and run
configuration (explained below). The core image-specific element is the name
of the image you want to build. In our case this is my DockerHub username (lygado), the name of the image (tomcat-with-my-app
), and the version of the Docker image (0.1
). Note that you can use a Maven property for any of these values.
Image build configuration
When you build an image, as we did with the docker build command, you need a Dockerfile that defines how to build it. The Maven Docker plug-in allows you to use a Dockerfile, but in this example we are going to build the Docker image from a Dockerfile that is generated on the fly and held in memory. Therefore, we specify the parent image in the from element, which in this case is tomcat, and then we reference an assembly configuration.
The maven-assembly-plugin, provided by Maven, defines a common structure for aggregating a project's output with its dependencies, modules, site documentation, and other files into a single distributable archive, and the docker-maven-plugin
leverages this standard. In this example, we opt to use the dir
mode, which means that the files defined in the src/main/docker/assembly.xml
file should be copied directly to the basedir
on the Docker image. Other modes include Tar (tar
), GZipped Tar (tgz
), and Zipped (zip
). The basedir
specifies where the files should be placed in the Docker image, which in this case is Tomcat's webapps
directory.
Finally, the descriptor
tells the plug-in the name of the assembly descriptor, which will be located in the src/main/docker
directory. This is a very simplistic example, so I encourage you to read through the documentation. In particular, you will want to familiarize yourself with the entrypoint and cmd elements, which let you specify the command used to start the Docker image; the env element, which specifies environment variables; the runCmds element, which executes commands just as the RUN directive does in a regular Dockerfile; workdir, which changes the working directory; and volumes, if you want to mount external volumes. In short, this plug-in exposes everything you need to build an image, which means that all of the Dockerfile directives described earlier in this article are available through the plug-in.
Image run configuration
When you run a Docker image using the docker run
command, you may pass a collection of arguments to Docker. In this example we start our Docker instance with a command like: docker run -d -p 8080:8080 lygado/tomcat-with-my-app:0.1
, so essentially we only need to specify our port mapping.
The run
element allows us to specify all of our runtime arguments, so we specify that we should map port 8080 on our Docker container to port 8080 on our Docker host. Additionally, the run
section allows us to specify volumes to bind (using the volumes
element) and containers to link together (using the links
element). Because the docker:start
command is often used with integration tests, the run
section includes the wait
argument, which allows us to wait for a specified period of time, for a log entry, or for a URL to become available before continuing execution. This ensures that our image is running before we launch any integration tests.
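Driven from the command line, a simple cycle with these goals might look like the following sketch (in a real setup you would typically bind docker:start and docker:stop to the pre- and post-integration-test phases in the POM; the mvn verify step assumes you have integration tests configured, for example with the Failsafe plug-in):
mvn clean package docker:build docker:start   # build the WAR, build the image, and start the container
mvn verify                                     # run integration tests against the running container
mvn docker:stop                                # shut the container down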
Loading dependencies
The src/main/docker/assembly.xml
file defines the files (or file in this case) that we want to copy to the Docker image. It is shown in Listing 9.
Listing 9. assembly.xml
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2 http://maven.apache.org/xsd/assembly-1.1.2.xsd">
<dependencySets>
<dependencySet>
<includes>
<include>com.geekcap.vmturbo:hello-world-servlet-example</include>
</includes>
<outputDirectory>.</outputDirectory>
<outputFileNameMapping>helloworld.war</outputFileNameMapping>
</dependencySet>
</dependencySets>
</assembly>
In Listing 9 we see a dependency set that includes our hello-world-servlet-example artifact, and we see that we want to output it to the . directory. Recall that in our POM file we defined a basedir that points to Tomcat's webapps directory; the outputDirectory is relative to that base directory. In other words, we want to deploy the hello-world-servlet-example artifact, renamed to helloworld.war by the outputFileNameMapping, to Tomcat's webapps directory.
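A quick way to confirm that the assembly did what we expect is to build the image and list Tomcat's webapps directory inside it (a sanity check, not a required step):
mvn clean package docker:build
docker run --rm lygado/tomcat-with-my-app:0.1 ls /usr/local/tomcat/webapps   # should list helloworld.war alongside Tomcat's default applications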
The plug-in defines a set of Maven goals:
- docker:build: builds an image
- docker:start: starts a container
- docker:stop: stops a container
- docker:push: pushes an image to a repository, such as DockerHub
- docker:remove: deletes an image from the local Docker host
- docker:logs: shows a container's logs
Build the Docker image
You can access the source code for the example application from GitHub, then build it as follows:
mvn clean install
To build a Docker image you can execute the following command:
mvn clean package docker:build
Once the image is built, you can see it in Docker using the docker images
command:
$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
lygado/tomcat-with-my-app 0.1 1d49e6924d19 16 minutes ago 347.7 MB
You can run the container with the following command:
mvn docker:start
Now you should see it running with the docker ps
command. It should be accessible through the same URL defined above:
http://192.168.99.100:8080/helloworld/hello
Finally, you can stop the container with the following command:
mvn docker:stop
Conclusion
Docker is a container technology that virtualizes processes more than it does machines. It consists of a Docker client process that communicates with a Docker daemon running on a Docker host. On Linux, the Docker daemon runs directly as a process on the Linux operating system, whereas on Windows and Mac it starts a Linux virtual machine in VirtualBox and runs the Docker daemon there. Docker images contain a very lightweight operating system footprint, in addition to whatever libraries and binaries you need to run your application. Docker images are driven by a Dockerfile, which defines the instructions necessary to configure an image.
In this Open source Java projects tutorial I've introduced Docker fundamentals, reviewed the details of the Dockerfiles for CentOS, Java, and Tomcat, and shown you how to build a custom Dockerfile on top of the official Tomcat image. Finally, we integrated the docker-maven-plugin into our build process and built a Docker image from Maven. Being able to build and run a Docker container as part of the build makes testing easier, and it also allows us to build, from a CI server, a Docker image that is ready for production deployment.
The example application for this article was very simple, but the steps you've learned are applicable to more complex enterprise applications. Happy docking!
This story, "Redefine Java virtualization with Docker" was originally published by JavaWorld.