Machine learning for Java developers, Part 2: Deploy your model

How to build and deploy a machine learning data model in a Java-based production environment

Listing 7. Bash script to build a RESTful machine learning data pipeline


mkdir build
cd build

echo task 1: copying framework-rest source to local dir
git clone --quiet -b
cd ml_deploy/module-pipeline-rest

echo task 2: download trained pipeline to pipeline-rest/src/main/resources dir
mkdir -p src/main/resources
curl -s -L $pipeline_instance_uri --output src/main/resources/$pipeline_instance
echo "filename: $pipeline_instance" > src/main/resources/application.yml

echo task 3: adding the pipeline artefact id to framework-rest pom.xml file
new_pom=${pom/"<!-- PLACEHOLDER -->"/$additional_dependency}
echo "$new_pom" > pom.xml

echo task 4: build rest server jar including the specific pipeline artifacts
mvn -q clean install package

echo task 5: copying the newly created jar file into the root of the build dir
cp target/pipeline-rest-1.0.3.jar ../../$server_jar
cd ../../..

echo task 6: build docker image
docker build --build-arg arg_server_jar=$server_jar -t $groupId"/"$artifactId":"$version"-"$timestamp .

rm -rf build
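
Task 3 of Listing 7 relies on Bash pattern substitution, `${variable/pattern/replacement}`, to inject the pipeline dependency into the POM. The following is a minimal, self-contained sketch of that step. The POM fragment and the dependency coordinates are invented for illustration; in the real script, `pom` and `additional_dependency` are assumed to have been set earlier.

```bash
#!/usr/bin/env bash
# Hypothetical POM fragment containing the placeholder comment.
pom='<dependencies>
  <!-- PLACEHOLDER -->
</dependencies>'

# Hypothetical dependency entry for the trained pipeline artifact.
additional_dependency='<dependency>
    <groupId>com.example</groupId>
    <artifactId>pipeline-estimate-houseprice</artifactId>
    <version>1.0.3</version>
  </dependency>'

# Replace the first occurrence of the placeholder with the dependency entry.
new_pom=${pom/"<!-- PLACEHOLDER -->"/$additional_dependency}

# Quote the variable so that line breaks and indentation are preserved.
printf '%s\n' "$new_pom"
```

Quoting `"$new_pom"` when writing it out matters: an unquoted expansion is subject to word splitting and filename globbing, which would collapse the file onto one line and could expand any `*` it contains.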

Machine learning with Docker containers

Although the newly created executable server JAR is a deployable and runnable artifact, DevOps engineers and system administrators often prefer Docker containers over executable JARs. Essentially, a Docker container can be seen as a customized software stack, including a virtual operating system, running on top of a host operating system. This allows you to package an application with all of its required parts, system components, and configurations. In contrast to traditional virtual machine solutions, a Docker container shares the kernel of the host system it runs on, which reduces the overhead of virtualization.

Figure 9. Running the Docker container image (credit: Gregor Roth)

For instance, you could create a Docker container image that includes a slim Debian Linux distribution, the newest OpenJDK runtime, and your executable server JAR. In contrast to a JAR-based deployment, Docker makes it easy to include customized configuration, such as specific JVM garbage collector settings or custom certificates, as part of your deployment unit. Instead of delivering an executable JAR along with a long list of installation prerequisites, you provide a self-contained Docker container, with nothing else to install.

To assemble a new Docker container image, you define a Dockerfile: a collection of instructions that tell Docker how to build your image. In the example below, a new Docker image is built on top of an OpenJDK base image (the slim-buster variant), which includes the Debian Linux distribution and OpenJDK 13. With the exception of the last instruction, all instructions are executed at image build time. Essentially, the Dockerfile copies the server JAR file located in the build directory into the container's file system. When the container is started, the last instruction runs the Java-based REST service.

Listing 8. Dockerfile to build the machine learning data pipeline

FROM openjdk:13-jdk-slim-buster

# build time params (provided by 'docker build --build-arg arg_server_jar=server-pipeline-estimate-houseprice-1.0.3-1568611516.jar')
ARG arg_server_jar

# copy the executable server jar file into the docker container
COPY build/$arg_server_jar /etc/restserver/$arg_server_jar

# copy the build time param to a runtime param (required for runtime command CMD below)
ENV server_jar=$arg_server_jar

# default command, which will be executed on runtime by performing 'docker run'
CMD java -jar /etc/restserver/$server_jar

When you run the image build, Docker reads the Dockerfile in the local directory. In the example below, a Docker container image is created and tagged with a unique Docker identifier. By default, the newly created Docker image is stored in your local Docker environment.

docker build --build-arg arg_server_jar=server-pipeline-estimate-houseprice-1.0.3-1568611516.jar -t

The docker run command, shown below, loads the image and starts the container.

docker run -p 9090:8080

In most cases, additional parameters such as -p will be set. The -p parameter makes port 8080 of the Java server inside the Docker container available to services outside of the container. In this example, port 9090 of the host system is mapped to port 8080 of the Java server inside the container.

Additional parameters limit the resource consumption of the Docker container. For instance, the -m parameter limits the amount of memory the container can use. Typically, such resource-limiting parameters are used to implement the Bulkhead stability pattern, which helps to protect systems against cascading errors. For instance, a buggy Java server inside the container may start to consume more and more memory and CPU power. If its consumption is limited by Docker's resource parameters, other containers running on the same host will not be starved of memory or CPU time.
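
As a rough sketch of such limits, the command below caps the container at 512 MB of memory and 1.5 CPUs. The image name and the limit values are placeholders chosen for illustration, not values from this tutorial. The command is assembled in an array and only printed, so that it can be reviewed before it is run against a real image.

```bash
#!/usr/bin/env bash
# Hypothetical image name; substitute the tag produced by your own build.
image="example/pipeline-rest:latest"

# -p      maps host port 9090 to container port 8080 (as in the example above)
# -m      hard memory limit for the container
# --cpus  maximum number of CPUs the container may use
run_cmd=(docker run -p 9090:8080 -m 512m --cpus 1.5 "$image")

# Print the command for review; execute it with "${run_cmd[@]}" once verified.
printf '%s\n' "${run_cmd[*]}"
```

A container that exceeds its memory limit is typically terminated by the kernel rather than being allowed to take memory from its neighbors, which is the isolation that the Bulkhead pattern aims for.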


This tutorial has introduced a generalized process for training, deploying, and processing a machine learning model in a production environment. In practice, numerous requirements and conditions will weigh on the approach you use to put a machine learning model into production. Depending on your business requirements, machine learning models may have to be executed using a real-time solution such as a streams-based architecture, or a batch-oriented architecture that prioritizes throughput for heavy data loads. Additional factors to consider are the communication patterns, which may favor a database/filesystem-based pipeline API, a streams-based pipeline API, or a REST-based pipeline API. The whole pipeline may be packaged as a single deployment unit, or parts of the preprocessing components may be packaged as dedicated deployment units. Furthermore, the pipeline may be deployed as a self-contained Docker container, or you could use a central model repository with serving nodes that load and process models dynamically, as TensorFlow Serving does.

In contrast to traditional software development, all of these approaches require that you handle an additional dimension of complexity. In traditional programming, you hardcode the behavior of the program. In a machine learning pipeline, you also write code, but the code you write will be trained and adjusted based on production data, which adapts the behavior of the program. In contrast to traditional programming, the unit of deployment is a trained, frozen instance, which makes deployment and software maintenance more complex. The key to handling this additional dimension of complexity is to make things reproducible. With comprehensive version and release management, you will be able to re-train and re-deploy a pipeline instance such that, given the same raw data as input, it returns exactly the same output. This gives you the ability to deploy and run your machine learning pipelines in production environments in a controllable, transparent, and maintainable manner.
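
One simple way to check reproducibility, sketched here under assumptions, is to compare checksums of the output produced by two runs on the same input. The `run_pipeline` function below is a hypothetical stand-in for invoking a deployed pipeline instance and is not part of the tutorial's code; the sample record is invented as well. The sketch assumes that the `sha256sum` utility from GNU coreutils is available.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a deployed pipeline instance: reads raw records
# from stdin and writes its result to stdout. Here it only upper-cases text.
run_pipeline() {
  tr 'a-z' 'A-Z'
}

# Invented sample input record.
raw_data='rooms=3,area=120'

# Run the same input through the pipeline twice and fingerprint each output.
first=$(printf '%s\n' "$raw_data" | run_pipeline | sha256sum | cut -d' ' -f1)
second=$(printf '%s\n' "$raw_data" | run_pipeline | sha256sum | cut -d' ' -f1)

# Identical fingerprints indicate that both runs produced identical output.
if [ "$first" = "$second" ]; then
  result="reproducible"
else
  result="outputs differ"
fi
echo "$result"
```

The same comparison could be applied between a pipeline instance in production and one that was re-trained and re-deployed from the same versioned code and data.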

This story, "Machine learning for Java developers, Part 2: Deploy your model" was originally published by JavaWorld.

Copyright © 2019 IDG Communications, Inc.
