Building and testing the custom Python algorithm container image
In this recipe, we will prepare a Dockerfile for the custom Python container image. We will make use of the train and serve scripts that we prepared in the previous recipes. After that, we will run the docker build command to prepare the image before pushing it to an Amazon ECR repository.
Tip
Wait! What's a Dockerfile? It's a text document containing the directives (commands) used to prepare and build a container image. This container image then serves as the blueprint when running containers. Feel free to check out https://docs.docker.com/engine/reference/builder/ for more information on Dockerfiles.
Getting ready
Make sure you have completed the Preparing and testing the serve script in Python recipe.
How to do it…
The initial steps in this recipe focus on preparing a Dockerfile. Let's get started:
- Double-click the Dockerfile file in the file tree to open it in the Editor pane. Make sure that this is the same Dockerfile that's inside the ml-python directory:
Figure 2.55 – Opening the Dockerfile inside the ml-python directory
Here, we can see a Dockerfile inside the ml-python directory. Remember that we created an empty Dockerfile in the Setting up the Python and R experimentation environments recipe. Clicking it in the file tree should open an empty file in the Editor pane:
Figure 2.56 – Empty Dockerfile in the Editor pane
Here, we have an empty Dockerfile. In the next step, we will update this by adding three lines of code.
- Update Dockerfile with the following block of configuration code:
    FROM arvslat/amazon-sagemaker-cookbook-python-base:1
    COPY train /usr/local/bin/train
    COPY serve /usr/local/bin/serve
Here, we are building on top of an existing image called amazon-sagemaker-cookbook-python-base. This image already has a few prerequisites installed, including the Flask, pandas, and scikit-learn libraries, so you won't have to worry about getting the installation steps working properly in this recipe. For more details on this image, check out https://hub.docker.com/r/arvslat/amazon-sagemaker-cookbook-python-base:
Figure 2.57 – Docker Hub page for the base image
Here, we can see the Docker Hub page for the amazon-sagemaker-cookbook-python-base image.
Tip
You can access a working copy of this Dockerfile in the Machine Learning with Amazon SageMaker Cookbook GitHub repository: https://github.com/PacktPublishing/Machine-Learning-with-Amazon-SageMaker-Cookbook/blob/master/Chapter02/ml-python/serve.
With the Dockerfile ready, we will proceed with using the Terminal until the end of this recipe:
- You can use a new Terminal tab or an existing one to run the next set of commands:

Figure 2.58 – New Terminal
Here, we can see how to create a new Terminal. Note that the Terminal pane is under the Editor pane in the AWS Cloud9 IDE.
- Navigate to the ml-python directory containing our Dockerfile:
    cd /home/ubuntu/environment/opt/ml-python
- Specify the image name and the tag number:
    IMAGE_NAME=chap02_python
    TAG=1
- Build the Docker container image using the docker build command:
    docker build --no-cache -t $IMAGE_NAME:$TAG .
The docker build command makes use of what is written inside our Dockerfile. We start with the image specified in the FROM directive and then copy the train and serve files into the container image.
- Use the docker run command to test if the train script works:
    docker run --name pytrain --rm -v /opt/ml:/opt/ml $IMAGE_NAME:$TAG train
Let's quickly discuss some of the options that were used in this command. The --rm flag makes Docker clean up the container after the container exits. The -v flag allows us to mount the /opt/ml directory from the host system to the /opt/ml directory of the container:
Figure 2.59 – Result of the docker run command (train)
Here, we can see the results after running the docker run command. It should show logs similar to what we had in the Preparing and testing the train script in Python recipe.
- Use the docker run command to test if the serve script works:
    docker run --name pyserve --rm -v /opt/ml:/opt/ml $IMAGE_NAME:$TAG serve
After running this command, the Flask API server starts successfully. We should see logs similar to what we had in the Preparing and testing the serve script in Python recipe:

Figure 2.60 – Result of the docker run command (serve)
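Here, we can see that the API is running on port 8080. The base image includes EXPOSE 8080, which documents that the container listens on this port. Since we did not publish the port with -p, we will reach the API through the container's IP address on the bridge network.
- Open a new Terminal tab: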

Figure 2.61 – New Terminal
As the API is running already in the first Terminal, we have created a new one.
- In the new Terminal tab, run the following commands to get the IP address of the running Flask app:
    SERVE_IP=$(docker network inspect bridge | jq -r ".[0].Containers[].IPv4Address" | awk -F/ '{print $1}')
    echo $SERVE_IP
We should get an IP address that's equal or similar to 172.17.0.2. Of course, we may get a different IP address value.
- Next, test the ping endpoint URL using the curl command:
    curl http://$SERVE_IP:8080/ping
We should get an OK after running this command.
- Finally, test the invocations endpoint URL using the curl command:
    curl -d "1" -X POST http://$SERVE_IP:8080/invocations
We should get a value similar or close to 881.3428400857507 after invoking the invocations endpoint.
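The two curl checks above can also be scripted. The following is a minimal sketch, not part of the recipe's own code: the check_endpoints function name and its parameters are hypothetical, and it assumes the serve container is reachable at the address stored in SERVE_IP.

```python
import urllib.request


def check_endpoints(host, port=8080, payload=b"1", timeout=5):
    """Call /ping and /invocations and return (ping_status, prediction_text).

    check_endpoints is a hypothetical helper written for illustration.
    """
    base_url = f"http://{host}:{port}"

    # GET /ping: a healthy serve script is expected to answer with HTTP 200.
    with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as response:
        ping_status = response.status

    # POST /invocations with the same request body that curl -d "1" sends.
    request = urllib.request.Request(
        f"{base_url}/invocations", data=payload, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        prediction = response.read().decode("utf-8")

    return ping_status, prediction
```

With the container still running, calling check_endpoints("172.17.0.2") (or whichever address SERVE_IP holds) should return 200 together with the prediction text. A non-2xx response or a connection failure raises an exception from urllib, which makes the check easy to use in an automated test.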
At this point, it is safe to say that the custom container image we have prepared in this recipe is ready. Now, let's see how this works!
How it works…
In this recipe, we built a custom container image using the Dockerfile configuration we specified. When you have a Dockerfile, the standard set of steps would be to use the docker build command to build the Docker image, authenticate with ECR to gain the necessary permissions, use the docker tag command to tag the image appropriately, and use the docker push command to push the Docker image to the ECR repository.
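To make the authenticate, tag, and push steps more concrete, here is a rough sketch that only assembles the commands as strings. The account ID 123456789012, the us-east-1 region, and the repository name are placeholders chosen for illustration, and both function names are hypothetical.

```python
def ecr_image_uri(account_id, region, repository, tag):
    """Return the image URI for a private Amazon ECR repository."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


def push_commands(local_image, account_id, region, repository, tag):
    """Return shell commands for the authenticate, tag, and push steps."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    remote_image = ecr_image_uri(account_id, region, repository, tag)
    return [
        # Authenticate the Docker client with the ECR registry.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        # Give the local image a name that points at the ECR repository.
        f"docker tag {local_image} {remote_image}",
        # Upload the image.
        f"docker push {remote_image}",
    ]


# Placeholder account ID and region; replace them with your own values.
for command in push_commands(
    "chap02_python:1", "123456789012", "us-east-1", "chap02_python", "1"
):
    print(command)
```

The sketch prints the commands instead of running them, so it can be reviewed safely before anything is pushed.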
Let's discuss what we have inside our Dockerfile. If this is your first time hearing about Dockerfiles, they are simply text files containing commands to build the image. In our Dockerfile, we did the following:
- We used arvslat/amazon-sagemaker-cookbook-python-base as the base image. Check out https://hub.docker.com/r/arvslat/amazon-sagemaker-cookbook-python-base for more details about this image.
- We copied the train and serve scripts to the /usr/local/bin directory inside the container image. These scripts are executed when we use docker run.
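As a small illustration of how train and serve end up being executed, the sketch below rebuilds the two docker run commands from this recipe as argument lists. The docker_run_command helper is hypothetical, and the sketch assumes that the scripts are found inside the container because /usr/local/bin is on the PATH.

```python
import shlex


def docker_run_command(image, script, container_name, host_dir="/opt/ml"):
    """Build the docker run argument list used to test one script."""
    return [
        "docker", "run",
        "--name", container_name,
        "--rm",                       # remove the container once it exits
        "-v", f"{host_dir}:/opt/ml",  # mount the host directory at /opt/ml
        image,
        script,                       # command to run inside the container
    ]


for script, name in [("train", "pytrain"), ("serve", "pyserve")]:
    print(shlex.join(docker_run_command("chap02_python:1", script, name)))
```

The last element of the list is the command that Docker runs inside the container, which is why passing train or serve is enough to start the corresponding script.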
Using the arvslat/amazon-sagemaker-cookbook-python-base image as the base image allowed us to write a shorter Dockerfile that focuses only on copying the train and serve files into the container image. The base image already has the flask, pandas, scikit-learn, and joblib packages, along with their prerequisites, pre-installed, so we will not run into installation issues when building the custom container image. Here is a quick look at the Dockerfile that was used to build this base image:
    FROM ubuntu:18.04
    RUN apt-get -y update
    RUN apt-get install -y python3.6
    RUN apt-get install -y --no-install-recommends python3-pip
    RUN apt-get install -y python3-setuptools
    RUN ln -s /usr/bin/python3 /usr/bin/python && \
        ln -s /usr/bin/pip3 /usr/bin/pip
    RUN pip install flask
    RUN pip install pandas
    RUN pip install scikit-learn
    RUN pip install joblib
    WORKDIR /usr/local/bin
    EXPOSE 8080
In this Dockerfile, we can see that we are using ubuntu:18.04 as the base image. Note that we can use other base images as well, depending on the libraries and frameworks we want installed in the container image.
Once we have the container image built, the next step will be to test if the train and serve scripts will work inside the container once we use docker run. Getting the IP address of the running container may be the trickiest part, as shown in the following block of code:
SERVE_IP=$(docker network inspect bridge | jq -r ".[0].Containers[].IPv4Address" | awk -F/ '{print $1}')
We can divide this into the following parts:
- docker network inspect bridge: This provides detailed information about the bridge network in JSON format. It should return an output with a structure similar to the following JSON value:
    [
      {
        ...
        "Containers": {
          "1b6cf4a4b8fc5ea5...": {
            "Name": "pyserve",
            "EndpointID": "ecc78fb63c1ad32f0...",
            "MacAddress": "02:42:ac:11:00:02",
            "IPv4Address": "172.17.0.2/16",
            "IPv6Address": ""
          }
        },
        ...
      }
    ]
- jq -r ".[0].Containers[].IPv4Address": This parses through the JSON response value from docker network inspect bridge. Piping this after the first command would yield an output similar to 172.17.0.2/16.
- awk -F/ '{print $1}': This splits the result from the jq command using the / separator and returns the value before /. After getting the AA.BB.CC.DD/16 value from the previous command, we get AA.BB.CC.DD after using the awk command.
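Note that the jq filter prints the address of every container attached to the bridge network, so SERVE_IP will hold several values if other containers are running. As a hedged alternative, the following sketch selects the address by container name. The container_ip function is hypothetical, and the sample JSON is a trimmed, made-up stand-in for real docker network inspect bridge output.

```python
import ipaddress
import json


def container_ip(inspect_output, container_name):
    """Return the IPv4 address of one named container on the network."""
    networks = json.loads(inspect_output)
    containers = networks[0].get("Containers", {})
    for details in containers.values():
        if details.get("Name") == container_name:
            # "172.17.0.2/16" -> "172.17.0.2": drop the prefix length.
            return str(ipaddress.ip_interface(details["IPv4Address"]).ip)
    raise LookupError(f"no container named {container_name!r} on this network")


# Made-up sample with two containers; the keys stand in for container IDs.
sample = """
[{"Containers": {
    "aaa": {"Name": "pyserve", "IPv4Address": "172.17.0.2/16"},
    "bbb": {"Name": "other", "IPv4Address": "172.17.0.3/16"}
}}]
"""
print(container_ip(sample, "pyserve"))  # 172.17.0.2
```

To use it against a live setup, pass in the text printed by docker network inspect bridge, for example by capturing it with subprocess.run. The lookup relies on the --name pyserve option we passed to docker run.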
Once we have the IP address of the running container, we can call the /ping and /invocations endpoints, similar to how we did in the Preparing and testing the serve script in Python recipe.
In the next recipes in this chapter, we will use this custom container image when we do training and deployment with the SageMaker Python SDK.