Welcome to the applied AI deep-learning team, and to our first project—Building a Common Deep Learning Environment! We're excited about the projects we've assembled in this book. The foundation of a common working environment will help us work together and learn very cool and powerful deep learning (DL) technologies, such as computer vision (CV) and natural language processing (NLP), that you will be able to use in your professional career as a data scientist.
The following topics will be covered in this chapter:
- Components in building a common DL environment
- Setting up a local DL environment
- Setting up a DL environment in the cloud
- Using the cloud to deploy DL applications
- Automating the setup process to reduce errors and get started quickly
Our main goal by the end of this chapter is to standardize our toolset so that we can work together and achieve consistently accurate results.
In the process of building applications using DL algorithms that can also scale for production, it's very important to have the right kind of setup, whether local or on the cloud, to make things work end to end. So, in this chapter, we will learn how to set up a DL environment that we will be using to run all the experiments and finally take the AI models into production.
The following is the list of required components that we need to build DL applications:
- Ubuntu 16.04 or greater
- Anaconda Package
- Python 2.x/3.x
- TensorFlow/Keras DL packages
- CUDA for GPU support
- Gunicorn for deployment at scale
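Before installing anything, it can help to see which of these components are already present on a machine. The following is a minimal sketch, not part of the book's repository: the function name `check_components` and the choice of executables to look for (`conda` for Anaconda, `nvcc` for the CUDA toolkit, `gunicorn` for Gunicorn) are our own assumptions, and it requires Python 3 because it relies on `shutil.which`.

```python
import platform
import shutil
import sys


def check_components(tools=("conda", "nvcc", "gunicorn")):
    """Report the OS, the Python version, and which tools are on PATH."""
    report = {
        "os": platform.system(),
        "python": "%d.%d" % sys.version_info[:2],
    }
    for tool in tools:
        # shutil.which returns None when the executable is not found
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, value in check_components().items():
        print("%s: %s" % (name, value))
```

A `False` entry only means that the executable is not on the current `PATH`; it may still be installed elsewhere on the system.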
We'll start by setting up your local DL environment. Much of the work that you'll do can be done on local machines. But with large datasets and complex model architectures, processing time slows down dramatically. This is why we are also setting up a DL environment in the cloud: on a local machine, these complex and repetitive calculations simply take too long to get things done efficiently.
We will work straight through the preceding list, and by the end (and with the help of a short automation script), you'll have everything set up!
Throughout this book, we will be using Ubuntu to run all the experiments, because there is great community support for Linux and almost any DL application can be set up easily on Linux. For any assistance on installation and setup related to Ubuntu, please refer to the tutorials at https://tutorials.ubuntu.com/. On top of that, this book will use the Anaconda package with Python 2.7+ to write, train, and test our code. Anaconda comes with a huge list of pre-installed Python packages, such as numpy, pandas, sklearn, and so on, which are commonly used in all kinds of data science projects.
Anaconda is a generic bundle that contains IPython Notebook, an editor, and many preinstalled Python libraries, which saves a lot of setup time. With Anaconda, we can start solving the data science problem right away instead of configuring the environment.
That said, you can use the default Python if you prefer; the choice is entirely yours, and at the end of this chapter we will learn how to configure a Python environment using a script.
Anaconda is a very popular data science platform for people using Python to build machine learning and DL models, and deployable applications. The Anaconda marketing team put it best on their What is Anaconda? page, available at https://www.anaconda.com/what-is-anaconda/. To install Anaconda, perform the following steps:
- Click Anaconda on the menu, then click Downloads to go to the download page at https://www.anaconda.com/download/#linux
- Choose the download suitable for your platform (Linux, OS X, or Windows):
  - Choose the Python 3.6 version
  - Choose the Graphical Installer
- Follow the instructions on the wizard, and in 10 to 20 minutes, your Anaconda environment (Python) setup will be ready
Once the installation process is complete, you can use the following command to check the Python version in your Terminal:
python --version
You should see the following output:
Python 3.6 :: Anaconda,Inc.
If the command does not work, or returns an error, please check the documentation for help for your platform.
Now, let's install the Python libraries used for DL, specifically, TensorFlow and Keras.
TensorFlow is a Python library developed and maintained by Google. You can implement many powerful machine learning and DL architectures in custom models and applications using TensorFlow. To find out more, visit https://www.tensorflow.org/.
Install the TensorFlow DL library (for all operating systems except Windows) by typing the following command:
conda install -c conda-forge tensorflow
Alternatively, you may choose to install using pip and a specific version of TensorFlow for your platform, using the following command:
pip install tensorflow==1.6
You can find the installation instructions for TensorFlow at https://www.tensorflow.org/get_started/os_setup#anaconda_installation.
Now we will install Keras using the following command:
pip install keras
To validate the environment and the version of the packages, let's write the following script, which will print the version numbers of each library:
# Import the tensorflow library
import tensorflow
# Import the keras library
import keras
print('tensorflow: %s' % tensorflow.__version__)
print('keras: %s' % keras.__version__)
Save the script as dl_versions.py. Run the script by typing the following command:
python dl_versions.py
You should see the following output:
Using TensorFlow backend.
Voila! Now our Python development environment is ready for us to write some awesome DL applications on our local machine.
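The version script above stops with a traceback if either library is missing. When checking several machines, a more tolerant variant can be handy. The following is a minimal sketch under our own assumptions: the helper name `installed_version` is hypothetical, and it only reports what `__version__` says without validating compatibility between the two libraries.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not define __version__; report that explicitly
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in ("tensorflow", "keras"):
        version = installed_version(name)
        print("%s: %s" % (name, version if version else "not installed"))
```

Because a missing package yields `None` rather than an exception, the same loop can be extended with any other libraries the team agrees to standardize on.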
All the steps we performed up to now remain the same for the cloud as well, but there are a few additional modules required to configure the cloud virtual machines to make your DL applications servable and scalable. So, before setting up your server, follow the instructions from the preceding section.
To deploy your DL applications in the cloud, you will need a server good enough to train your models and serve at the same time. With huge development in the sphere of DL, the need for cloud servers to practice and deploy projects has increased drastically, and so have the options on the market. The following is a list of some of the best options on offer:
- Paperspace (https://www.paperspace.com/)
- FloydHub (https://www.floydhub.com)
- Amazon Web Services (https://aws.amazon.com/)
- Google Cloud Platform (https://cloud.google.com/)
- DigitalOcean (https://cloud.digitalocean.com/)
All of these options have their own pros and cons, and the final choice depends entirely on your use case and preferences, so feel free to explore more. In this book, we will build and deploy our models mostly on Google Compute Engine (GCE), which is a part of Google Cloud Platform (GCP). Follow the steps mentioned in this chapter to spin up a VM server and get started.
The main idea behind this book is to empower you to build and deploy DL applications. In this section, we will discuss some critical components required to make your applications accessible to millions of users.
The best way to make your application accessible is to expose it as a web service, using REST or SOAP APIs. To do so, we have many Python web frameworks to choose from, such as web.py, Flask, Bottle, and many more. These frameworks allow us to easily build web services and deploy them.
You will need a Google Cloud (https://cloud.google.com/) account. At the time of writing, Google is promoting its platform by offering $300 of credit and 12 months as a free-tier user.
Follow these instructions to set up your GCP:
Creating a new project: Click on the three dots, as shown in the following screenshot, and then click on the + sign to create a new project:
Spinning up a VM instance: Click on the three lines in the upper-left corner of the screen, select the compute option, and click on Compute Engine. Now choose Create new instance. Name the VM instance, and select your zone as us-west2-b. Choose the machine type size.
Choose Ubuntu 16.04 LTS as your boot disk. In the firewall options, check both Allow HTTP traffic and Allow HTTPS traffic (this is important for making the VM accessible from the outside world). To add a GPU, click on the Customize button and find the GPU options; you can choose between two NVIDIA GPUs.
Now click on Create. Boom! Your new VM is getting ready.
Modify the firewall settings: Now click on the Firewall rules setting under Networking. Under Protocols and Ports, we need to specify the port that we will use to expose our APIs. We have chosen tcp:8080 as our port number. Now click on the Save button. This will add a rule to the firewall of your VM so that the applications can be accessed from the external world.
Boot your VM: Now start your VM instance. When you see the green tick, click on SSH. This will open a command window, and you are now inside the VM. You can also use the gcloud CLI to log in and access your VMs.
Then follow the same steps as we performed to set up the local environment, or read further to learn how to create an automation script that will perform all the setup automatically.
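Once a service is listening on the VM, it is useful to confirm from your own machine that the firewall rule for port 8080 actually lets traffic through. The following is a minimal sketch using only the standard library; the function name `port_is_open` and the timeout value are our own choices, not part of GCP or the book's code.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

For example, `port_is_open("<external IP of your VM>", 8080)` should return `True` only when both the firewall rule is in place and something on the VM is listening on that port, so a `False` result does not by itself tell you which of the two is missing.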
Now we need a web framework to write our DL applications as web services—again, there are lots of options, but to make it simple, we will be using a combination of web.py and Gunicorn.
Let's install them using the following commands:
pip install web.py
pip install gunicorn
Now we are ready to deploy our DL solution as a web service, and scale it to production level.
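To make the idea of a web service concrete, here is a minimal sketch of the kind of callable that Gunicorn serves. Note the substitution: instead of web.py, it uses a bare WSGI function so that it runs with the standard library alone. The function name `app`, the JSON payload, and the file name `service.py` mentioned below are all hypothetical; a real service would run model inference where the placeholder comment sits.

```python
import json


def app(environ, start_response):
    """Minimal WSGI callable: answer every request with a small JSON body."""
    # A real service would load a trained model once at import time and
    # run inference here; this placeholder only echoes the request path.
    payload = {"status": "ok", "path": environ.get("PATH_INFO", "/")}
    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Assuming the code is saved as `service.py`, it could be served on the port we opened earlier with `gunicorn --bind 0.0.0.0:8080 service:app`. Gunicorn works with any WSGI callable, so an application written with a WSGI framework can be launched in the same way.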
Installing Python packages and DL libraries can be a tedious process, requiring lots of time and repetitive effort. So, to ease the job, we will create a bash script that can be used to install everything using a single command.
The following is a list of components that will get installed and configured:
- Java 8
- Bazel for building
- Python and associated dependencies
- Dependencies for all of the aforementioned services (see the script for exact details)
You can simply download the automation script to your server or locally, execute it, and you're done. Here are the steps to follow:
- Save the script to your home directory, by cloning the code from the repository:
git clone https://github.com/PacktPublishing/Python-Deep-Learning-Projects
- Once you have the copy of the complete repository, move to the Chapter01 folder, which will contain a script file named setupDeepLearning.sh. This is the script that we will execute to start the setup process, but, before execution, we will have to make it executable using the chmod command:
chmod +x setupDeepLearning.sh
- Once this is done, we are ready to execute it as follows:
./setupDeepLearning.sh
Follow any instructions that appear (basically, say yes to everything and accept Java's license). It should take about 10 to 15 minutes to install everything. Once it has finished, you will see the list of Python packages being installed, as shown in the following screenshot:
There are a couple of other options, too, such as getting Docker images from TensorFlow and other DL packages, which can set up fully functional DL machines for large-scale and production-ready environments. You can find out more about Docker at https://www.docker.com/what-docker. Also, for a quick-start guide, follow the instructions on this repository for an all-in-one Docker image for DL at https://github.com/floydhub/dl-docker.
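The single-command idea behind the automation script can also be sketched in Python, which may be convenient if you prefer to keep the setup logic next to your project code. This is not a port of setupDeepLearning.sh: the function name `run_steps` and the `STEPS` list are hypothetical and cover only the pip installs from this chapter, not Java, Bazel, or CUDA.

```python
import subprocess
import sys


def run_steps(steps, dry_run=False):
    """Run each command in order; stop at the first one that fails.

    Returns the list of commands that completed (or would run, in dry-run mode).
    """
    done = []
    for command in steps:
        print("-> " + " ".join(command))
        if not dry_run:
            # check_call raises CalledProcessError on a non-zero exit code
            subprocess.check_call(command)
        done.append(command)
    return done


# Hypothetical step list mirroring the pip installs from this chapter.
STEPS = [
    [sys.executable, "-m", "pip", "install", "tensorflow"],
    [sys.executable, "-m", "pip", "install", "keras"],
    [sys.executable, "-m", "pip", "install", "web.py", "gunicorn"],
]

if __name__ == "__main__":
    run_steps(STEPS, dry_run=True)  # set dry_run=False to actually install
```

Running pip through `sys.executable -m pip` installs into the same interpreter that runs the script, which avoids surprises when several Python installations coexist on one machine.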
In this chapter, we worked to get the team set up in a common environment with a standardized toolset. We are looking to deploy our project applications with Gunicorn and to use CUDA for GPU support. Those projects will rely on highly advanced and effective DL libraries, such as TensorFlow and Keras running in Python 2.x/3.x. We'll write our code using the resources in the Anaconda package, and all of this will be running on Ubuntu 16.04 or greater.
Now we are all set to perform experiments and deploy our DL models in production!