You're reading from Practical Convolutional Neural Networks

Product typeBook

Published inFeb 2018

Reading LevelIntermediate

PublisherPackt

ISBN-139781788392303

Edition1st Edition

Languages

Python

Tools

Keras TensorFlow

Concepts

Deep Learning

Authors (3):

Mohit Sewak

Md. Rezaul Karim

Pradeep Pujari

View More author details

Handwritten number recognition with Keras and MNIST

A typical neural network for a digit recognizer may have 784 input pixels connected to 1,000 neurons in the hidden layer, which in turn connects to 10 output targets — one for each digit. Each layer is fully connected to the layer above. A graphical representation of this network is shown as follows, where x are the inputs, h are the hidden neurons, and y are the output class variables:

In this notebook, we will build a neural network that will recognize handwritten numbers from 0-9.

The type of neural network that we are building is used in a number of real-world applications, such as recognizing phone numbers and sorting postal mail by address. To build this network, we will use the MNIST dataset.

We will begin as shown in the following code by importing all the required modules, after which the data will be loaded, and then finally building the network:

# Import Numpy, keras and MNIST data
import numpy as np
import matplotlib.pyplot as plt

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.utils import np_utils

Retrieving training and test data

The MNIST dataset already comprises both training and test data. There are 60,000 data points of training data and 10,000 points of test data. If you do not have the data file locally at the '~/.keras/datasets/' + path, it can be downloaded at this location.

Each MNIST data point has:

An image of a handwritten digit
A corresponding label that is a number from 0-9 to help identify the image

The images will be called, and will be the input to our neural network, X; their corresponding labels are y.

We want our labels as one-hot vectors. One-hot vectors are vectors of many zeros and one. It's easiest to see this in an example. The number 0 is represented as [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], and 4 is represented as [0, 0, 0, 0, 1, 0, 0, 0, 0, 0] as a one-hot vector.

Flattened data

We will use flattened data in this example, or a representation of MNIST images in one dimension rather than two can also be used. Thus, each 28 x 28 pixels number image will be represented as a 784 pixel 1 dimensional array.

By flattening the data, information about the 2D structure of the image is thrown; however, our data is simplified. With the help of this, all our training data can be contained in one array of shape (60,000, 784), wherein the first dimension represents the number of training images and the second depicts the number of pixels in each image. This kind of data is easy to analyze using a simple neural network, as follows:

# Retrieving the training and test data
(X_train, y_train), (X_test, y_test) = mnist.load_data()


print('X_train shape:', X_train.shape)
print('X_test shape: ', X_test.shape)
print('y_train shape:',y_train.shape)
print('y_test shape: ', y_test.shape)

Visualizing the training data

The following function will help you visualize the MNIST data. By passing in the index of a training example, the show_digit function will display that training image along with its corresponding label in the title:

# Visualize the data
import matplotlib.pyplot as plt
%matplotlib inline

#Displaying a training image by its index in the MNIST set
def display_digit(index):
    label = y_train[index].argmax(axis=0)
    image = X_train[index]
    plt.title('Training data, index: %d,  Label: %d' % (index, label))
    plt.imshow(image, cmap='gray_r')
    plt.show()
    
# Displaying the first (index 0) training image
display_digit(0)
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print("Train the matrix shape", X_train.shape)
print("Test the matrix shape", X_test.shape)


#One Hot encoding of labels.
from keras.utils.np_utils import to_categorical
print(y_train.shape)
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
print(y_train.shape)

Building the network

For this example, you'll define the following:

The input layer, which you should expect for each piece of MNIST data, as it tells the network the number of inputs
Hidden layers, as they recognize patterns in data and also connect the input layer to the output layer
The output layer, as it defines how the network learns and gives a label as the output for a given image, as follows:

# Defining the neural network
def build_model():
    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    model.add(Activation('relu')) # An "activation" is just a non-linear function that is applied to the output
 # of the above layer. In this case, with a "rectified linear unit",
 # we perform clamping on all values below 0 to 0.
                           
    model.add(Dropout(0.2))   #With the help of Dropout helps we can protect the model from memorizing or "overfitting" the training data
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.2))
    model.add(Dense(10))
    model.add(Activation('softmax')) # This special "softmax" activation,
    #It also ensures that the output is a valid probability distribution,
    #Meaning that values obtained are all non-negative and sum up to 1.
    return model

#Building the model
model = build_model()
model.compile(optimizer='rmsprop',
          loss='categorical_crossentropy',
          metrics=['accuracy'])

Training the network

Now that we've constructed the network, we feed it with data and train it, as follows:

# Training
model.fit(X_train, y_train, batch_size=128, nb_epoch=4, verbose=1,validation_data=(X_test, y_test))

Testing

After you're satisfied with the training output and accuracy, you can run the network on the test dataset to measure its performance!

Keep in mind to perform this only after you've completed the training and are satisfied with the results.

A good result will obtain an accuracy higher than 95%. Some simple models have been known to achieve even up to 99.7% accuracy! We can test the model, as shown here:

# Comparing the labels predicted by our model with the actual labels

score = model.evaluate(X_test, y_test, batch_size=32, verbose=1,sample_weight=None)
# Printing the result
print('Test score:', score[0])
print('Test accuracy:', score[1])

You have been reading a chapter from

Practical Convolutional Neural Networks

Published in: Feb 2018Publisher: PacktISBN-13: 9781788392303

A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.

undefined

Unlock this book and the full library FREE for 7 days

Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of

Start free trial

Renews at $15.99/month. Cancel anytime

Authors (3)

Mohit Sewak

Mohit is a Python programmer with a keen interest in the field of information security. He has completed his Bachelor's degree in technology in computer science from Kurukshetra University, Kurukshetra, and a Master's in engineering (2012) in computer science from Thapar University, Patiala. He is a CEH, ECSA from EC-Council USA. He has worked in IBM, Teramatrix (Startup), and Sapient. He currently doing a Ph.D. from Thapar Institute of Engineering & Technology under Dr. Maninder Singh. He has published several articles in national and international magazines. He is the author of Python Penetration Testing Essentials, Python: Penetration Testing for Developers and Learn Python in 7 days, also by Packt. For more details on the author, you can check the following user name mohitraj.cs
Read more about Mohit Sewak

Md. Rezaul Karim

Md. Rezaul Karim is a researcher, author, and data science enthusiast with a strong computer science background, coupled with 10 years of research and development experience in machine learning, deep learning, and data mining algorithms to solve emerging bioinformatics research problems by making them explainable. He is passionate about applied machine learning, knowledge graphs, and explainable artificial intelligence (XAI). Currently, he is working as a research scientist at Fraunhofer FIT, Germany. He is also a PhD candidate at RWTH Aachen University, Germany. Before joining FIT, he worked as a researcher at the Insight Centre for Data Analytics, Ireland. Previously, he worked as a lead software engineer at Samsung Electronics, Korea.
Read more about Md. Rezaul Karim

Pradeep Pujari

https://www.linkedin.com/in/ppujari/
Read more about Pradeep Pujari

Other recommended products

Related to this chapter

Predictive Analytics with TensorFlow

Predictive decisions are becoming a huge trend worldwide, catering to wide industry sectors by predicting which decisions are more likely to give maximum results. Data mining, statistics, and machine learning allow users to discover predictive intelligence by uncovering patterns and showing the relationship between structured and unstructured data. This book will help you build solutions that will make automated decisions. In the end, tune and build your own predictive analytics model with the help of TensorFlow.

BookNov 2017522 pages

R Deep Learning Cookbook

Deep Learning is the next big thing. It is a part of machine learning. Its favorable results in application with huge and complex data is remarkable. This book will help you to get through the problems that you face during the execution of different tasks and understand hacks in deep learning, neural networks, and advanced machine learning techniques

BookAug 2017288 pages

Neural Network Programming with Tensorflow

If you’re aware of the buzz surrounding the terms such as machine learning, artificial intelligence or deep learning, you might know what neural networks are. TensorFlow is a popular framework which can be used to implement efficient neural networks and deep learning models. This book will show you how to leverage the power of TensorFlow to train efficient neural networks. You will start with understanding the fundamentals and basic math for neural networks and why TensorFlow is a popular choice of tool for programming neural networks. During the course of the book, you will be working on real-world datasets to get a hands-on understanding of neural network programming. By the end of this book, you will have a fair understanding of how you can leverage the power of TensorFlow to train neural networks of varying complexities, without any hassle. While you are learning about various neural network implementations you will learn the underlying mathematics and linear algebra and how it maps to the appropriate TensorFlow constructs.

BookNov 2017274 pages

Generative Adversarial Networks Projects

In this book, we will use different complexities of datasets in order to build end-to-end projects. With every chapter, the level of complexity and operations will become advanced. It consists of 8 full-fledged projects covering approaches such as 3D-GAN, Age-cGAN, DCGAN, SRGAN, StackGAN, and CycleGAN with real-world use cases.

BookJan 2019316 pages

Deep Learning By Example

Deep Learning is a subset of Machine Learning and has gained a lot of popularity recently. This book introduces you to the fundamentals of deep learning in a hands-on manner. You will use Tensorflow to train different types of neural networks for tasks related to computer vision, language processing, and other real-world problems.

BookFeb 2018450 pages

Java Deep Learning Projects

You will build full-fledged, deep learning applications with Java and different open-source libraries. Master numerical computing, deep learning, and the latest Java programming features to carry out complex advanced tasks. This book is filled with best practices/tips after every project to help you optimize your deep learning models with ease.

BookJun 2018436 pages

Hands-On Java Deep Learning for Computer Vision

This book will take you through the process of efficiently training deep neural networks in Java for Computer Vision-related tasks. You will build real-world applications ranging from simple Java handwritten digit recognition models to real-time autonomous car driving systems and face recognition models using the popular Java-based libraries.

BookFeb 2019260 pages

Deep Learning from the Basics

Deep Learning from the Basics provides a step-by-step guide to these complex algorithms, from the ground up. From the basics of machine learning to multilayer neural networks, this book presents a thorough overview of these technologies enabling you to implement your own programs right away.

BookMar 2021316 pages5

Practical Computer Vision

Computer Vision is a broadly used term associated with acquiring, processing, and analyzing images. This book will show you how you can perform various Computer Vision techniques in the most practical way possible. Right from capturing images from various sources, you will learn how to perform image filtering/manipulation and detect features in your images. As you go through the chapters, you'll work with increasingly complex algorithms to develop complex Computer Vision applications

BookFeb 2018234 pages

Hands-On Computer Vision with TensorFlow 2

Computer vision is achieving a new frontier of capabilities in fields like health, automobile or robotics. This book explores TensorFlow 2, Google's open-source AI framework, and teaches how to leverage deep neural networks for visual tasks. It will help you acquire the insight and skills to be a part of the exciting advances in computer vision.

BookMay 2019372 pages

Hands-On Mathematics for Deep Learning

The main aim of this book is to make the advanced mathematical background accessible to someone with a programming background. This book will equip the readers with not only deep learning architectures but the mathematics behind them. With this book, you will understand the relevant mathematics that goes behind building deep learning models.

BookJun 2020364 pages

Mastering Computer Vision with TensorFlow 2.x

You will learn the principles of computer vision and deep learning, and understand various models and architectures with their pros and cons. You will learn how to use TensorFlow 2.x to build your own neural network model and apply it to various computer vision tasks such as image acquiring, processing, and analyzing.

BookMay 2020430 pages

Personalised recommendations for you

Based on your interests and search pattern

Et al.

Ever wonder why speech recognition systems don't understand the Scottish accent, or what would happen if an astronaut only ate mac 'n' cheese, or other spurious reflections you'd have at a bar? We did, then collated those deliberations into absurd research articles with fake figures and methodologies inspired by even more fictionally absurd studies.

BookAug 2023230 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages4

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages1

Generative AI with LangChain

This book is a comprehensive introduction to LLMs and LangChain, demystifying the basic mechanics of LangChain, its functionalities, and the myriad of applications it can be integrated into.

BookDec 2023360 pages5

Mastering Tableau 2023

This book is a comprehensive resource to mastering your Tableau skills and becoming a BI expert. As you progress, you will learn how to build advanced dashboards and improve your storytelling to derive key business insight, as well as make you well-versed with advanced functionalities of Tableau in the business intelligence domain.

BookAug 2023684 pages

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages5

Building AI Applications with ChatGPT APIs

This guide covers all ChatGPT API features for effortless creation of robust AI powered apps. With its help, you’ll be able to leverage ChatGPT’s cutting-edge NLP models to take your app development skills to the next level. You’ll also work on ten exciting projects that will give you the practical know-how that you can apply to your existing applications.

BookSep 2023258 pages2

Data Engineering with AWS

Embark on a journey to master data engineering pipelines on AWS! Our book offers a hands-on experience of AWS services for ingesting, transforming, and consuming data. Whether you're an absolute beginner or someone with basic data engineering experience, this guide is an indispensable resource.

BookOct 2023636 pages5

Modern Data Architecture on AWS

Every organization wants an agile, performant, and cost-effective data platform that meets all their current and future business needs. Purpose-built AWS analytics services and their features play a big part in building such a modern data platform. This book brings to you all the design and architectural patterns that’ll help you achieve this goal.

BookAug 2023420 pages5

Practical Guide to Applied Conformal Prediction in Python

Discover the power of Conformal Prediction with the "Practical Guide to Applied Conformal Prediction in Python." Master the latest techniques to quantify uncertainty in machine learning and computer vision models, and seamlessly apply them to your industry applications.

BookDec 2023240 pages

TinyML Cookbook

With over 70 project-based recipes, the TinyML Cookbook is a practical guide that will help you to get the most out of your microcontrollers. It provides a comprehensive understanding of the theoretical foundations while giving you hands-on experience training ML models for deployment on Arduino Nano 33 BLE Sense, Raspberry Pi Pico, and SparkFun RedBoard Artemis Nano microcontrollers.

BookNov 2023664 pages