Chapter 8. Bayesian Neural Networks

As the name suggests, artificial neural networks are statistical models built by taking inspiration from the architecture and cognitive capabilities of biological brains. Neural network models typically have a layered architecture, with a large number of neurons in each layer and connections between neurons in different layers. The first layer is called the input layer, the last layer is called the output layer, and the layers in between are called hidden layers. Each neuron has a state that is determined by a nonlinear function of the states of all the neurons connected to it. Each connection has a weight that is determined from the training data, which contains a set of input-output pairs. This kind of layered architecture of neurons and their connections is present in the neocortex region of the human brain and is considered to be responsible for higher functions such as sensory perception and language understanding.

The first computational model...

Two-layer neural networks


Let us look at the formal definition of a two-layer neural network. We follow the notation and description used by David MacKay (references 1, 2, and 3 in the References section of this chapter). The input to the NN is given by $x = (x_1, \ldots, x_L)$. The input values are first multiplied by a set of weights to produce a weighted linear combination, which is then transformed using a nonlinear function $f^{(1)}$ to produce the values of the states of the neurons in the hidden layer:

$$h_j = f^{(1)}\left(\sum_l w_{jl}^{(1)} x_l + \theta_j^{(1)}\right)$$

A similar operation is done at the second layer to produce the final output values $y_i$:

$$y_i = f^{(2)}\left(\sum_j w_{ij}^{(2)} h_j + \theta_i^{(2)}\right)$$

The function $f$ is usually taken as either the sigmoid function $1/(1 + e^{-a})$ or $\tanh(a)$. Another common function, used for multiclass classification, is the softmax, defined as follows:

$$f(a_i) = \frac{e^{a_i}}{\sum_k e^{a_k}}$$

This is a normalized exponential function.

All of these are highly nonlinear functions, exhibiting the property that the output changes sharply over a small range of input values. This nonlinear property gives neural networks more computational flexibility than standard linear or generalized linear models...
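
To make the two-layer computation concrete, here is a minimal sketch in R of the forward pass through such a network, assuming a sigmoid activation in the hidden layer and a softmax output. The names (forward, W1, b1, W2, b2) are illustrative and not part of any package:

sigmoid <- function(a) 1 / (1 + exp(-a))
softmax <- function(a) exp(a) / sum(exp(a))

# Forward pass: input x -> hidden states h -> output y
forward <- function(x, W1, b1, W2, b2) {
  h <- sigmoid(W1 %*% x + b1)   # first-layer weighted combination + nonlinearity
  y <- softmax(W2 %*% h + b2)   # second-layer combination + softmax output
  as.vector(y)
}

# Toy example: 3 inputs, 4 hidden neurons, 2 output classes
set.seed(1)
W1 <- matrix(rnorm(12), 4, 3); b1 <- rnorm(4)
W2 <- matrix(rnorm(8), 2, 4);  b2 <- rnorm(2)
forward(c(0.5, -1.0, 2.0), W1, b1, W2, b2)   # returns 2 class probabilities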

Bayesian treatment of neural networks


To set neural network learning in a Bayesian context, consider the error function for the regression case, $E_D(w) = \frac{1}{2}\sum_n \left(y_n - f(x_n; w)\right)^2$. It can be treated as arising from a Gaussian noise term for observing the given dataset conditioned on the weights $w$. This is precisely the likelihood function, which can be written as follows:

$$P(D \mid w, \beta) = \frac{1}{Z_D(\beta)} \exp\left(-\beta E_D(w)\right)$$

Here, $\sigma^2$ is the variance of the noise term, given by $\sigma^2 = 1/\beta$, and $P(D \mid w, \beta)$ represents a probabilistic model of the observed data. The regularization term $E_W(w) = \frac{1}{2}\sum_i w_i^2$ can be considered as the (negative) log of the prior probability distribution over the parameters:

$$P(w \mid \alpha) = \frac{1}{Z_W(\alpha)} \exp\left(-\alpha E_W(w)\right)$$

Here, $\sigma_w^2 = 1/\alpha$ is the variance of the prior distribution of weights. It can be easily shown using Bayes' theorem that the objective function $M(w) = \beta E_D(w) + \alpha E_W(w)$ then corresponds to the posterior distribution of the parameters $w$:

$$P(w \mid D, \alpha, \beta) = \frac{1}{Z_M(\alpha, \beta)} \exp\left(-M(w)\right)$$
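
Explicitly, multiplying the likelihood by the prior and absorbing all normalization constants into $Z_M(\alpha, \beta)$:

$$P(w \mid D, \alpha, \beta) \propto P(D \mid w, \beta)\, P(w \mid \alpha) \propto \exp\left(-\beta E_D(w) - \alpha E_W(w)\right) = \exp\left(-M(w)\right)$$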

In the neural network case, we are interested in the local maxima of $P(w \mid D, \alpha, \beta)$, or equivalently the local minima of $M(w)$. The posterior is then approximated as a Gaussian around each such maximum $w_{MP}$, as follows:

$$P(w \mid D, \alpha, \beta) \simeq \frac{1}{Z'_M} \exp\left(-M(w_{MP}) - \frac{1}{2}\,(w - w_{MP})^{T} A\, (w - w_{MP})\right)$$

Here, $A = \nabla \nabla M(w)$ is the matrix of second derivatives (the Hessian) of $M(w)$ with respect to $w$, evaluated at $w_{MP}$, and it represents the inverse of the covariance matrix...
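
As an illustration of this Gaussian (Laplace) approximation, the following R sketch fits a toy one-hidden-neuron network by minimizing M(w) numerically and inverts the Hessian at the mode to obtain the posterior covariance. The objective M, the precisions alpha and beta, and the tiny network are illustrative assumptions, not the brnn implementation:

# Toy data from a smooth nonlinear function plus Gaussian noise
set.seed(2)
x <- runif(50, -2, 2)
y <- tanh(1.5 * x) + rnorm(50, sd = 0.1)

alpha <- 0.01   # prior precision (1 / prior variance of weights)
beta  <- 100    # noise precision (1 / noise variance)

# M(w) = beta * E_D(w) + alpha * E_W(w), for f(x; w) = w3 * tanh(w1 * x + w2)
M <- function(w) {
  pred <- w[3] * tanh(w[1] * x + w[2])
  beta * 0.5 * sum((y - pred)^2) + alpha * 0.5 * sum(w^2)
}

# Find a mode w_MP and the Hessian A at that mode
fit   <- optim(c(1, 0, 1), M, method = "BFGS", hessian = TRUE)
w_mp  <- fit$par              # posterior mode w_MP
Sigma <- solve(fit$hessian)   # posterior covariance, approximately A^{-1}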

The brnn R package


The brnn package was developed by Paulino Perez Rodriguez and Daniel Gianola, and it implements the two-layer Bayesian regularized neural network described in the previous section. The main function in the package is brnn(), which can be called using the following command:

> brnn(x, y, neurons, normalize, epochs, …, Monte_Carlo, …)

Here, x is an n x p matrix, where n is the number of data points and p is the number of variables, and y is an n-dimensional vector containing the target values. The number of neurons in the hidden layer of the network is specified by the argument neurons. If the logical flag normalize is TRUE (the default), the inputs and outputs are normalized. The maximum number of iterations during model training is specified using epochs. If the logical flag Monte_Carlo is TRUE, then an MCMC method is used to estimate the trace of the inverse of the Hessian matrix A.
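
As a minimal sketch of how these arguments fit together (on simulated data rather than the book's dataset; the exact defaults and the predict() call should be checked against the package documentation):

library(brnn)

# Simulated nonlinear regression problem: n = 100 points, p = 2 variables
set.seed(3)
x <- matrix(runif(200, -3, 3), ncol = 2)
y <- sin(x[, 1]) + 0.5 * x[, 2]^2 + rnorm(100, sd = 0.1)

model <- brnn(x, y, neurons = 3, normalize = TRUE, epochs = 1000)
y_hat <- predict(model)          # fitted values on the training data
sqrt(mean((y - y_hat)^2))        # training RMSE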

Let us try an example with the Auto MPG dataset that we used in Chapter...

Deep belief networks and deep learning


Some of the pioneering advances in neural networks research in the last decade have opened up a new frontier in machine learning generally known as deep learning (references 6 and 7 in the References section of this chapter). Deep learning can be broadly defined as a class of machine learning techniques in which many layers of information processing stages, arranged in hierarchical architectures, are exploited for unsupervised feature learning and for pattern analysis/classification. The essence of deep learning is to compute hierarchical features or representations of the observational data, where the higher-level features or factors are defined from lower-level ones (reference 8 in the References section of this chapter). Although there are many similar definitions and architectures for deep learning, two elements common to all of them are multiple layers of nonlinear information processing and supervised or unsupervised learning...

Exercises


  1. For the Auto MPG dataset, compare the performance of predictive models using ordinary regression, Bayesian GLM, and Bayesian neural networks; a starting sketch is given below.
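
A possible starting point for this exercise, sketched here with R's built-in mtcars data standing in for the Auto MPG dataset (lm for ordinary regression, bayesglm from the arm package for the Bayesian GLM, and brnn for the Bayesian neural network; the chosen predictors are illustrative):

library(arm)    # provides bayesglm()
library(brnn)

data(mtcars)    # stand-in for the Auto MPG data
x <- as.matrix(mtcars[, c("hp", "wt", "disp")])
y <- mtcars$mpg

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

fit_lm   <- lm(mpg ~ hp + wt + disp, data = mtcars)        # ordinary regression
fit_bglm <- bayesglm(mpg ~ hp + wt + disp, data = mtcars)  # Bayesian GLM
fit_brnn <- brnn(x, y, neurons = 2)                        # Bayesian neural network

c(lm   = rmse(y, predict(fit_lm)),
  bglm = rmse(y, predict(fit_bglm)),
  brnn = rmse(y, predict(fit_brnn)))

For a fair comparison, the errors should of course be computed on held-out data (for example, via cross-validation) rather than on the training set.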

References


  1. MacKay D. J. C. Information Theory, Inference and Learning Algorithms. Cambridge University Press. 2003. ISBN-10: 0521642981

  2. MacKay D. J. C. "The Evidence Framework Applied to Classification Networks". Neural Computation. Volume 4(3), 698-714. 1992

  3. MacKay D. J. C. "Probable Networks and Plausible Predictions – a Review of Practical Bayesian Methods for Supervised Neural Networks". Network: Computation in Neural Systems. Volume 6(3), 469-505. 1995

  4. Rumelhart D. E., Hinton G. E., and Williams R. J. "Learning Representations by Back-propagating Errors". Nature. Volume 323, 533-536. 1986

  5. MacKay D. J. C. "Bayesian Interpolation". Neural Computation. Volume 4(3), 415-447. 1992

  6. Krizhevsky A., Sutskever I., and Hinton G. E. "ImageNet Classification with Deep Convolutional Neural Networks". Advances in Neural Information Processing Systems (NIPS). 2012

  7. Hinton G., Osindero S., and Teh Y. "A Fast Learning Algorithm for Deep Belief Nets". Neural Computation. Volume 18, 1527-1554. 2006

  8. Hinton G. and Salakhutdinov R. "Reducing the Dimensionality of Data with Neural Networks". Science. Volume 313, 504-507. 2006

Summary


In this chapter, we learned about an important class of machine learning models, namely neural networks, and their Bayesian implementation. These models are inspired by the architecture of the human brain and continue to be an area of active research and development. We also learned about deep learning, one of the latest advances in neural networks. It can be used to solve many problems, such as those in computer vision and natural language processing, that involve highly cognitive elements. Artificial intelligence systems using deep learning have achieved accuracies comparable to human performance in tasks such as speech recognition and image classification. With this chapter, we have covered the important classes of Bayesian machine learning models. In the next chapter, we will look at a different aspect: large-scale machine learning and some of its applications in Bayesian models.
