Reader small image

You're reading from  Deep Learning with TensorFlow

Product typeBook
Published inApr 2017
Reading LevelIntermediate
PublisherPackt
ISBN-139781786469786
Edition1st Edition
Languages
Right arrow
Authors (3):
Giancarlo Zaccone
Giancarlo Zaccone
author image
Giancarlo Zaccone

Giancarlo Zaccone has over fifteen years' experience of managing research projects in the scientific and industrial domains. He is a software and systems engineer at the European Space Agency (ESTEC), where he mainly deals with the cybersecurity of satellite navigation systems. Giancarlo holds a master's degree in physics and an advanced master's degree in scientific computing. Giancarlo has already authored the following titles, available from Packt: Python Parallel Programming Cookbook (First Edition), Getting Started with TensorFlow, Deep Learning with TensorFlow (First Edition), and Deep Learning with TensorFlow (Second Edition).
Read more about Giancarlo Zaccone

Md. Rezaul Karim
Md. Rezaul Karim
author image
Md. Rezaul Karim

Md. Rezaul Karim is a researcher, author, and data science enthusiast with a strong computer science background, coupled with 10 years of research and development experience in machine learning, deep learning, and data mining algorithms to solve emerging bioinformatics research problems by making them explainable. He is passionate about applied machine learning, knowledge graphs, and explainable artificial intelligence (XAI). Currently, he is working as a research scientist at Fraunhofer FIT, Germany. He is also a PhD candidate at RWTH Aachen University, Germany. Before joining FIT, he worked as a researcher at the Insight Centre for Data Analytics, Ireland. Previously, he worked as a lead software engineer at Samsung Electronics, Korea.
Read more about Md. Rezaul Karim

Ahmed Menshawy
Ahmed Menshawy
author image
Ahmed Menshawy

Ahmed Menshawy is a Research Engineer at the Trinity College Dublin, Ireland. He has more than 5 years of working experience in the area of ML and NLP. He holds an MSc in Advanced Computer Science. He started his Career as a Teaching Assistant at the Department of Computer Science, Helwan University, Cairo, Egypt. He taught several advanced ML and NLP courses such as ML, Image Processing, and so on. He was involved in implementing the state-of-the-art system for Arabic Text to Speech. He was the main ML specialist at the Industrial research and development lab at IST Networks, based in Egypt.
Read more about Ahmed Menshawy

View More author details
Right arrow

Neural networks

Artificial Neural Networks (ANNs) are one of the main tools that take advantage of the concept of deep learning. They are an abstract representation of our nervous system, which contains a collection of neurons that communicate with each other through connections called axons. The first artificial neuron model was proposed in 1943 by McCulloch and Pitts in terms of a computational model of nervous activity. This model was followed by another, proposed by John von Neumann, Marvin Minsky, Frank Rosenblatt (the so-called perceptron), and many others.

The biological neuron

As you can see in the following figure, a biological neuron is composed of the following:

  • A cell body or soma
  • One or more dendrites, whose responsibility is to receive signals from other neurons
  • An axon, which in turn conveys the signals generated by the same neuron to the other connected neurons

This is what a biological neuron model looks like:

Figure 3: Biological neuron model

The neuron activity is in the alternation of sending the signal (active state) and rest/reception of signals from other neurons (inactive state).

The transition from one phase to another is caused by the external stimuli represented by signals that are picked up by the dendrites. Each signal has an excitatory or inhibitory effect, conceptually represented by a weight associated with the stimulus. The neuron in an idle state accumulates all the signals received until they have reached a certain activation threshold.

An artificial neuron

Similar to the biological one, the artificial neuron consists of the following:

  • One or more incoming connections, with the task of collecting numerical signals from other neurons; each connection is assigned a weight that will be used to consider each signal sent
  • One or more output connections that carry the signal for the other neurons
  • An activation function determines the numerical value of the output signal, on the basis of the signals received from the input connections with other neurons, and suitably collected from the weights associated with each picked-up signal and the activation threshold of the neuron itself

The following figure represents the artificial neuron:

Figure 4: Artificial neuron model

The output, that is, the signal whereby the neuron transmits its activity outside, is calculated by applying the activation function, also called the transfer function, to the weighted sum of the inputs. These functions have a dynamic between -1 and 1, or between 0 and 1.

There is a set of activation functions that differs in complexity and output:

  • Step function: This fixes the threshold value x (for example, x= 10). The function will return 0 or 1 if the mathematical sum of the inputs is at, above, or below the threshold value.
  • Linear combination: Instead of managing a threshold value, the weighted sum of the input values is subtracted from a default value; we will have a binary outcome, but it will be expressed by a positive (+b) or negative (-b) output of the subtraction.
  • Sigmoid: This produces a sigmoid curve, a curve having an S trend. Often, the sigmoid function refers to a special case of the logistic function.

From the simplest forms, used in the prototyping of the first artificial neurons, we then move on to more complex ones that allow greater characterization of the functioning of the neuron. The following are just a few:

  • Hyperbolic tangent function
  • Radial basis function
  • Conic section function
  • Softmax function

It should be recalled that the network, and then the weights in the activation functions, will then be trained. As the selection of the activation function is an important task in the implementation of the network architecture, studies indicate marginal differences in terms of output quality if the training phase is carried out properly.

Figure 5: Most used transfer functions

In the preceding figure, the functions are labeled as follows:

  • a: Step function
  • b: Linear function
  • c: Computed sigmoid function with values between 0 and 1
  • d: Sigmoid function with computed values between -1 and 1
Previous PageNext Page
You have been reading a chapter from
Deep Learning with TensorFlow
Published in: Apr 2017Publisher: PacktISBN-13: 9781786469786
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Authors (3)

author image
Giancarlo Zaccone

Giancarlo Zaccone has over fifteen years' experience of managing research projects in the scientific and industrial domains. He is a software and systems engineer at the European Space Agency (ESTEC), where he mainly deals with the cybersecurity of satellite navigation systems. Giancarlo holds a master's degree in physics and an advanced master's degree in scientific computing. Giancarlo has already authored the following titles, available from Packt: Python Parallel Programming Cookbook (First Edition), Getting Started with TensorFlow, Deep Learning with TensorFlow (First Edition), and Deep Learning with TensorFlow (Second Edition).
Read more about Giancarlo Zaccone

author image
Md. Rezaul Karim

Md. Rezaul Karim is a researcher, author, and data science enthusiast with a strong computer science background, coupled with 10 years of research and development experience in machine learning, deep learning, and data mining algorithms to solve emerging bioinformatics research problems by making them explainable. He is passionate about applied machine learning, knowledge graphs, and explainable artificial intelligence (XAI). Currently, he is working as a research scientist at Fraunhofer FIT, Germany. He is also a PhD candidate at RWTH Aachen University, Germany. Before joining FIT, he worked as a researcher at the Insight Centre for Data Analytics, Ireland. Previously, he worked as a lead software engineer at Samsung Electronics, Korea.
Read more about Md. Rezaul Karim

author image
Ahmed Menshawy

Ahmed Menshawy is a Research Engineer at the Trinity College Dublin, Ireland. He has more than 5 years of working experience in the area of ML and NLP. He holds an MSc in Advanced Computer Science. He started his Career as a Teaching Assistant at the Department of Computer Science, Helwan University, Cairo, Egypt. He taught several advanced ML and NLP courses such as ML, Image Processing, and so on. He was involved in implementing the state-of-the-art system for Arabic Text to Speech. He was the main ML specialist at the Industrial research and development lab at IST Networks, based in Egypt.
Read more about Ahmed Menshawy