Recurrent Neural Networks

After reviewing the recent developments in deep learning, we now reach the cutting edge of machine learning: adding a very special dimension to our models (time, and hence sequences of inputs) through a family of algorithms called recurrent neural networks (RNNs).

Solving problems with order — RNNs

In the previous chapters, we have examined a number of models, from simple ones to more sophisticated ones, with some common properties:

  • They accept unique, isolated inputs
  • They produce outputs of a unique, fixed size
  • Their outputs depend exclusively on the current input's characteristics, with no dependency on previous inputs

In real life, the pieces of information that the brain processes have an inherent structure and order, and the organization and sequence of the phenomena we perceive influence how we treat them. Examples of this include speech comprehension (the order of the words in a sentence), video sequences (the order of the frames in a video), and language translation. This prompted the creation of new models. The most important ones are grouped under the RNN umbrella.
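To make the idea of recurrence concrete, here is a minimal sketch of the vanilla RNN state update in NumPy. The sizes, weights, and inputs are illustrative, not taken from the book; what matters is that the new state at each time step depends on both the current input and the previous state:

import numpy as np

np.random.seed(0)
input_size, hidden_size = 3, 5

# Illustrative parameters (small random values)
W_xh = np.random.randn(hidden_size, input_size) * 0.1   # input-to-hidden weights
W_hh = np.random.randn(hidden_size, hidden_size) * 0.1  # hidden-to-hidden weights
b_h = np.zeros(hidden_size)                             # hidden bias

def rnn_step(x_t, h_prev):
    # The new state depends on the current input AND the previous
    # state; this carried-over state is what gives the model memory.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)
sequence = np.random.randn(4, input_size)  # a toy sequence of four inputs
for x_t in sequence:
    h = rnn_step(x_t, h)
print(h)  # the final state summarizes the whole sequence

Note how the same weights are reused at every step; the only thing that changes across time is the state being carried forward.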

...

LSTM

LSTMs are a fundamental milestone in the evolution of RNNs, because they introduce long-term dependencies into the cells. The unrolled cells contain two different state lines: one carries the long-term status (the cell state), and the other represents short-term memory (the hidden state).

Between steps, the long-term line forgets less important information and adds filtered information from short-term events, incorporating it into the future state.

LSTMs are really versatile in their possible applications, and, along with GRUs (which we will explain later), they are the most commonly employed recurrent models. Let's try to break down an LSTM into its components to get a better understanding of how they work.
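Before doing so, here is a conceptual sketch of a single LSTM step in NumPy. The weight layout and sizes are assumptions made for illustration, but the gating structure (forget, input, and output gates operating on the two state lines) follows the standard LSTM formulation:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # One LSTM step: c is the long-term line (cell state),
    # h is the short-term line (hidden state).
    z = W @ np.concatenate([x_t, h_prev]) + b   # one affine map, split four ways
    f, i, o, g = np.split(z, 4)
    f = sigmoid(f)           # forget gate: what to drop from long-term memory
    i = sigmoid(i)           # input gate: how much new information to admit
    o = sigmoid(o)           # output gate: how much of the cell to expose
    g = np.tanh(g)           # candidate values distilled from the current input
    c = f * c_prev + i * g   # forget, then add filtered short-term information
    h = o * np.tanh(c)       # short-term memory read out from the updated cell
    return h, c

# Hypothetical sizes, just to run the sketch
n_in, n_hid = 3, 4
np.random.seed(1)
W = np.random.randn(4 * n_hid, n_in + n_hid) * 0.1
b = np.zeros(4 * n_hid)
h = c = np.zeros(n_hid)
for x_t in np.random.randn(5, n_in):  # a toy sequence of five inputs
    h, c = lstm_step(x_t, h, c, W, b)

The multiplications by f, i, and o are the "multiplier operations" discussed next: each gate outputs values between 0 and 1 that scale how much information flows through.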

The gate and multiplier operation

LSTMs have two fundamental values: remembering...

Univariate time series prediction with energy consumption data

In this example, we will be solving a problem in the domain of regression. For this reason, we will build a multi-layer RNN with two LSTMs. The regression is of the many-to-one type: the network receives a sequence of energy consumption values and tries to output the next value based on the previous four readings.
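As a rough sketch of what such a network could look like, here is a two-layer LSTM regressor written with the Keras API; the layer sizes are arbitrary choices for illustration, not the book's actual code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

window = 4  # the network sees the previous four consumption values

model = Sequential([
    # The first LSTM returns the full sequence so the second LSTM
    # can consume it step by step (this is what stacking means).
    LSTM(32, return_sequences=True, input_shape=(window, 1)),
    # The second LSTM returns only its final state: many inputs, one output.
    LSTM(16),
    # A single linear unit predicts the next consumption value.
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.summary()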

The dataset we will be working on is a compendium of many measurements of the power consumption of one home over a period of time. As we might infer, this kind of behavior can easily follow patterns: consumption increases when the occupants use the microwave to prepare breakfast and use computers during the day, decreases a bit in the afternoon, increases again in the evening with all the lights on, and finally drops to almost zero when the occupants are asleep.
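Before training, the raw series has to be converted into supervised samples: windows of four consecutive readings paired with the reading that follows. A possible sketch of this step (the function name and the toy series are hypothetical) could look like this:

import numpy as np

def make_windows(series, window=4):
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])   # the previous four values
        y.append(series[i + window])     # the value to predict
    # Add a trailing feature axis, since the LSTM expects (samples, steps, features)
    return np.array(X)[..., np.newaxis], np.array(y)

series = np.array([0.2, 0.3, 0.8, 0.6, 0.4, 0.5, 0.9, 0.7])
X, y = make_windows(series)
print(X.shape, y.shape)  # (4, 4, 1) (4,)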

Let's start by...

Summary

In this chapter, our scope expanded even further, adding the important dimension of time to the set of elements included in our generalization. We also learned how to solve a practical problem with RNNs, based on real data.

But even if you think you have now covered all the possible options, there are many more model types to explore!

In the next chapter, we will talk about cutting-edge architectures that can be trained to produce very clever results, for example, transferring the style of famous painters to a picture, or even playing video games! Keep reading for reinforcement learning and generative adversarial networks.

References

  • Hopfield, John J., Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences 79.8 (1982): 2554-2558.
  • Bengio, Yoshua, Patrice Simard, and Paolo Frasconi, Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks 5.2 (1994): 157-166.
  • Hochreiter, Sepp, and Jürgen Schmidhuber, Long short-term memory. Neural Computation 9.8 (1997): 1735-1780.
  • Hochreiter, Sepp, Recurrent neural net learning and vanishing gradient. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 6.2 (1998): 107-116.
  • Sutskever, Ilya, Training recurrent neural networks. University of Toronto, Toronto, Ont., Canada (2013).
  • Chung, Junyoung, et al., Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014).