Machine Learning for Developers
Rodolfo Bonnin
Packt Publishing, October 2017. ISBN-13: 9781786469878
Recent Models and Developments

In the previous chapters, we explored a large number of training mechanisms for machine learning models, starting with simple pass-through architectures, such as the well-known feedforward neural networks, and then moving on to a more complex, reality-bound mechanism that accepts an ordered sequence of inputs for training: Recurrent Neural Networks (RNNs).

Now it's time to take a look at two recent players that incorporate other aspects of the real world. In the first case, we will have not just a single network optimizing its model but a second participant as well, with the two improving each other's results. This is the case with Generative Adversarial Networks (GANs).

In the second case, we will talk about a different kind of model, one that tries to determine the optimal set of steps to maximize a reward: reinforcement learning.

GANs

GANs are a new kind of unsupervised learning model, and one of the few truly disruptive models of the last decade. They consist of two models that compete with and improve each other throughout the training iterations.

The architecture draws on ideas from supervised learning and game theory, and its main objective is to learn to generate realistic samples from an original dataset of elements of the same class.
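To make the two-player setup concrete, here is a minimal sketch of an adversarial training loop. It is written in PyTorch purely for brevity; the framework choice, the one-dimensional Gaussian "real" data, and every network size and hyperparameter are illustrative assumptions, not taken from this chapter. The generator maps random noise to samples, the discriminator estimates the probability that a sample is real, and each network's progress forces the other to improve:

    # A minimal GAN training loop (illustrative sketch, not the book's code).
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Generator: 8-dimensional noise in, one "sample" out.
    G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
    # Discriminator: one sample in, probability of being real out.
    D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCELoss()

    for step in range(2000):
        # Train the discriminator: real samples are labeled 1, generated ones 0.
        real = torch.randn(64, 1) * 1.5 + 4.0    # "real" data drawn from N(4, 1.5)
        fake = G(torch.randn(64, 8)).detach()    # block gradients to G in this phase
        loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # Train the generator: it improves when D mistakes its output for real.
        fake = G(torch.randn(64, 8))
        loss_g = bce(D(fake), torch.ones(64, 1))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # The mean of the generated samples should drift toward 4.0.
    print(G(torch.randn(1000, 8)).mean().item())

Note that neither network ever sees labels from the original dataset; the only supervision signal is the real-versus-fake game itself, which is what makes the scheme effectively unsupervised.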

It's worth noting that the amount of research on GANs is increasing at an almost exponential rate, as depicted in the following graph:

Source: The GAN Zoo (https://github.com/hindupuravinash/the-gan-zoo)

Types of GAN applications

GANs make it possible for applications to produce new samples from a previous set of samples...

Reinforcement learning

Reinforcement learning is a field that has recently resurfaced, and it has become especially popular in control, in finding solutions to games, and in situational problems, where a number of steps have to be carried out to solve a problem.

A formal definition of reinforcement learning is as follows:

"Reinforcement learning is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment.” (Kaelbling et al. 1996).

In order to have a frame of reference for the type of problem we want to solve, we will start by going back to a mathematical concept developed in the 1950s, called the Markov decision process.

Markov decision process

Before...
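As a concrete point of reference, the following minimal sketch shows one way to lay out the ingredients of an MDP (states, actions, transition probabilities, rewards, and a discount factor) in Python. The two-state machine below is a hypothetical illustration, not an example from the chapter:

    # Hypothetical two-state MDP, written as the tuple (S, A, P, R, gamma).
    states = ["idle", "running"]
    actions = ["start", "stop"]

    # P[(s, a)] -> list of (next_state, probability). The Markov property means
    # these probabilities depend only on the current state and action, not on
    # the history of earlier states.
    P = {
        ("idle", "start"):    [("running", 0.9), ("idle", 0.1)],
        ("idle", "stop"):     [("idle", 1.0)],
        ("running", "start"): [("running", 1.0)],
        ("running", "stop"):  [("idle", 1.0)],
    }

    # R[(s, a)] -> expected immediate reward for taking action a in state s.
    R = {("idle", "start"): -0.1, ("idle", "stop"): 0.0,
         ("running", "start"): 1.0, ("running", "stop"): 0.0}

    gamma = 0.9  # discount factor applied to future rewards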

Basic RL techniques: Q-learning

One of the most well-known reinforcement learning techniques, and the one we will be implementing in our example, is Q-learning.

Q-learning can be used to find an optimal action for any given state in a finite Markov decision process. It tries to maximize the value of the Q-function, Q(s, a), which represents the maximum discounted future reward when we perform action a in state s.

Once we know the Q-function, the optimal action a in state s is the one with the highest Q-value. We can then define a policy, π(s), that gives us the optimal action in any state, expressed as follows:

    π(s) = argmax_a Q(s, a)

We can define the Q-function for a transition (s_t, a_t, r_t, s_{t+1}) in terms of the Q-function at the next transition (s_{t+1}, a_{t+1}, r_{t+1}, s_{t+2}), similar to what we did with the total discounted future reward. This recurrence is known as the Bellman equation for Q-learning:

    Q(s_t, a_t) = r_t + γ · max_a Q(s_{t+1}, a)
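The following is a minimal tabular Q-learning sketch that applies this update rule. The one-dimensional corridor environment and the hyperparameter values (alpha, gamma, epsilon) are illustrative assumptions, not the book's example: the agent starts at the left end and receives a reward of 1 only upon reaching the rightmost, terminal state:

    # Tabular Q-learning on a hypothetical 6-state corridor (sketch only).
    import random

    N_STATES = 6                   # states 0..5; state 5 is terminal
    ACTIONS = [-1, +1]             # move left or move right
    alpha, gamma, epsilon = 0.1, 0.9, 0.1

    Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action_index]

    for episode in range(500):
        s = 0
        while s != N_STATES - 1:
            # Epsilon-greedy selection; ties are broken randomly so that the
            # agent still explores before any reward has been observed.
            if random.random() < epsilon or Q[s][0] == Q[s][1]:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda i: Q[s][i])
            s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            r = 1.0 if s_next == N_STATES - 1 else 0.0
            # Bellman update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next

    # The greedy policy should now choose +1 (move right) in every non-terminal state.
    print([ACTIONS[max((0, 1), key=lambda i: Q[s][i])] for s in range(N_STATES - 1)])

Because the update bootstraps from max(Q[s_next]), the reward found at the terminal state propagates backward through the table over successive episodes, discounted by gamma at each step.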

References

  • Bellman, Richard, A Markovian decision process. Journal of Mathematics and Mechanics (1957): 679-684.
  • Kaelbling, Leslie Pack, Michael L. Littman, and Andrew W. Moore, Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4 (1996): 237-285.
  • Goodfellow, Ian, et al., Generative adversarial nets. Advances in Neural Information Processing Systems, 2014.
  • Radford, Alec, Luke Metz, and Soumith Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015).
  • Isola, Phillip, et al., Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004 (2016).
  • Mao, Xudong, et al., Least squares generative adversarial networks. arXiv preprint arXiv:1611.04076 (2016).
  • Eghbal-Zadeh, Hamid, and Gerhard Widmer, Likelihood Estimation for Generative Adversarial...

Summary

In this chapter, we have reviewed two of the most important and innovative architectures to have appeared in recent years. Every day, new generative and reinforcement learning models are applied in innovative ways, whether to generate feasible new elements from a selection of previously known classes or even to beat professional players at strategy games.

In the next chapter, we will provide precise instructions so you can use and modify the code provided to better understand the different concepts you have acquired throughout the book.
