You're reading from R Machine Learning Projects

Product type: Book
Published in: Jan 2019
Reading level: Expert
Publisher: Packt
ISBN-13: 9781789807943
Edition: 1st Edition
Author: Dr. Sunil Kumar Chinnamgari

Dr. Sunil Kumar Chinnamgari has a Ph.D. in computer science and specializes in machine learning and natural language processing. He is an AI researcher with more than 14 years of industry experience. Currently, he works in the capacity of lead data scientist with a US financial giant. He has published several research papers in Scopus and IEEE journals and is a frequent speaker at various meetups. He is an avid coder and has won multiple hackathons. In his spare time, Sunil likes to teach, travel, and spend time with family.

Winning the Casino Slot Machines with Reinforcement Learning

If you follow machine learning (ML) news, you have probably seen headlines about computers outperforming world champions in various games. If you haven't, a quick Google search turns up plenty of such news snippets, and they are worth reading to understand the situation.

Reinforcement learning (RL) is a subarea of artificial intelligence (AI) that powers computer systems able to outperform human players in games such as Atari Breakout and Go.

In this chapter, we will look at the following topics:

  • The concept of RL
  • The multi-arm bandit...

Understanding RL

RL is a very important area, but practitioners sometimes overlook it when solving complex, real-world problems. Unfortunately, even most ML textbooks focus only on supervised and unsupervised learning while ignoring RL entirely.

RL as an area has picked up momentum in recent years; however, its modern form dates back to the early 1980s, when it was pioneered by Rich Sutton and Andrew Barto, Rich's PhD thesis advisor. It was thought of as archaic, even back in the 1980s. Rich, however, believed in RL and its promise, maintaining that it would eventually be recognized.

A quick Google search with the term RL shows that RL methods are often used in games, such as checkers and chess. Gaming problems are problems that require taking actions over time to find a long-term optimal solution to a dynamic problem. They are dynamic in the sense that the conditions are constantly...

Multi-arm bandit – real-world use cases

We encounter many real-world situations that resemble the multi-arm bandit problem (MABP) reviewed in this chapter, and RL strategies can be applied to all of them. The following are some real-world use cases similar to the MABP:

  • Finding the best medicine(s) among many alternatives
  • Identifying the best product to launch among possible products
  • Deciding the amount of traffic (users) that we need to allocate to each website
  • Identifying the best marketing strategy for launching a product
  • Identifying the best stock portfolio to maximize profit
  • Finding out the best stock to invest in
  • Figuring out the shortest path in a given map
  • Click-through rate prediction for ads and articles
  • Predicting the most beneficial content to be cached at a router based upon the content of articles
  • Allocation of funding for different departments...
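To make one of these use cases concrete, the traffic-allocation problem can be framed as a bandit in which each website variant is an arm and a visitor's conversion (1 or 0) is the reward. The following is a minimal sketch only, assuming Bernoulli rewards and a Beta-Bernoulli form of Thompson sampling; the site names, conversion rates, and visitor count are hypothetical values chosen for illustration and do not come from the chapter.

```r
# Hypothetical example: allocating visitors among three website variants.
# All rates below are made up for illustration.
set.seed(123)
true_rate <- c(A = 0.05, B = 0.10, C = 0.30)  # unknown to the algorithm
n_sites <- length(true_rate)
n_visitors <- 3000

successes <- numeric(n_sites)  # conversions observed per site
failures <- numeric(n_sites)   # non-conversions observed per site

for (v in seq_len(n_visitors)) {
  # Draw one plausible conversion rate per site from its Beta posterior
  # (uniform Beta(1, 1) prior), then send the visitor to the best draw.
  sampled <- rbeta(n_sites, shape1 = successes + 1, shape2 = failures + 1)
  site <- which.max(sampled)

  # Simulate whether this visitor converts on the chosen site.
  converted <- rbinom(1, size = 1, prob = true_rate[site])
  successes[site] <- successes[site] + converted
  failures[site] <- failures[site] + (1 - converted)
}

traffic <- successes + failures
names(traffic) <- names(true_rate)
print(traffic)  # number of visitors sent to each site
```

As evidence accumulates, most visitors should be routed to the variant with the highest conversion rate, while the weaker variants still receive a small share of exploratory traffic.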

Solving the MABP with UCB and Thompson sampling algorithms

In this project, we will use the upper confidence bound (UCB) and Thompson sampling algorithms to solve the MABP. We will compare their performance and strategy in three different situations: standard rewards, standard but more volatile rewards, and somewhat chaotic rewards. Let's prepare the simulation data and then view it using the following code:

# loading the required packages
library(ggplot2)
library(reshape2)
# distribution of arms or actions having normally distributed
# rewards with small variance
# The data represents a standard, ideal situation i.e.
# normally distributed rewards, well separated from each other.
mean_reward = c(5, 7.5, 10, 12.5, 15, 17.5, 20, 22.5, 25, 26)
reward_dist = c(function(n) rnorm(n = n, mean = mean_reward[1], sd = 2.5),
function...
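
Since the listing above is cut off, the following self-contained sketch illustrates how the two strategies could be run against the first scenario (normally distributed rewards with sd = 2.5, using the same mean_reward values). It is not the chapter's implementation. The helper names (pull_arm, run_bandit, choose_ucb, choose_thompson), the exploration constant, and the number of rounds are all assumptions made for illustration. The UCB rule is a UCB1-style index with the bonus scaled by the reward standard deviation, and the Thompson rule assumes a known reward variance with a flat prior on each arm's mean.

```r
set.seed(42)
mean_reward <- c(5, 7.5, 10, 12.5, 15, 17.5, 20, 22.5, 25, 26)
sd_reward <- 2.5                      # assumed known for this sketch
n_arms <- length(mean_reward)

# Draw one reward from the chosen arm (hypothetical helper).
pull_arm <- function(arm) rnorm(1, mean = mean_reward[arm], sd = sd_reward)

# Generic loop: play every arm once, then let choose_arm pick each round.
run_bandit <- function(n_rounds, choose_arm) {
  counts <- numeric(n_arms)           # pulls per arm
  sums <- numeric(n_arms)             # total reward per arm
  total <- 0
  for (t in seq_len(n_rounds)) {
    arm <- if (t <= n_arms) t else choose_arm(t, counts, sums)
    reward <- pull_arm(arm)
    counts[arm] <- counts[arm] + 1
    sums[arm] <- sums[arm] + reward
    total <- total + reward
  }
  list(counts = counts, mean_reward_per_round = total / n_rounds)
}

# UCB1-style rule: sample mean plus an exploration bonus that shrinks
# as an arm is pulled more often. The constant is an assumption.
choose_ucb <- function(t, counts, sums, c_explore = 2 * sd_reward) {
  which.max(sums / counts + c_explore * sqrt(log(t) / counts))
}

# Thompson sampling: draw each arm's mean from its posterior
# (normal, centered on the sample mean) and play the largest draw.
choose_thompson <- function(t, counts, sums) {
  which.max(rnorm(n_arms, mean = sums / counts, sd = sd_reward / sqrt(counts)))
}

ucb_result <- run_bandit(2000, choose_ucb)
ts_result <- run_bandit(2000, choose_thompson)

print(rbind(ucb = ucb_result$counts, thompson = ts_result$counts))
print(c(ucb = ucb_result$mean_reward_per_round,
        thompson = ts_result$mean_reward_per_round))
```

Passing the selection rule in as a function keeps the simulation loop identical for both algorithms, so any difference in pull counts or average reward comes from the strategy alone.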

Summary

In this chapter, we learned about RL. We started the chapter by defining RL and how it differs from other ML techniques. We then reviewed the details of the MABP and looked at the various strategies that can be used to solve this problem. Use cases that are similar to the MABP were discussed. Finally, a project was implemented with UCB and Thompson sampling algorithms to solve the MABP using three different simulated datasets.

We have almost reached the end of this book. The appendix of this book, The Road Ahead, is, as the name suggests, a guidance chapter on what to do next to become a better R data scientist. I am super excited to be on the last leg of this R projects journey. Are you with me on this as well?
