
You're reading from Java Deep Learning Essentials

Product type: Book
Published in: May 2016
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781785282195
Edition: 1st
Author (1)

Yusuke Sugomori

Yusuke Sugomori is a creative technologist with a background in information engineering. When he was a graduate school student, he cofounded Gunosy with his colleagues, which uses machine learning and web-based data mining to determine individual users' respective interests and provides an optimized selection of daily news items based on those interests. This algorithm-based app has gained a lot of attention since its release and now has more than 10 million users. The company has been listed on the Tokyo Stock Exchange since April 28, 2015. In 2013, Sugomori joined Dentsu, the largest advertising company in Japan based on nonconsolidated gross profit in 2014, where he carried out a wide variety of digital advertising, smartphone app development, and big data analysis. He was also featured as one of eight "new generation" creators by the Japanese magazine Web Designing. In April 2016, he joined a medical start-up as cofounder and CTO.

Chapter 2. Algorithms for Machine Learning – Preparing for Deep Learning

In the previous chapter, you saw how deep learning developed by looking back through the history of AI. As you will have noticed, machine learning and deep learning are inseparable; indeed, you learned that deep learning evolved out of machine learning algorithms.

In this chapter, as a pre-exercise for understanding deep learning well, you will see more details of machine learning, and in particular you will write actual code for the machine learning methods that are closely related to deep learning.

In this chapter, we will cover the following topics:

  • The core concepts of machine learning

  • An overview of popular machine learning algorithms, especially focusing on neural networks

  • Theories and implementations of machine learning algorithms related to deep learning: perceptrons, logistic regression, and multi-layer perceptrons

Getting started


From this chapter on, we will look at the source code of machine learning and deep learning in Java. The JDK version used in the code is 1.8, so Java 8 or later is required. IntelliJ IDEA 14.1 is used as the IDE. We will use external libraries from Chapter 5, Exploring Java Deep Learning Libraries – DL4J, ND4J, and More, onward, so we start with a new Maven project.

The root package name of the code used in this book is DLWJ, the initials of Deep Learning with Java, and we will add new packages or classes under DLWJ as required. Please refer to the screenshot below, which shows the screen immediately after the new project is created:

Some variable and method names in the code don't follow the Java coding standard; instead, they match the characters used in the formulas, to increase readability and make the correspondence with the math easier to follow. Please bear this in mind in advance.
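To illustrate the convention (this is a hypothetical class, not the book's actual code), fields are named after the mathematical symbols they represent rather than spelled out in camelCase:

```java
// Hypothetical sketch of the naming convention used in the book's code:
// fields mirror the symbols in the formulas (nIn, nOut, W, b)
// rather than following the usual Java naming style.
public class NamingExample {

    public int nIn;       // number of input units (n_in in the formulas)
    public int nOut;      // number of output units (n_out)
    public double[][] W;  // weight matrix W, shape nOut x nIn
    public double[] b;    // bias vector b, length nOut

    public NamingExample(int nIn, int nOut) {
        this.nIn = nIn;
        this.nOut = nOut;
        this.W = new double[nOut][nIn];
        this.b = new double[nOut];
    }
}
```

A name like `W` would be meaningless in ordinary application code, but here it lets you match each line directly against the corresponding term in the equations.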

The need for training in machine learning


You have already seen that machine learning is a method of pattern recognition. Machine learning reaches an answer by recognizing and sorting out patterns in the given learning data. It may seem easy when you just read that sentence, but in fact it takes quite a long time for machine learning to sort out unknown data, in other words, to build an appropriate model. Why is that? Is it really that difficult to just sort data out? Why do we even need a "learning" phase in between?

The answer is, of course, yes. It is extremely difficult to sort out data appropriately. The more complicated a problem becomes, the closer perfect classification comes to being impossible. This is because there are almost infinite possible categorizations when you simply say "pattern classifier." Let's look at a very simple example in the following graph:

There are two types of data, circles and triangles, and the unknown data, the square. You don't know which group...
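One candidate boundary for such a two-class problem can be written as a simple sign check against a line. The following is a toy Java sketch with made-up weights; countless other lines would separate the same training points equally well:

```java
public class LinearBoundary {

    // Classifies a 2-D point against the line w1*x + w2*y + b = 0.
    // Points on one side get +1 (say, "circle"), the other side -1 ("triangle").
    public static int classify(double w1, double w2, double b,
                               double x, double y) {
        return (w1 * x + w2 * y + b >= 0) ? 1 : -1;
    }

    public static void main(String[] args) {
        // One possible boundary: the line x + y = 1.
        System.out.println(classify(1.0, 1.0, -1.0, 2.0, 2.0));   // 1
        System.out.println(classify(1.0, 1.0, -1.0, -1.0, -1.0)); // -1
    }
}
```

Changing `w1`, `w2`, and `b` gives a different boundary that may classify the known points identically yet assign the unknown square to a different group, which is exactly why choosing among the infinitely many candidates is the hard part.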

Supervised and unsupervised learning


In the previous section, we saw that there can be millions of boundaries even for a simple classification problem, but it is difficult to say which of them is the most appropriate. This is because, even if we can properly sort out the patterns in the known data, it doesn't mean that unknown data can be classified by the same pattern. However, you can increase the percentage of correct categorizations. Each machine learning method sets a standard for building a better pattern classifier and decides the most probable boundary, the decision boundary, to increase that percentage. These standards, of course, vary greatly from method to method. In this section, we'll look at the approaches we can take.

First, machine learning can be broadly classified into supervised learning and unsupervised learning. The difference between these two categories is whether the dataset used for learning is labeled or unlabeled. With supervised learning...

Machine learning application flow


We have looked at the methods machine learning provides and how these methods recognize patterns. In this section, we'll see which flow is taken, or has to be taken, when mining data with machine learning. A decision boundary is set based on the model parameters in each machine learning method, but adjusting the model parameters is not the only thing we have to care about. There is another troublesome problem, and it is actually the weakest point of machine learning: feature engineering. Deciding which features to create from the raw data, that is, the analysis subject, is a necessary step in building an appropriate classifier. And doing this, just like adjusting the model parameters, requires a massive amount of trial and error. In some cases, feature engineering requires far more effort than deciding a parameter.
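As a small, hypothetical illustration of feature engineering (the feature choices here are made up for the example), raw records must be turned into numeric vectors before any classifier can use them; which quantities to compute is entirely the engineer's decision:

```java
public class FeatureExtraction {

    // Turns a raw record (here, a text string) into a numeric feature vector.
    // The choice of features (length, digit count, uppercase count) is
    // arbitrary; a different choice can change classifier quality far more
    // than any amount of parameter tuning.
    public static double[] extract(String raw) {
        double length = raw.length();
        double digits = raw.chars().filter(Character::isDigit).count();
        double upper  = raw.chars().filter(Character::isUpperCase).count();
        return new double[] { length, digits, upper };
    }

    public static void main(String[] args) {
        double[] f = extract("Ab3");
        System.out.println(f[0] + " " + f[1] + " " + f[2]); // 3.0 1.0 1.0
    }
}
```

The trial and error mentioned above happens at exactly this point: swapping one of these features for another and re-running the whole pipeline to see whether accuracy improves.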

Thus, when we simply say "machine learning," there are certain tasks that need to be completed in advance...

Theories and algorithms of neural networks


In the previous section, you saw the general flow of data analysis with machine learning. In this section, the theories and algorithms of neural networks, one of the methods of machine learning, are introduced as preparation for deep learning.

Although we simply say "neural networks," their history is long. The first published neural network algorithm was the perceptron, and the paper released in 1957 by Frank Rosenblatt was named The Perceptron: A Perceiving and Recognizing Automaton (Project Para). From then on, many methods were researched, developed, and released, and now neural networks are one of the building blocks of deep learning. Although we simply say "neural networks," there are various types, and we'll now look at the representative methods in order.

Perceptrons (single-layer neural networks)

The perceptron algorithm has the simplest structure of all the neural network algorithms, and it can perform linear...
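A minimal sketch of the classic perceptron learning rule is shown below. This is not the book's own implementation (the class and field names are assumptions for illustration); it uses a step activation returning +1 or -1 and updates the weights only on misclassified points:

```java
public class Perceptron {

    private final double[] w;        // weight vector, one weight per input
    private final double learningRate;

    public Perceptron(int nIn, double learningRate) {
        this.w = new double[nIn];    // weights start at zero
        this.learningRate = learningRate;
    }

    // Step activation: +1 if the weighted sum is non-negative, else -1.
    public int predict(double[] x) {
        double a = 0.0;
        for (int i = 0; i < w.length; i++) a += w[i] * x[i];
        return a >= 0 ? 1 : -1;
    }

    // One pass over the training data; weights move only when the
    // prediction disagrees with the label t[n] (the perceptron learning rule).
    public void train(double[][] X, int[] t) {
        for (int n = 0; n < X.length; n++) {
            if (predict(X[n]) != t[n]) {
                for (int i = 0; i < w.length; i++) {
                    w[i] += learningRate * t[n] * X[n][i];
                }
            }
        }
    }

    public static void main(String[] args) {
        Perceptron p = new Perceptron(2, 0.1);
        double[][] X = {{2, 1}, {3, 2}, {-2, -1}, {-3, -2}};  // toy, separable
        int[] t = {1, 1, -1, -1};
        for (int epoch = 0; epoch < 10; epoch++) p.train(X, t);
        System.out.println(p.predict(new double[]{2, 2}));   // 1
        System.out.println(p.predict(new double[]{-2, -2})); // -1
    }
}
```

For linearly separable data like the toy set above, this rule is guaranteed to converge to a separating boundary; for data that is not linearly separable, it never settles, which is exactly the limitation the rest of the chapter addresses.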

Summary


In this chapter, as preparation for deep learning, we dug into neural networks, one of the algorithm families of machine learning. You learned about three representative algorithms of single-layer neural networks: perceptrons, logistic regression, and multi-class logistic regression. We saw that single-layer neural networks can't solve nonlinear problems, but that this limitation can be overcome by multi-layer neural networks, networks with one or more hidden layers between the input layer and the output layer. An intuitive explanation of why MLPs can solve nonlinear problems is that adding layers and increasing the number of units lets the network learn more complicated logical operations, and thus express more complicated functions. The key to giving the model this ability is the backpropagation algorithm: by propagating the error of the output back through the whole network, the model is updated and adjusted to fit the training data with each iteration, and finally...
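The "more complicated logical operations by adding layers" intuition can be made concrete with the classic XOR example. XOR is not linearly separable, so no single linear unit can represent it, but composing two layers of linear units does, since XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)). A minimal sketch (weights chosen by hand, not learned):

```java
public class XorByLayers {

    // Step activation over a weighted sum: 1 if non-negative, else 0.
    public static int step(double a) { return a >= 0 ? 1 : 0; }

    // Each gate below is a single linear unit with fixed weights and bias.
    public static int and(int x1, int x2)  { return step(1.0 * x1 + 1.0 * x2 - 1.5); }
    public static int or(int x1, int x2)   { return step(1.0 * x1 + 1.0 * x2 - 0.5); }
    public static int nand(int x1, int x2) { return step(-1.0 * x1 - 1.0 * x2 + 1.5); }

    // XOR needs two layers: a hidden layer (OR, NAND) feeding an output unit (AND).
    public static int xor(int x1, int x2) {
        return and(or(x1, x2), nand(x1, x2));
    }

    public static void main(String[] args) {
        for (int x1 = 0; x1 <= 1; x1++) {
            for (int x2 = 0; x2 <= 1; x2++) {
                System.out.println(x1 + " XOR " + x2 + " = " + xor(x1, x2));
            }
        }
    }
}
```

Here the weights are fixed by hand; what backpropagation does, as described above, is discover weights like these automatically from training data.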

