In this chapter, we'll be talking about what machine learning is, why we do machine learning, what supervised learning is, and what unsupervised learning is. We will also understand the difference between classification and regression. Following this, we will start with the installation of JDK and JRE, and will also set up NetBeans on our system. Toward the end of the chapter, we will download and use a JAR file for our project.
We will cover the following topics in this chapter:
- What is machine learning?
- Difference between classification and regression
- Installing JDK and JRE
- Setting up the NetBeans IDE
- Importing Java libraries and exporting code in projects as a JAR file
Let's get started by looking at the AI problems that are related to supervised and unsupervised learning.
Machine learning is the capability of acquiring new knowledge, or refining existing knowledge, in order to make the best possible decisions. As the economist and political scientist Herbert Simon put it:
"Learning is any process by which a system improves performance from experience."
A standard definition has been given by Tom Mitchell, a computer scientist and the E. Fredkin University Professor at Carnegie Mellon University (CMU):
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
What this means is that, given some data and the help of a human expert, we can classify that data. For example, let's say we have some emails. With the help of a human, we can label the emails as spam, business, marketing, and so on. Here, the task T is assigning each email to one of these classes, and the emails that have already been labeled are the experience E.
If we train a model on this labeled data, it will learn to classify emails according to our preferences. This is machine learning. We can always check how well the system has learned, for example by counting how many emails it classifies correctly; this is the performance measure P.
As we receive more emails and classify them, the system gains experience from the new data, and its performance should improve.
This is the basic idea of machine learning.
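To make E, T, and P concrete, here is a minimal sketch in Java. It is not a real spam filter: the class name WordVoteLearner, the word-counting scheme, and the sample emails are all hypothetical and chosen only for illustration. Each labeled email is a piece of experience E, deciding whether an email is spam is the task T, and the fraction of correct decisions is the performance measure P.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy learner (hypothetical): each word votes for or against "spam". */
class WordVoteLearner {
    // word -> (times seen in spam) minus (times seen in non-spam)
    private final Map<String, Integer> score = new HashMap<>();

    /** Experience E: learn from one labeled email. */
    void learn(String email, boolean isSpam) {
        for (String word : email.toLowerCase().split("\\s+")) {
            score.merge(word, isSpam ? 1 : -1, Integer::sum);
        }
    }

    /** Task T: decide whether an email is spam by summing the word votes. */
    boolean isSpam(String email) {
        int total = 0;
        for (String word : email.toLowerCase().split("\\s+")) {
            total += score.getOrDefault(word, 0);
        }
        return total > 0;
    }

    /** Performance measure P: fraction of emails classified correctly. */
    double accuracy(List<String> emails, List<Boolean> labels) {
        int correct = 0;
        for (int i = 0; i < emails.size(); i++) {
            if (isSpam(emails.get(i)) == labels.get(i)) {
                correct++;
            }
        }
        return (double) correct / emails.size();
    }

    public static void main(String[] args) {
        // Made-up evaluation emails and their true labels (true = spam)
        List<String> testEmails = Arrays.asList("win a free prize", "meeting agenda attached");
        List<Boolean> testLabels = Arrays.asList(true, false);

        WordVoteLearner learner = new WordVoteLearner();
        System.out.println("Accuracy before training: " + learner.accuracy(testEmails, testLabels));

        // Made-up training emails: the experience E
        learner.learn("free prize inside", true);
        learner.learn("claim your free money", true);
        learner.learn("agenda for the meeting", false);
        System.out.println("Accuracy after training: " + learner.accuracy(testEmails, testLabels));
    }
}
```

With this sample data, the accuracy rises from 0.5 before training to 1.0 afterward, which is the sense in which performance at T, as measured by P, improves with E.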
The question is, why are we actually doing this?
We do this because we want to develop systems that are too difficult or expensive to construct manually, because they require specific, detailed skills or knowledge tuned to a specific task. This is known as the knowledge engineering bottleneck. As humans, we don't have enough time to write rules for each and every case, so we let our systems learn from data and make predictions based on what they have learned.
The following diagram illustrates the basic architecture of a learning system:
In the preceding diagram, we have Data and a Teacher who has assigned Labels to it. We give the labeled data to a Learner Component, which keeps it in a Knowledge Base; from there, it is passed to a Performance Component, where its performance is evaluated. Here, we can use different evaluation measures, which we'll look at in future chapters, to send Feedback to the Learner Component. This process can be improved and built upon over time.
The following diagram illustrates a basic architecture of how our supervised learning system looks:
Suppose we have some Training Data. Based on that, we can do some Preprocessing and extract the features that are important. These Features will be given to a Learning Algorithm with some Labels attached that have been assigned by a human expert. This algorithm will then learn and create a Model. Once the Model has been created, we can take new data, preprocess it, and extract features from it; these Features are then sent to the Model, which performs some kind of Classification before providing a Decision. Because a human provides the Labels used in this process, this kind of learning is known as supervised learning.
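The stages described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the two features (number of exclamation marks and number of uppercase letters), the nearest-centroid learning algorithm, the class name, and the sample emails are all hypothetical choices, not something prescribed by the diagram.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a supervised pipeline: features -> learning algorithm -> model -> decision. */
class NearestCentroidSketch {
    // The "model": one average feature vector (centroid) per label
    private final Map<String, double[]> centroids = new HashMap<>();

    /** Preprocessing and feature extraction: two hand-picked, hypothetical features. */
    static double[] extractFeatures(String email) {
        double exclamations = 0;
        double upperCase = 0;
        for (char c : email.toCharArray()) {
            if (c == '!') exclamations++;
            if (Character.isUpperCase(c)) upperCase++;
        }
        return new double[] {exclamations, upperCase};
    }

    /** Learning algorithm: average the feature vectors of the emails under each label. */
    void train(String[] emails, String[] labels) {
        Map<String, double[]> sums = new HashMap<>();
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < emails.length; i++) {
            double[] f = extractFeatures(emails[i]);
            double[] sum = sums.computeIfAbsent(labels[i], k -> new double[f.length]);
            for (int j = 0; j < f.length; j++) sum[j] += f[j];
            counts.merge(labels[i], 1, Integer::sum);
        }
        for (Map.Entry<String, double[]> e : sums.entrySet()) {
            double[] centroid = e.getValue().clone();
            for (int j = 0; j < centroid.length; j++) centroid[j] /= counts.get(e.getKey());
            centroids.put(e.getKey(), centroid);
        }
    }

    /** Decision: the label whose centroid is closest (squared Euclidean distance). */
    String classify(String email) {
        double[] f = extractFeatures(email);
        String best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double distance = 0;
            for (int j = 0; j < f.length; j++) {
                double diff = f[j] - e.getValue()[j];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up training data; the labels play the role of the human expert
        String[] emails = {
            "WIN NOW!!! FREE PRIZE!!!", "CLICK HERE!!",
            "Please find the report attached.", "Lunch at noon tomorrow?"
        };
        String[] labels = {"spam", "spam", "business", "business"};

        NearestCentroidSketch model = new NearestCentroidSketch();
        model.train(emails, labels);
        System.out.println(model.classify("FREE MONEY!!!"));            // prints "spam"
        System.out.println(model.classify("See you at the meeting."));  // prints "business"
    }
}
```

Note that new emails go through the same extractFeatures step as the training emails before they reach the model, mirroring the two paths in the diagram.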
On the other hand, there is unsupervised learning, which is illustrated in the following diagram:
In unsupervised learning, we take the data and extract Features before giving them to a Learning Algorithm, but there is no human intervention that provides a classification. In this case, the machine groups the data into smaller clusters, which is how the Model learns. The next time features are extracted and given to the Model, the Model might tell us, for example, that four emails belong to cluster 1, five emails belong to cluster 3, and so on. This is known as unsupervised learning, and the algorithms that we use are known as clustering algorithms.
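As a rough illustration of clustering, the following sketch groups values without any labels by using k-means on a single numeric feature. The choice of k-means, the feature (a hypothetical word count per email), and the starting centers (the first k values) are assumptions made for this example only.

```java
import java.util.Arrays;

/** Minimal one-dimensional k-means sketch; assumes values.length >= k. */
class KMeans1D {
    /** Returns the cluster index (0 to k-1) assigned to each value. */
    static int[] cluster(double[] values, int k, int iterations) {
        // Simplistic initialization: use the first k values as the starting centers
        double[] centers = Arrays.copyOf(values, k);
        int[] assignment = new int[values.length];

        for (int it = 0; it < iterations; it++) {
            // Assignment step: attach each value to its nearest center
            for (int i = 0; i < values.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (Math.abs(values[i] - centers[c]) < Math.abs(values[i] - centers[best])) {
                        best = c;
                    }
                }
                assignment[i] = best;
            }
            // Update step: move each center to the mean of its assigned values
            for (int c = 0; c < k; c++) {
                double sum = 0;
                int count = 0;
                for (int i = 0; i < values.length; i++) {
                    if (assignment[i] == c) {
                        sum += values[i];
                        count++;
                    }
                }
                if (count > 0) {
                    centers[c] = sum / count;
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical feature: number of words in each of six emails
        double[] wordCounts = {12, 250, 15, 300, 10, 280};
        int[] clusters = cluster(wordCounts, 2, 10);
        System.out.println(Arrays.toString(clusters)); // prints [0, 1, 0, 1, 0, 1]
    }
}
```

No human told the program which emails are "short" or "long"; the two groups emerge from the data alone, and the cluster numbers themselves carry no meaning.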
In our supervised email example, the data used to train the model comes with discrete values (labels such as spam or business). Predicting a discrete value like this is known as classification.
There is another kind of supervised learning where, instead of a discrete value, the model is trained on and predicts a continuous value. This is known as regression. Regression is also considered supervised learning. The difference between classification and regression is that the former deals with discrete values and the latter with continuous, numeric values. The following diagram illustrates the three types of learning that we can use:
As you can see in the preceding diagram, we have Supervised Learning, Unsupervised Learning, and Reinforcement Learning. Supervised Learning is divided into Classification and Regression. Within Classification, we perform tasks such as Identity Fraud Detection, Image Classification, Customer Retention, and Diagnostics. In Regression, we perform activities such as Advertising Popularity Prediction, Weather Forecasting, and so on. In Reinforcement Learning, we have Game AI, Skill Acquisition, and so on. Finally, in Unsupervised Learning, we have Recommender Systems and other sub-fields of machine learning, as illustrated.
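To contrast regression with the classification examples, the sketch below fits a straight line to numeric data using ordinary least squares and then predicts a continuous value. The class name and the data points are hypothetical; real regression problems such as weather forecasting involve many more features.

```java
/** Minimal simple linear regression sketch (ordinary least squares, one input variable). */
class SimpleLinearRegression {
    /** Fits y = slope * x + intercept and returns {slope, intercept}. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0; // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {slope, intercept};
    }

    /** Predicts a continuous value for a new input. */
    static double predict(double[] model, double x) {
        return model[0] * x + model[1];
    }

    public static void main(String[] args) {
        // Made-up data that lies exactly on the line y = 2x + 1
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};
        double[] model = fit(x, y);
        System.out.println("slope = " + model[0] + ", intercept = " + model[1]);
        System.out.println("prediction for x = 5: " + predict(model, 5)); // 11.0
    }
}
```

The output here is a number on a continuous scale rather than one of a fixed set of labels, which is exactly what separates regression from classification.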
Since we will be coding in Java, we will need the Java Development Kit (JDK). The JDK is an environment that comprises a compiler together with everything needed to run Java programs. The compiler converts source code that is written in a high-level language into an intermediate form, known as byte code. To run byte code, you need a Java Runtime Environment (JRE), which provides the Java Virtual Machine (JVM) and the standard class libraries, but not the compiler. If you have a JRE and byte code, you can run it on your system, as shown in the following diagram:
We will now download JDK onto our system.
Open your browser and go to https://www.oracle.com/technetwork/java/javase/downloads/index.html. Here, you will get an option to download Java. At the time of writing, NetBeans supports JDK 8. JDK 10 is also available, but NetBeans does not support it. If you want to download NetBeans together with the JDK, go to http://www.oracle.com/technetwork/java/javase/downloads/jdk-netbeans-jsp-142931.html. You have to accept the agreement, and based on your system, you can then download NetBeans and JDK, as shown in the following screenshot:
If you only want to install the JDK, go to the JDK 8 download page at http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html. There, you will also find more information about JDK 8, as follows:
Now, you have to accept the agreement again and download JDK according to your system requirements.
Once you have downloaded the JDK, it is easy to install. On Windows and macOS, you just have to double-click the installer and follow the prompts. On Ubuntu Linux, you can install it from the command line by using sudo with the apt-get command.
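Once the installation has finished, a small program can confirm which Java runtime is in use and whether a compiler is available to it. The following is a sketch rather than an official verification procedure, and the class name RuntimeCheck is hypothetical. It relies on the standard javax.tools.ToolProvider API, which returns null when the running platform provides no compiler, as is typically the case on a JRE-only installation.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

/** Prints basic facts about the Java installation that runs this program. */
class RuntimeCheck {
    /** Version string of the running Java platform, for example a value starting with 1.8. */
    static String javaVersion() {
        return System.getProperty("java.version");
    }

    /** True when the platform provides a Java compiler (typical for a JDK, not for a bare JRE). */
    static boolean compilerAvailable() {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return compiler != null;
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + javaVersion());
        System.out.println("Java home: " + System.getProperty("java.home"));
        System.out.println("Compiler available: " + compilerAvailable());
    }
}
```

If the last line reports false, the program is most likely running on a JRE without the JDK's compiler, so it is worth checking which installation your system or IDE is pointing to.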
We will now download NetBeans onto our system. Visit the link at https://netbeans.org/downloads/. You should see something like the following screenshot:
Here, you will find information about the current NetBeans version, which is NetBeans 8.2. You can download either Java SE, Java EE, or any other NetBeans IDE Download Bundle. It is advisable that you download the All bundle because it supports all of the technologies, as seen in the preceding screenshot. You never know when you might need them!
As shown in the top-right corner, 8.2 is the current version that you will be downloading. If you don't want to download this version, you can download its immediate predecessor, which is 8.1. If you want to download the experimental versions, which are the alpha or beta versions, click on Development. If you want to download versions that are earlier than 8.1, you can go to Archive, and this will help you in downloading the required version, as seen in the following screenshot:
As shown in the preceding screenshot, 8.2 is the latest version of NetBeans. There have been changes in subsequent versions of NetBeans, but we will be using 8.2. You can download older versions if you want. Versions such as 7.1 and 7.0.1, for example, work in a different way but can be used with older Java code.
Once you have downloaded NetBeans, you will get an .exe file on Windows. You just have to double-click on it and follow the instructions to install it. On a Mac, it will appear as a .dmg file; just click on it to install it. The installation process is simple, as you simply have to follow the prompts. On Linux, you will get a .sh file. Here, simply run the shell script and click on Next to proceed. NetBeans should now be installed on your machine!
We will now download a JAR file from the internet, use it in our project, and then create a JAR file for our own project.
Open a web browser and search for download a junit.jar. This will take you to a link where you can download a JAR file. There are online repositories available where JAR files are hosted. One such repository can be found at http://www.java2s.com/Code/Jar/j/Downloadjunitjar.htm, where you can download any available JAR file. If you click on it, it should take you to the following page:
As seen in the preceding screenshot, you will find the junit.jar file, along with a list of the different classes that are available in the JAR file. You can right-click on the save (floppy disc) symbol to save the file on your system.
Once the file is downloaded, extract it to obtain the junit.jar file. You can then add it to your project with the following steps:
- Create a new project in NetBeans, for example, HelloWorld.
- Since the new project will not have the junit.jar file, go to Properties by right-clicking on the project, as shown in the following screenshot:
- Go to the Libraries | Add JAR/Folder option and provide the location of the junit.jar file, as follows:
- Now that the JAR file has been added to the project, we can use the junit.jar file in an import statement. We can also import individual packages, as shown in the following screenshot:
- If you want to use all of the classes in framework, you just have to write the following code:
import junit.framework.*;
- Now, let's use the following code to print the output Hello World:
    package helloworld;

    import junit.framework.*;

    /**
     *
     * @author admin
     */
    public class HelloWorld {

        /**
         * @param args the command line arguments
         */
        public static void main(String[] args) {
            // TODO code application logic here
            System.out.println("Hello World");
        }
    }
- After running the preceding code, you should get an output similar to the following:
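To confirm that a JAR added through the steps above is actually visible to the project, you can ask the class loader for one of its classes. The following is a minimal sketch; the class name ClasspathCheck is hypothetical, and junit.framework.TestCase is used on the assumption that the downloaded junit.jar contains the junit.framework package imported earlier.

```java
/** Checks whether a class can be found on the classpath of the running program. */
class ClasspathCheck {
    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed to be part of junit.jar; adjust the name for other libraries
        System.out.println("JUnit visible: " + isOnClasspath("junit.framework.TestCase"));
    }
}
```

If this prints false inside the project, the JAR was probably not added under Libraries, or a different JAR was selected.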
If you want to create a JAR file for this project, please perform the following steps:
- Go to Run and select Clean and Build Project (HelloWorld) to build your project:
- Once building the HelloWorld project is complete, the Output window will say BUILD SUCCESSFUL, as shown in the following screenshot:
- Open the project folder, in our case HelloWorld, and go into the dist folder, as follows:
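In the dist folder, you will find the JAR file for your project together with a lib folder. This means that whenever you use any JAR files in your project, they will be copied into the lib folder that sits next to your JAR file.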
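If you are curious how the built JAR refers to the libraries stored next to it, you can read its manifest. The sketch below uses the standard java.util.jar API; the class name JarManifestPeek and the default path dist/HelloWorld.jar are assumptions for this example, and whether a Class-Path entry is present depends on how the build tool generated the manifest.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Prints the Main-Class and Class-Path entries of a JAR file's manifest. */
class JarManifestPeek {
    /** Returns the value of a main manifest attribute, or null if it is absent. */
    static String mainAttribute(Manifest manifest, Attributes.Name name) {
        return manifest == null ? null : manifest.getMainAttributes().getValue(name);
    }

    public static void main(String[] args) throws IOException {
        // Assumed location relative to the project folder; pass another path as the first argument
        File file = new File(args.length > 0 ? args[0] : "dist/HelloWorld.jar");
        if (!file.isFile()) {
            System.out.println("No JAR found at " + file.getPath() + "; build the project first.");
            return;
        }
        try (JarFile jar = new JarFile(file)) {
            Manifest manifest = jar.getManifest();
            System.out.println("Main-Class: " + mainAttribute(manifest, Attributes.Name.MAIN_CLASS));
            System.out.println("Class-Path: " + mainAttribute(manifest, Attributes.Name.CLASS_PATH));
        }
    }
}
```

A Class-Path value that points into lib would explain why the project's JAR and the lib folder need to be kept together when you copy the dist folder elsewhere.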
In this chapter, we first looked at the difference between supervised and unsupervised learning, before moving on to the difference between classification and regression. We then saw how to install the JDK, what the difference between the JDK and the JRE is, and how to install the NetBeans IDE. We also created our own JAR file after importing another JAR file into our project. In the next chapter, we'll learn how to search and explore different search algorithms.