Java for Data Science

Examine the techniques and Java tools supporting the growing field of data science

Richard M. Reese, Jennifer L. Reese

Book Details

ISBN 13: 9781785280115
Paperback: 386 pages

Book Description

Data science is concerned with extracting knowledge and insights from a wide variety of data sources to analyse patterns or predict future behaviour. It draws on many disciplines, including statistics, computer science, mathematics, machine learning, and data mining. In this book, we cover the important data science concepts and how Java supports them, along with the often statistically challenging techniques behind them, to give you an understanding of their purpose and application.

The book starts with an introduction to data science, followed by the basic data science tasks of data collection, data cleaning, data analysis, and data visualization. It then discusses statistical techniques and more advanced topics, including machine learning, neural networks, and deep learning. The next section examines the major categories of data analysis, covering text, visual, and audio data, followed by a discussion of resources that support parallel implementation.

The final chapter works through an in-depth data science problem and provides a comprehensive, Java-based solution. Due to the nature of the topic, simple examples of techniques are presented early, followed by a more detailed treatment later in the book. This permits a more natural introduction to the techniques and concepts presented.

Table of Contents

Chapter 1: Getting Started with Data Science
Problems solved using data science
Understanding the data science problem-solving approach
Acquiring data for an application
The importance and process of cleaning data
Visualizing data to enhance understanding
The use of statistical methods in data science
Machine learning applied to data science
Using neural networks in data science
Deep learning approaches
Performing text analysis
Visual and audio analysis
Improving application performance using parallel techniques
Assembling the pieces
Summary
Chapter 2: Data Acquisition
Understanding the data formats used in data science applications
Data acquisition techniques
Summary
Chapter 3: Data Cleaning
Handling data formats
The nitty gritty of cleaning text
Cleaning images
Summary
Chapter 4: Data Visualization
Understanding plots and graphs
Creating index charts
Creating bar charts
Creating stacked graphs
Creating pie charts
Creating scatter charts
Creating histograms
Creating donut charts
Creating bubble charts
Summary
Chapter 5: Statistical Data Analysis Techniques
Working with mean, mode, and median
Standard deviation
Sample size determination
Hypothesis testing
Regression analysis
Summary
Chapter 6: Machine Learning
Supervised learning techniques
Unsupervised machine learning
Reinforcement learning
Summary
Chapter 7: Neural Networks
Training a neural network
Understanding static neural networks
Understanding dynamic neural networks
Additional network architectures and algorithms
Summary
Chapter 8: Deep Learning
Deeplearning4j architecture
Deep learning and regression analysis
Restricted Boltzmann Machines
Deep autoencoders
Convolutional networks
Recurrent Neural Networks
Summary
Chapter 9: Text Analysis
Implementing named entity recognition
Classifying text
Understanding tagging and POS
Extracting relationships from sentences
Sentiment analysis
Summary
Chapter 10: Visual and Audio Analysis
Text-to-speech
Understanding speech recognition
Extracting text from an image
Identifying faces
Classifying visual data
Summary
Chapter 11: Mathematical and Parallel Techniques for Data Analysis
Implementing basic matrix operations
Using map-reduce
Various mathematical libraries
Using OpenCL
Using Aparapi
Using Java 8 streams
Summary
Chapter 12: Bringing It All Together
Defining the purpose and scope of our application
Understanding the application's architecture
Data acquisition using Twitter
Understanding the TweetHandler class
Other optional enhancements
Summary

What You Will Learn

  • Understand the nature and key concepts used in the field of data science
  • Grasp how data is collected, cleaned, and processed
  • Become comfortable with key data analysis techniques
  • See specialized analysis techniques centered on machine learning
  • Master the effective visualization of your data
  • Work with the Java APIs and techniques used to perform data analysis
