You're reading from Statistics for Data Science

Product type Book

Published in Nov 2017

Publisher Packt

ISBN-13 9781788290678

Pages 286 pages

Edition 1st Edition

Chapter 9. Databases and Neural Networks

In this chapter, we will define the Artificial Neural Network (ANN) and draw on a data developer's knowledge of data, databases, and data models to help him or her understand the purpose and use of neural networks, and why neural networks are so significant to data science and statistics.

We have organized the information in this chapter into the following key areas:

Definition of a neural network
Relating a neural network model to a database model
Looking at R-based neural networks
Use cases

Ask any data scientist

Today, if you ask any data scientist (or even a few) about statistical methods, you will most likely discover that two methods stand out as the best known for predictive modeling in data science and the statistics industry. We introduced these two methods in Chapter 6, Database Progression to Database Regression.

These two methods are as follows:

Linear regression
Logistic regression

The linear regression method is probably the classic, most common starting point for problems where the goal is to predict a numerical quantity. The Linear Regression (or LR) model is based on a linear combination of input features.
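To make "a linear combination of input features" concrete, here is a minimal sketch in Python. The function name linear_predict and the weights, intercept, and feature values are hypothetical and chosen only for illustration; in practice the weights would be estimated from data rather than written by hand.

```python
def linear_predict(features, weights, intercept):
    """Return intercept + sum of weight_i * feature_i (a linear combination)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, features))


# Hypothetical model with two input features; the numbers are illustrative only.
weights = [0.5, -1.0]
intercept = 1.0

# 1.0 + (0.5 * 2.0) + (-1.0 * 3.0)
prediction = linear_predict([2.0, 3.0], weights, intercept)
print(prediction)
```

Each feature contributes to the prediction in proportion to its weight, and the output is an unbounded numerical quantity.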

The logistic regression method applies a nonlinear transformation to this linear feature combination in order to restrict the range of the output to the interval [0, 1]. In doing so, it predicts the probability that the output belongs to one of two classes. Thus, it is a very well-known technique for...
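The transformation described here can be sketched in a few lines of Python. This sketch assumes the nonlinear transformation is the standard logistic (sigmoid) function, which is the usual choice for logistic regression; the names logistic, predict_probability, and predict_class, the 0.5 threshold, and all numeric values are hypothetical and for illustration only.

```python
import math


def logistic(z):
    """Map any real number z into the interval [0, 1]."""
    # Branch on the sign of z so math.exp never receives a large positive value.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(features, weights, intercept):
    """Apply the logistic function to a linear combination of the features."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


def predict_class(features, weights, intercept, threshold=0.5):
    """Turn the probability into one of two classes (0 or 1)."""
    p = predict_probability(features, weights, intercept)
    return 1 if p >= threshold else 0


# Hypothetical weights and inputs; a fitted model would estimate these from data.
p = predict_probability([2.0, 3.0], [0.5, -1.0], 1.0)
print(p, predict_class([2.0, 3.0], [0.5, -1.0], 1.0))
```

Whatever value the linear combination takes, the output stays between 0 and 1, which is what allows it to be read as a class probability.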

Summary

In this chapter, we defined neural networks and, from a data developer's knowledge of databases and data models, grew to understand the purpose and use of neural networks and why neural networks are so important to data science. We also looked at an R-based ANN and listed some popular use case examples.

In the next chapter, we will introduce the idea of using statistical boosting to better understand data in a database.