You're reading from Building Statistical Models in Python

Product type: Book

Published in: Aug 2023

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781804614280

Edition: 1st Edition

Languages: Python

Concepts: Statistics

Authors (3):

Huy Hoang Nguyen

Paul N Adams

Stuart J Miller

Discriminant Analysis

In the previous chapter, we discussed discrete regression models, including classification using logistic regression. In this chapter, we begin with an overview of probability, covering conditional probability and independent events. We then discuss how these two concepts form the basis for Bayes’ Theorem, which underpins the approach known as Bayesian statistics. Following this topic, we dive into Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA), two powerful classifiers that model data using the Bayesian approach to probability modeling.

In this chapter, we’re going to cover the following main topics:

Bayes’ Theorem
LDA
QDA

Bayes’ Theorem

In this section, we will discuss Bayes’ Theorem, which is used in the classification models described later in this chapter. We will start the section by discussing the basics of probability. Then, we will take a look at dependent events and discuss how Bayes’ Theorem relates to them.

Probability

Probability is a measure of how likely it is that an event or outcome occurs. Generally, we can group events into two types: independent events and dependent events. An independent event is one whose probability is not affected by the occurrence of other events, while a dependent event is one whose probability is affected by the occurrence of other events.
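As a minimal sketch of this distinction, the snippet below enumerates the 36 equally likely outcomes of rolling two fair dice and checks whether P(A and B) equals P(A) × P(B), which holds exactly when two events are independent. The dice setup and all function names are assumptions made for illustration and are not taken from the book.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first, second) outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate over one outcome."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))


def first_is_six(o):
    return o[0] == 6


def second_is_six(o):
    return o[1] == 6


def sum_at_least_11(o):
    return o[0] + o[1] >= 11


def both(e1, e2):
    """Event that e1 and e2 occur together."""
    return lambda o: e1(o) and e2(o)


p_a = prob(first_is_six)
p_b = prob(second_is_six)
p_c = prob(sum_at_least_11)
p_ab = prob(both(first_is_six, second_is_six))
p_ac = prob(both(first_is_six, sum_at_least_11))

# Independent: the second die is unaffected by the first.
independent_ab = p_ab == p_a * p_b

# Dependent: a six on the first die makes a large sum more likely.
independent_ac = p_ac == p_a * p_c

# Conditional probability P(C | A) = P(A and C) / P(A).
p_c_given_a = p_ac / p_a

print(independent_ab, independent_ac, p_c, p_c_given_a)
```

Here the unconditional probability of a sum of at least 11 is 1/12, but it rises to 1/3 once we know the first die shows a six, which is what it means for the two events to be dependent.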

Let’s think about some examples of these events. For the first example, think about a fair coin toss. A coin toss can result in one of two states: heads and tails. If the coin is fair, there...

Linear Discriminant Analysis

In the previous chapter, we discussed logistic regression, a classification model that directly models the probability of class membership as a function of a linear combination of the input variables. One alternative to this approach is LDA. LDA instead models the distribution of the input variables within each class and uses Bayes’ Theorem, which we discussed previously, to construct the decision boundaries between classes. Where we have k classes, Bayes’ Theorem gives the probability of class membership as P(Y = k|X = x), where Y is the discrete class label and X is the input variable. This is the posterior probability that an observation x of X belongs to the kth class.

Before proceeding, we must first make note that LDA makes three pertinent assumptions:

Each input variable is normally distributed.
Across all target classes, there is equal covariance among the predictors...
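The assumptions listed above can be illustrated with a minimal one-dimensional sketch: each class is given a normal density with its own mean but a shared (pooled) variance, and Bayes’ Theorem turns the class priors and densities into the posterior P(Y = k|X = x). The toy data, the class labels, and the helper names `posterior` and `predict` are hypothetical, and this is a simplified illustration rather than the book's implementation or a library's algorithm.

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical one-dimensional training data, grouped by class label.
samples = {
    "a": [1.0, 2.0, 3.0],
    "b": [5.0, 6.0, 7.0],
}

n_total = sum(len(xs) for xs in samples.values())

# Class priors estimated from class frequencies.
priors = {k: len(xs) / n_total for k, xs in samples.items()}

# Class-specific means.
means = {k: mean(xs) for k, xs in samples.items()}

# Pooled variance shared by all classes (the equal-covariance assumption):
# within-class squared deviations divided by (n - number of classes).
ss_within = sum((x - means[k]) ** 2 for k, xs in samples.items() for x in xs)
pooled_var = ss_within / (n_total - len(samples))

# One normal density per class, all with the same standard deviation.
densities = {k: NormalDist(mu=means[k], sigma=sqrt(pooled_var)) for k in samples}


def posterior(x):
    """Posterior P(Y = k | X = x) for every class k, via Bayes' Theorem."""
    weighted = {k: priors[k] * densities[k].pdf(x) for k in samples}
    total = sum(weighted.values())
    return {k: w / total for k, w in weighted.items()}


def predict(x):
    """Assign x to the class with the largest posterior probability."""
    post = posterior(x)
    return max(post, key=post.get)


print(predict(2.5), predict(5.5), posterior(4.0))
```

Because the two classes share a variance and have equal priors in this toy data, the decision boundary sits at the midpoint between the class means (x = 4), where both posteriors equal 0.5.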

Quadratic Discriminant Analysis

In the last section, we discussed LDA, which assumes that the data within each class is drawn from a multivariate Gaussian distribution and that the covariance matrix is the same across classes. In this section, we consider another type of discriminant analysis, QDA, which relaxes the covariance assumption: the covariance matrix need not be identical across classes, and each class instead has its own covariance matrix. QDA still requires the observations within each class to follow a multivariate Gaussian distribution with a class-specific mean vector. We assume that an observation from the kth class satisfies the following:

$$X \sim \mathcal{N}(\mu_k, \Sigma_k)$$

We’ll thus consider a generative classifier, as follows:

$$p(X \mid y = k, \theta) = \mathcal{N}(X \mid \mu_k, \Sigma_k)$$

And then, its corresponding class posterior is this:

p(y = k | X, ...
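Since the posterior expression is cut off here, the sketch below relies only on the generic Bayes-rule form (posterior proportional to class prior times class density), which is an assumption about the general shape rather than the book's exact expression. It is a one-dimensional illustration in which each class gets its own mean and its own variance, the scalar analogue of a class-specific covariance matrix. The toy data, class labels, and helper names are hypothetical.

```python
from statistics import NormalDist

# Hypothetical one-dimensional data: both classes are centered at 0,
# but one is tightly concentrated and the other is widely spread.
samples = {
    "narrow": [-1.0, 0.0, 1.0],
    "wide": [-4.0, 0.0, 4.0],
}

n_total = sum(len(xs) for xs in samples.values())

# Class priors estimated from class frequencies.
priors = {k: len(xs) / n_total for k, xs in samples.items()}

# Class-specific normal densities: each class keeps its own mean AND
# its own standard deviation (no pooling, unlike LDA).
densities = {k: NormalDist.from_samples(xs) for k, xs in samples.items()}


def posterior(x):
    """Posterior over classes: prior times class density, normalized."""
    weighted = {k: priors[k] * densities[k].pdf(x) for k in samples}
    total = sum(weighted.values())
    return {k: w / total for k, w in weighted.items()}


def predict(x):
    """Assign x to the class with the largest posterior probability."""
    post = posterior(x)
    return max(post, key=post.get)


print(predict(0.0), predict(5.0), predict(-5.0))
```

In this toy data the two classes share the same mean, so a single linear threshold cannot separate them. With class-specific variances, points near zero are assigned to the narrow class and points far from zero on either side are assigned to the wide class, which is the kind of non-linear boundary QDA allows.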

Summary

In this chapter, we began with an overview of probability. We covered the differences between conditional and independent probability and how Bayes’ Theorem leverages these concepts to provide a unique approach to probability modeling. Next, we discussed LDA, its assumptions, and how the algorithm applies Bayesian statistics to perform both classification modeling and supervised dimension reduction. Finally, we covered QDA, an alternative to LDA when linear decision boundaries are not effective.

In the next chapter, we will introduce the fundamentals of time-series analysis, including an overview of the depths and limitations of this approach to answering statistical questions.

Authors (3)

Huy Hoang Nguyen

Huy Hoang Nguyen is a Mathematician and a Data Scientist with far-ranging experience, championing advanced mathematics and strategic leadership, and applied machine learning research. He holds a Master's in Data Science and a PhD in Mathematics. His previous work was related to Partial Differential Equations, Functional Analysis and their applications in Fluid Mechanics. He transitioned from academia to the healthcare industry and has performed different Data Science projects from traditional Machine Learning to Deep Learning.

Paul N Adams

Paul Adams is a Data Scientist with a background primarily in the healthcare industry. Paul applies statistics and machine learning in multiple areas of industry, focusing on projects in process engineering, process improvement, metrics and business rules development, anomaly detection, forecasting, clustering and classification. Paul holds a Master of Science in Data Science from Southern Methodist University.

Stuart J Miller

Stuart Miller is a Machine Learning Engineer with degrees in Data Science, Electrical Engineering, and Engineering Physics. Stuart has worked at several Fortune 500 companies, including Texas Instruments and StateFarm, where he built software that utilized statistical and machine learning techniques. Stuart is currently an engineer at Toyota Connected helping to build a more modern cockpit experience for drivers using machine learning.
