You're reading from Building Statistical Models in Python

Product type: Book
Published in: Aug 2023
Reading Level: Intermediate
Publisher: Packt
ISBN-13: 9781804614280
Edition: 1st Edition
Authors (3):
Huy Hoang Nguyen

Huy Hoang Nguyen is a mathematician and data scientist with far-ranging experience in advanced mathematics, strategic leadership, and applied machine learning research. He holds a Master's in Data Science and a PhD in Mathematics. His previous work was related to partial differential equations, functional analysis, and their applications in fluid mechanics. He transitioned from academia to the healthcare industry and has carried out data science projects ranging from traditional machine learning to deep learning.

Paul N Adams

Paul Adams is a data scientist with a background primarily in the healthcare industry. Paul applies statistics and machine learning in multiple areas of industry, focusing on projects in process engineering, process improvement, metrics and business rules development, anomaly detection, forecasting, clustering, and classification. Paul holds a Master of Science in Data Science from Southern Methodist University.

Stuart J Miller

Stuart Miller is a machine learning engineer with degrees in Data Science, Electrical Engineering, and Engineering Physics. Stuart has worked at several Fortune 500 companies, including Texas Instruments and State Farm, where he built software that used statistical and machine learning techniques. Stuart is currently an engineer at Toyota Connected, helping to build a more modern cockpit experience for drivers using machine learning.

What this book covers

Chapter 1, Sampling and Generalization, describes the concepts of sampling and generalization. The discussion of sampling covers several common methods for sampling data from a population and their implications for generalization. This chapter also discusses how to set up the software required for this book.

Chapter 2, Distributions of Data, provides a detailed introduction to types of data, common distributions used to describe data, and statistical measures. This chapter also covers common transformations used to change distributions.

Chapter 3, Hypothesis Testing, introduces the concept of statistical tests as a method for answering questions of interest. This chapter covers the steps to perform a test, the types of errors encountered in testing, and how to select power using the Z-test.

Chapter 4, Parametric Tests, further discusses statistical tests, providing detailed descriptions of common parametric statistical tests, the assumptions of parametric tests, and how to assess the validity of parametric tests. This chapter also introduces the concept of multiple tests and provides details on corrections for multiple tests.

Chapter 5, Non-parametric Tests, discusses how to perform statistical tests when the assumptions of parametric tests are violated, using a class of tests that do not rely on those assumptions, called non-parametric tests.

Chapter 6, Simple Linear Regression, introduces the concept of a statistical model with the simple linear regression model. This chapter begins by discussing the theoretical foundations of simple linear regression and then discusses how to interpret the results of the model and assess the validity of the model.

Chapter 7, Multiple Linear Regression, builds on the previous chapter by extending the simple linear regression model into additional dimensions. This chapter also discusses issues that occur when modeling with multiple explanatory variables, including multicollinearity, feature selection, and dimension reduction.

Chapter 8, Discrete Models, introduces the concept of classification and develops a model for classifying variables into discrete levels of a categorical response variable. This chapter starts by developing the model for binary classification and then extends the model to multivariate classification. Finally, the Poisson and negative binomial models are covered.

Chapter 9, Discriminant Analysis, discusses several additional models for classification, including linear discriminant analysis and quadratic discriminant analysis. This chapter also introduces Bayes’ Theorem.

Chapter 10, Introduction to Time Series, introduces time series data, discussing the time series concept of autocorrelation and the statistical measures for time series. This chapter also introduces the white noise model and stationarity.

Chapter 11, ARIMA Models, discusses models for univariate time series. This chapter starts by discussing models for stationary time series and then extends the discussion to non-stationary time series. Finally, this chapter provides a detailed discussion of model evaluation.

Chapter 12, Multivariate Time Series, builds on the previous two chapters by introducing the concept of a multivariate time series and extending ARIMA models to multiple explanatory variables. This chapter also discusses time series cross-correlation.

Chapter 13, Survival Analysis, introduces survival data, also called time-to-event data. This chapter discusses the concept of censoring and the impact of censoring survival data. Finally, the chapter discusses the survival function, hazard, and hazard ratio.

Chapter 14, Survival Models, building on the previous chapter, provides an overview of several models for survival data, including the Kaplan-Meier model, the Exponential model, and the Cox Proportional Hazards model.
