
You're reading from The Statistics and Machine Learning with R Workshop

Product type: Book
Published in: Oct 2023
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781803240305
Edition: 1st Edition
Author (1)
Liu Peng

Peng Liu is an Assistant Professor of Quantitative Finance (Practice) at Singapore Management University and an adjunct researcher at the National University of Singapore. He holds a Ph.D. in statistics from the National University of Singapore and has ten years of working experience as a data scientist across the banking, technology, and hospitality industries.

Statistical inference for categorical data

A categorical variable takes distinct categories or levels rather than numerical values. Categorical data is common in daily life; examples include gender (male or female, although a modern view may differ), type of property sale (new property or resale), and industry. The ability to make sound inferences about such variables is thus essential for drawing meaningful conclusions and making well-informed decisions in diverse contexts.

When a variable is categorical, we often cannot pass it to a machine learning (ML) model without additional preprocessing. Take the industry variable, for example. Instead of passing the categorical values (strings such as "finance" or "technology") to the model, a common approach is to one-hot encode the variable into multiple columns, one per industry, where each column holds a binary value of 0 or 1 indicating whether the observation belongs to that industry.
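
As a rough illustration of one-hot encoding, the following sketch uses base R's model.matrix() on a small, made-up data frame. The data frame df and its industry values are assumptions introduced only for this example; they do not come from the book's dataset, and other encoding approaches are equally valid.

```r
# Hypothetical toy data: the industry values are made up for illustration
df <- data.frame(
  industry = factor(c("finance", "technology", "retail", "finance"))
)

# The formula "~ industry - 1" drops the intercept, so every level of the
# factor gets its own 0/1 indicator column (one column per industry)
one_hot <- model.matrix(~ industry - 1, data = df)

# Column names are the variable name followed by the level name
print(colnames(one_hot))

# Each row contains exactly one 1, marking that observation's industry
print(one_hot)
```

Dropping the intercept keeps one column per level, which matches the description above. With an intercept in the formula, R would instead omit the first level as a reference category, which is the usual choice when the encoded columns feed a linear model.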

In this section, we will explore various statistical...
