Reader small image

You're reading from  R for Data Science Cookbook (n)

Product typeBook
Published inJul 2016
Reading LevelIntermediate
Publisher
ISBN-139781784390815
Edition1st Edition
Languages
Tools
Concepts
Right arrow
Author (1)
Yu-Wei, Chiu (David Chiu)
Yu-Wei, Chiu (David Chiu)
author image
Yu-Wei, Chiu (David Chiu)

Yu-Wei, Chiu (David Chiu) is the founder of LargitData (www.LargitData.com), a startup company that mainly focuses on providing big data and machine learning products. He has previously worked for Trend Micro as a software engineer, where he was responsible for building big data platforms for business intelligence and customer relationship management systems. In addition to being a start-up entrepreneur and data scientist, he specializes in using Spark and Hadoop to process big data and apply data mining techniques for data analysis. Yu-Wei is also a professional lecturer and has delivered lectures on big data and machine learning in R and Python, and given tech talks at a variety of conferences. In 2015, Yu-Wei wrote Machine Learning with R Cookbook, Packt Publishing. In 2013, Yu-Wei reviewed Bioinformatics with R Cookbook, Packt Publishing. For more information, please visit his personal website at www.ywchiu.com. **********************************Acknowledgement************************************** I have immense gratitude for my family and friends for supporting and encouraging me to complete this book. I would like to sincerely thank my mother, Ming-Yang Huang (Miranda Huang); my mentor, Man-Kwan Shan; the proofreader of this book, Brendan Fisher; Members of LargitData; Data Science Program (DSP); and other friends who have offered their support.
Read more about Yu-Wei, Chiu (David Chiu)

Right arrow

Detecting missing data


There are numerous causes behind missing data. For example, it could be the result of typos or data process flaws. However, if there is missing data in our analysis process, the results of the analysis may be misleading. Thus, it is important to detect missing values before proceeding with further analysis.

Getting ready

Refer to the Converting data types recipe and convert each attribute of imported data into the proper data type. Also, rename the columns of the employees and salaries datasets by following the steps from the Renaming the data variable recipe.

How to do it…

Perform the following steps to detect missing values:

  1. First, we set the to_date attribute with a date over 2100-01-01:

    > salaries[salaries$to_date > "2100-01-01",]
    
  2. We then change the data with a date over 2100-01-01 to a missing value:

    > salaries[salaries$to_date > "2100-01-01","to_date"] = NA
    
  3. Next, we can use the is.na function to find which rows contain missing values:

    > is.na(salaries...
lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
R for Data Science Cookbook (n)
Published in: Jul 2016Publisher: ISBN-13: 9781784390815

Author (1)

author image
Yu-Wei, Chiu (David Chiu)

Yu-Wei, Chiu (David Chiu) is the founder of LargitData (www.LargitData.com), a startup company that mainly focuses on providing big data and machine learning products. He has previously worked for Trend Micro as a software engineer, where he was responsible for building big data platforms for business intelligence and customer relationship management systems. In addition to being a start-up entrepreneur and data scientist, he specializes in using Spark and Hadoop to process big data and apply data mining techniques for data analysis. Yu-Wei is also a professional lecturer and has delivered lectures on big data and machine learning in R and Python, and given tech talks at a variety of conferences. In 2015, Yu-Wei wrote Machine Learning with R Cookbook, Packt Publishing. In 2013, Yu-Wei reviewed Bioinformatics with R Cookbook, Packt Publishing. For more information, please visit his personal website at www.ywchiu.com. **********************************Acknowledgement************************************** I have immense gratitude for my family and friends for supporting and encouraging me to complete this book. I would like to sincerely thank my mother, Ming-Yang Huang (Miranda Huang); my mentor, Man-Kwan Shan; the proofreader of this book, Brendan Fisher; Members of LargitData; Data Science Program (DSP); and other friends who have offered their support.
Read more about Yu-Wei, Chiu (David Chiu)