Data Wrangling with SQL

Product type Book
Published in Jul 2023
Publisher Packt
ISBN-13 9781837630028
Pages 350 pages
Edition 1st Edition
Authors (2): Raghav Kandarpa, Shivangi Saxena

Table of Contents

  • Preface
  • Part 1: Data Wrangling Introduction
  • Chapter 1: Database Introduction
  • Chapter 2: Data Profiling and Preparation before Data Wrangling
  • Part 2: Data Wrangling Techniques Using SQL
  • Chapter 3: Data Wrangling on String Data Types
  • Chapter 4: Data Wrangling on the DATE Data Type
  • Chapter 5: Handling NULL Values
  • Chapter 6: Pivoting Data Using SQL
  • Part 3: SQL Subqueries, Aggregate And Window Functions
  • Chapter 7: Subqueries and CTEs
  • Chapter 8: Aggregate Functions
  • Chapter 9: SQL Window Functions
  • Part 4: Optimizing Query Performance
  • Chapter 10: Optimizing Query Performance
  • Part 5: Data Science And Wrangling
  • Chapter 11: Descriptive Statistics with SQL
  • Chapter 12: Time Series with SQL
  • Chapter 13: Outlier Detection
  • Index
  • Other Books You May Enjoy

What is data wrangling?

Data wrangling is the process of cleaning, transforming, and organizing dirty data so that it can be used to generate insights that help stakeholders make well-informed decisions. In short, it removes errors from data and prepares it for analysis. As the volume of data worldwide keeps growing rapidly, storing and organizing large datasets properly becomes increasingly important. Real-world data is often messy and unstructured, so it needs to be cleaned before it can be used for any analysis.

Figure 2.1 – Data wrangling

Let’s look at a few examples of data wrangling:

  • Cleaning dirty data (missing values, bad characters, mismatched data types, and inconsistent formatting) so that it becomes consistent and clean
  • Combining datasets from multiple sources and making sure the combined data is consistent
  • Deleting data that is no longer required
  • ...
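
The first three examples above can be sketched in a few SQL statements. The snippet below is only an illustration, not the book's own code: it runs the SQL through Python's built-in sqlite3 module, and the table names (customers_raw, customers, orders), the columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical tables and sample rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers_raw (id INTEGER, name TEXT, signup_date TEXT);
    INSERT INTO customers_raw VALUES
        (1, '  alice ', '2023-01-05'),
        (2, NULL,       '2023-02-10'),
        (3, 'BOB#',     NULL);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 20.0), (1, 5.5), (3, 12.0), (9, 99.0);
""")

# 1. Cleaning: strip a bad character, trim whitespace, normalize case,
#    and replace missing names with a placeholder.
conn.execute("""
    CREATE TABLE customers AS
    SELECT id,
           COALESCE(UPPER(TRIM(REPLACE(name, '#', ''))), 'UNKNOWN') AS name,
           signup_date
    FROM customers_raw
""")

# 2. Combining: join the cleaned customers with a second dataset (orders).
rows = conn.execute("""
    SELECT c.id, c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# 3. Deleting: remove orders that no longer match any known customer.
conn.execute(
    "DELETE FROM orders WHERE customer_id NOT IN (SELECT id FROM customers)"
)
remaining_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(rows)
print(remaining_orders)
```

Each step maps to one bullet: the COALESCE/TRIM/REPLACE expression handles missing values and bad characters, the JOIN combines two sources, and the DELETE drops rows that are no longer needed. Function names and behavior can differ between database engines, so treat the exact syntax as SQLite-specific.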