Python Data Analysis: An end-to-end guide covering data processing, data manipulation and data visualization, Fourth Edition

By Avinash Navlani and Cornellius Yudha Wijaya
Early Access, publishing in Jul 2026 (4th Edition)
eBook: £24.29 (reduced from £26.99)
Paperback: £33.99
Subscription: free trial, then renews at £16.99 per month

What do you get with eBook?

  • Instant access to your digital eBook purchase
  • Download this book in EPUB format
  • Access this title in our online reader with advanced features
  • DRM-free: read whenever, wherever, and however you want

Python Data Analysis

Join our book community on Discord

https://packt.link/EarlyAccessCommunity

In recent years, Python has become one of the most widely used, industry-standard programming languages, providing a comprehensive set of tools for data science tasks. The Python ecosystem includes numerous libraries, such as NumPy, Pandas, SciPy, statsmodels, Scikit-Learn, Matplotlib, Seaborn, Bokeh, Plotly, NLTK, SpaCy, OpenCV, and Dask. For data scientists, data engineers, business analysts, ML engineers, NLP engineers, and data analysts, these libraries offer a comprehensive package for data analysis, visualization, and forecasting. Python is also easy to learn, open source, object-oriented, dynamically typed, and high-level. It supports rapid development, has a vibrant community, and can handle intricate data science, statistical, and mathematical applications. Together, these features make it an excellent option for data analysis.

Data analytics is the future...

Navigating the landscape of data analysis

We are living in a digital world where our daily activities, government operations, and business operations generate huge amounts of data. The world's data is growing at 20% each year, which means that it doubles roughly every four years. We are sitting on a huge mountain of data, and day-to-day activities such as social media, entertainment, online learning, online business, and government services keep adding to it. Most organizations worldwide want to generate value from this data to become healthier, more productive, more efficient, and more sustainable. Mathematicians, statisticians, and computer engineers are developing statistical, data mining, machine learning, natural language, and image processing applications to generate insights and speed up decision-making.
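
The doubling claim can be checked with a little compound-growth arithmetic. The sketch below assumes a constant 20% annual growth rate, as quoted above; the function name `doubling_time` is made up for this illustration.

```python
import math


def doubling_time(annual_growth_rate):
    """Years needed for a quantity to double at a constant compound growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


years = doubling_time(0.20)  # 20% per year, the figure quoted in the text
print(f"Doubling time at 20% annual growth: {years:.1f} years")
print(f"Growth factor after 4 years: {1.20 ** 4:.2f}x")
```

Under that assumption the doubling time comes out at about 3.8 years, so "every four years" is a reasonable rounding.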

Data analysis is an interdisciplinary field that comprises statistical and computing fundamentals for powerful decision-making for individual...

Data analysis process methodologies

In the last decade, the data analysis field has grown enormously. Considerable effort is going into establishing standard methodologies for data analysis and for developing data analysis-based applications. In this section, we will discuss various process methodologies such as KDD, SEMMA, CRISP-DM, and the standard process. These methodologies share a few overlapping or similar steps but have different objectives.

Knowledge discovery from data (KDD)

KDD stands for knowledge discovery from data, and the term is often used interchangeably with data mining. Data mining is the practice of discovering and using patterns in data. The primary objective of the KDD process is to find hidden patterns in the given data sources. The KDD process has seven main phases:

  • Data cleaning: In this first stage, we handle the noisy data, missing values, duplicates, and outliers in the given dataset.
  • Data integration: Then, data migration...
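
The data cleaning phase described above can be sketched in a few lines. This is only an illustration, not the book's code: it uses plain Python so that it runs without extra packages, it assumes that missing values are stored as `None`, that the median is an acceptable fill value, and that the common 1.5 × IQR rule defines an outlier. The function name `clean` and the sample records are hypothetical.

```python
from statistics import median, quantiles


def clean(records, key):
    """Deduplicate rows, fill missing values with the median, drop IQR outliers."""
    # 1. Remove exact duplicate rows while preserving their order.
    seen = set()
    unique = []
    for row in records:
        marker = tuple(sorted(row.items()))
        if marker not in seen:
            seen.add(marker)
            unique.append(dict(row))

    # 2. Fill missing values (None) with the median of the observed values.
    observed = [row[key] for row in unique if row[key] is not None]
    fill_value = median(observed)
    for row in unique:
        if row[key] is None:
            row[key] = fill_value

    # 3. Drop rows whose value lies more than 1.5 * IQR outside the quartiles.
    q1, _, q3 = quantiles([row[key] for row in unique], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [row for row in unique if low <= row[key] <= high]


records = [
    {"id": 1, "age": 25},
    {"id": 2, "age": None},   # missing value
    {"id": 3, "age": 27},
    {"id": 3, "age": 27},     # duplicate row
    {"id": 4, "age": 26},
    {"id": 5, "age": 24},
    {"id": 6, "age": 250},    # implausible outlier
]
cleaned = clean(records, "age")
print(cleaned)
```

In pandas, `DataFrame.drop_duplicates` and `DataFrame.fillna` cover the first two steps, and a boolean mask covers the third.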

Compare Data Analysis, Data Science and Data Engineering

Data analysis is the process of exploring and investigating data to discover hidden facts and patterns that can inform business and policy-making decisions. It is a sub-field of data science that explores data to find insights. Data analysis tools and techniques are widely used in business, government, and social domains by data analysts, business analysts, statisticians, social scientists, data scientists, researchers, and consultants. The main purpose of data analysis is to optimize business operations, profits, and effectiveness. The process involves querying and collecting the relevant data from various sources, exploring and understanding the data, visualizing the findings, preparing dashboards and reports, and presenting them as a story to senior management for decision-making.

In contrast, data science is a multidisciplinary field that combines domain expertise, computer science, and statistics. Data...

Installing Python 3

We can download the Python 3 installer from the official Python website (https://www.python.org/downloads/) for Windows, Linux, Mac, and other 32-bit or 64-bit systems. Double-clicking the installer installs Python on your computer. The Python installation also includes the IDLE integrated development environment (IDE), which is useful for development. In the upcoming sections of this chapter, we will look in detail at installing Python on each operating system.
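
After installation, a quick way to confirm that Python 3 is the interpreter actually being run is to inspect `sys.version_info`. The following is a minimal sketch. The helper name `meets_minimum` and the default minimum of 3.0 are assumptions made for this example, not requirements stated by the book.

```python
import sys


def meets_minimum(required=(3, 0), current=None):
    """Return True if the interpreter version is at least `required`."""
    if current is None:
        current = sys.version_info[:2]  # (major, minor) of the running interpreter
    return tuple(current) >= tuple(required)


if __name__ == "__main__":
    print("Running Python", sys.version.split()[0])
    print("Python 3 available:", meets_minimum())
```

Saving this as a script and running it from the terminal or command prompt prints the interpreter version that is first on the system path.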

Python installation and setup on Windows

This book is based on the latest version of Python 3. All the coding examples in this book are written in Python 3, so we have to install Python 3 before starting the hands-on coding. Python is an open-source, easy-to-learn, object-oriented, and distributed language. It is also licensed for commercial use. There are many implementations of Python, including commercial implementations...

Software tools used in this book

In this section, we will look at the IDEs and software used in this book. A Python program can be written in any editor and executed on any system that has Python installed. For example, a program can be written in Notepad, TextEdit, or any other editor and run from the terminal or command prompt. We can also use various IDEs such as Spyder, VS Code, Jupyter Notebook, PyCharm, and Atom. In this book, we will primarily use Anaconda for all the data analysis tasks. Anaconda is a freely available, open-source, enterprise-ready distribution that provides a solid foundation for Python-based data analysis. It also bundles several Python libraries for data analysis, including NumPy, SciPy, Pandas, Scikit-learn, and so on. Anaconda can be downloaded and installed as follows:

  1. Find the installer on the official Anaconda website: https://www.anaconda.com/download
  2. Download the installer as per your operating system.
  3. Execute...
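
Once an environment is set up, it can be useful to confirm that the libraries named above are importable. The sketch below uses only the standard library and does not import the packages themselves. The list of import names is an assumption based on the libraries mentioned in this section (scikit-learn is imported as `sklearn`), and `check_libraries` is a hypothetical helper.

```python
from importlib.util import find_spec

# Import names (not package-index names) of the libraries mentioned in the text.
LIBRARIES = ["numpy", "scipy", "pandas", "sklearn"]


def check_libraries(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_libraries(LIBRARIES).items():
        print(f"{name:10s} {'installed' if available else 'missing'}")
```

Any library reported as missing can then be installed with the package manager of your environment.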

Summary

In this introductory chapter, we covered data analysis frameworks, or process models, such as KDD, SEMMA, CRISP-DM, and the standard process for data analysis. We then covered the job responsibilities and skill sets of data scientists, data engineers, ML engineers, data analysts, and NLP engineers. Next, we installed the packages and tools that we will use throughout this book: NumPy, SciPy, Pandas, Matplotlib, IPython, Jupyter Notebook, Anaconda, Jupyter Lab, PyCharm, and VS Code. Installing Anaconda or Jupyter Lab, which comes with NumPy, Pandas, SciPy, and Scikit-learn built in, is a better option than installing all those modules individually. We then implemented a vector addition application and discovered that NumPy performs better than the other libraries. We looked through the available documentation and learning resources on the internet. We also talked about PyCharm, VS Code, Databricks, Jupyter Notebook, Jupyter Lab, and their features.
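
The chapter's vector addition code is not part of this preview, so the following is only a sketch of what such a comparison might look like. It times a pure-Python list comprehension against NumPy's element-wise addition. The function names, the vector size, and the repeat count are all assumptions, and NumPy is treated as optional so that the script still runs without it.

```python
import timeit

try:
    import numpy as np  # optional; only the vectorized version needs it
except ImportError:
    np = None


def add_pure_python(n):
    """Add two n-element vectors with a list comprehension."""
    a = list(range(n))
    b = list(range(n))
    return [x + y for x, y in zip(a, b)]


def add_numpy(n):
    """Add two n-element vectors with NumPy's element-wise operator."""
    a = np.arange(n)
    b = np.arange(n)
    return a + b


if __name__ == "__main__":
    n = 100_000  # vector length (assumed for illustration)
    print("pure Python:", timeit.timeit(lambda: add_pure_python(n), number=10))
    if np is not None:
        print("NumPy:      ", timeit.timeit(lambda: add_numpy(n), number=10))
```

Timings depend on the machine and on the vector size, so the printed numbers are best read as a rough comparison rather than a benchmark.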

In the next chapter, Chapter 2, NumPy...

Key benefits

  • Prepare and clean your data to use it for exploratory analysis, data manipulation, and data wrangling
  • Discover supervised, unsupervised, probabilistic, and Bayesian machine learning methods
  • Get to grips with graph processing and sentiment analysis

Description

Data analysis enables you to generate value from small and big data by discovering new patterns, and Python is one of the most popular tools for analyzing a wide variety of data. With this book, you'll get up and running using Python for data analysis by exploring the different phases used in data analysis and learning how to use modern libraries from the Python ecosystem to create efficient data pipelines. Starting with the essential statistical and data analysis fundamentals using Python, you'll perform complex data analysis and modeling, data manipulation, data cleaning, and data visualization using easy-to-follow examples. You'll then understand how to conduct time series analysis and signal processing using ARMA models. As you advance, you'll get to grips with smart processing and data analytics using machine learning algorithms such as regression, classification, Principal Component Analysis (PCA), and clustering. You'll also work on real-world examples to analyze textual and image data using natural language processing (NLP) and image analytics techniques, respectively. Finally, the book will demonstrate parallel computing using Dask. By the end of this data analysis book, you'll be equipped with the skills you need to prepare data for analysis and create meaningful data visualizations for forecasting values from data.

Who is this book for?

This book is for data analysts, business analysts, statisticians, and data scientists looking to learn how to use Python for data analysis. Students and academic faculty will also find this book useful for learning and teaching Python data analysis using a hands-on approach. A basic understanding of math and working knowledge of the Python programming language will help you get started with this book.

What you will learn

  • Prepare, clean, and transform your data for exploratory analysis, manipulation, and wrangling.
  • Explore concepts in signal processing, time series analysis, and predictive analytics.
  • Understand and apply key machine learning techniques, including supervised, unsupervised, probabilistic, and Bayesian methods.
  • Work with graph data and perform sentiment analysis.
  • Handle large-scale image and text analytics efficiently.
  • Accelerate data manipulation using Dask, Modin, and Ray.
  • Perform scalable big data analytics with PySpark.

Product Details

Publication date: Jul 24, 2026
Edition: 4th
Language: English
ISBN-13: 9781806022861

Packt Subscriptions

See our plans and pricing

£16.99 billed monthly

  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

£169.99 billed annually

  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just £5 each
  • Exclusive print discounts

£234.99 billed in 18 months

  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just £5 each
  • Exclusive print discounts

Table of Contents

5 Chapters

  • Welcome to Packt Early Access
  • Chapter 1: Getting Started with Python Libraries
  • Chapter 2: NumPy and pandas
  • Chapter 3: Statistics for Data Insights
  • Chapter 4: Linear Algebra

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download Adobe Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect our rights as publishers and the rights of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment can be made using Credit Card, Debit Card, or PayPal).

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book, go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not in Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are priced lower than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.
