Data Analysis and Exploration with Pandas [Video]
Pandas Foundations (Free Chapter)
- The Course Overview
- Dissecting the Anatomy of a DataFrame
- Accessing the Main DataFrame Components
- Understanding Data Types
- Selecting a Single Column of Data as a Series
- Calling Series Methods
- Working with Operators on a Series
- Chaining Series Methods Together
- Making the Index Meaningful
- Renaming Row and Column Names
- Creating and Deleting Columns
Essential DataFrame Operations
- Selecting Multiple DataFrame Columns
- Selecting Columns with Methods
- Ordering Column Names Sensibly
- Operating on the Entire DataFrame
- Chaining DataFrame Methods Together
- Working with Operators on a DataFrame
- Comparing Missing Values
- Transposing the Direction of a DataFrame
- Determining College Campus Diversity
Beginning Data Analysis
Selecting Subsets of Data
Boolean Indexing
- Calculating Boolean Statistics
- Constructing Multiple Boolean Conditions
- Filtering with Boolean Indexing
- Replicating Boolean Indexing with Index Selection
- Selecting with Unique and Sorted Indexes
- Gaining Perspective on Stock Prices
- Translating SQL WHERE Clauses
- Determining the Normality of Stock Market Returns
- Improving Readability of Boolean Indexing with the Query Method
- Preserving Series with the WHERE Method
- Masking DataFrame Rows
- Selecting with Booleans, Integer Location, and Labels
Index Alignment
Grouping for Aggregation, Filtration, and Transformation
- Defining an Aggregation
- Grouping and Aggregating with Multiple Columns and Functions
- Removing the MultiIndex After Grouping
- Customizing an Aggregation Function
- Customizing Aggregating Functions with *args and **kwargs
- Examining the groupby Object
- Filtering for States with a Minority Majority
- Transforming through a Weight Loss Bet
- Calculating Weighted Mean SAT Scores Per State with Apply
- Grouping By Continuous Variables
- Counting the Total Number of Flights Between Cities
- Finding the Longest Streak of On-Time Flights
Restructuring Data into a Tidy Form
- Tidying Variable Values as Column Names with Stack
- Tidying Variable Values as Column Names with Melt
- Stacking Multiple Groups of Variables Simultaneously
- Inverting Stacked Data
- Unstacking After a groupby Aggregation
- Replicating pivot_table with a groupby Aggregation
- Renaming Axis Levels for Easy Reshaping
- Tidying When Multiple Variables are Stored as Column Names
- Tidying When Multiple Variables are Stored as Column Values
- Tidying When Two or More Values are Stored in the Same Cell
- Tidying When Variables are Stored in Column Names and Values
- Tidying When Multiple Observational Units are Stored in the Same Table
Combining Pandas Objects
Are you looking for a real boost in your productivity? Are you searching for interesting and fun techniques to solve your data problems? If so, this course is a good choice for you. It provides idiomatic solutions for both fundamental and advanced data manipulation tasks with pandas.
Some solutions focus on achieving a deeper understanding of basic principles, or on comparing and contrasting two similar operations. Others delve into a particular dataset and let you uncover new and unexpected insights along the way.
The pandas library is massive, and it's common for frequent users to be unaware of many of its more impressive features. The official pandas documentation, while thorough, does not contain many useful examples of how to piece together multiple commands as one would do during an actual analysis. This course guides you, as if you were looking over the shoulder of an expert, through practical situations that you are highly likely to encounter. Many advanced solutions combine several different features across the pandas library to generate results.
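As a rough illustration of what piecing together multiple commands can look like, here is a minimal sketch. The `colleges` DataFrame, its column names, and the enrollment threshold are hypothetical and are not taken from the course's datasets; only the pandas calls themselves (`dropna`, `query`, `groupby`, `mean`, `sort_values`) are real library methods.

```python
import pandas as pd

# Hypothetical toy data, invented for this sketch only.
colleges = pd.DataFrame(
    {
        "state": ["CA", "CA", "NY", "NY", "TX"],
        "enrollment": [1200, 800, 1500, None, 600],
        "sat_avg": [1100, 1250, 1300, 1180, 1050],
    }
)

# Chain several steps into one expression:
# 1. drop rows with a missing enrollment value,
# 2. keep schools with at least 700 students (arbitrary threshold),
# 3. compute the mean SAT score per state,
# 4. sort the states from highest to lowest mean.
result = (
    colleges
    .dropna(subset=["enrollment"])
    .query("enrollment >= 700")
    .groupby("state")["sat_avg"]
    .mean()
    .sort_values(ascending=False)
)

print(result)
```

Wrapping the chain in parentheses lets each method sit on its own line, which makes it easy to comment out a single step while exploring the data.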
The code bundle for the video course is available at https://github.com/PacktPublishing/Data-Analysis-and-Exploration-with-Pandas
Style and Approach
This course uses illustrative examples and explains each line of code in detail. All code and dataset explanations are provided in Jupyter Notebooks, an excellent interface for exploring data. The result is an accessible guide that takes a problem/solution approach to real-world datasets.
- Publication date: May 2018
- Publisher: Packt
- Duration: 5 hours 12 minutes
- ISBN: 9781789343205