Reader small image

You're reading from  Building ETL Pipelines with Python

Product typeBook
Published inSep 2023
PublisherPackt
ISBN-139781804615256
Edition1st Edition
Right arrow
Authors (2):
Brij Kishore Pandey
Brij Kishore Pandey
author image
Brij Kishore Pandey

Brij Kishore Pandey stands as a testament to dedication, innovation, and mastery in the vast domains of software engineering, data engineering, machine learning, and architectural design. His illustrious career, spanning over 14 years, has seen him wear multiple hats, transitioning seamlessly between roles and consistently pushing the boundaries of technological advancement. He has a degree in electrical and electronics engineering. His work history includes the likes of JP Morgan Chase, American Express, 3M Company, Alaska Airlines, and Cigna Healthcare. He is currently working as a principal software engineer at Automatic Data Processing Inc. (ADP). Originally from India, he resides in Parsippany, New Jersey, with his wife and daughter.
Read more about Brij Kishore Pandey

Emily Ro Schoof
Emily Ro Schoof
author image
Emily Ro Schoof

Emily Ro Schoof is a dedicated data specialist with a global perspective, showcasing her expertise as a data scientist and data engineer on both national and international platforms. Drawing from a background rooted in healthcare and experimental design, she brings a unique perspective of expertise to her data analytic roles. Emily's multifaceted career ranges from working with UNICEF to design automated forecasting algorithms to identify conflict anomalies using near real-time media monitoring to serving as a subject matter expert for General Assembly's Data Engineering course content and design. Her mission is to empower individuals to leverage data for positive impact. Emily holds the strong belief that providing easy access to resources that merge theory and real-world applications is the essential first step in this process.
Read more about Emily Ro Schoof

View More author details
Right arrow

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter (now, X) handles. Here is an example: “For example, let’s write a simple data pipeline that imports three CSVs, traffic_crashes.csv, traffic_crash_vehicle.csv, and traffic_crash_people.csv, as input data.”

A block of code is set as follows:

# Merge the three dataframes into a single dataframemerge_01_df = pd.merge(df, df2, on='CRASH_RECORD_ID')
all_data_df = pd.merge(merge_01_df, df3, on='CRASH_RECORD_ID')

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

aws lambda create-function  -s3-key-function ReverseStringFunction \ --zip-file fileb:// s3_key_function.zip --handler lambda_function. lambda_handler \ 
--runtime python3.8 --role <YOUR IAM ARN ROLE>

Any command-line input or output is written as follows:

psql -U postgres

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Select System info from the Administration panel.”

Tips or important notes

Appear like this.

lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Building ETL Pipelines with Python
Published in: Sep 2023Publisher: PacktISBN-13: 9781804615256

Authors (2)

author image
Brij Kishore Pandey

Brij Kishore Pandey stands as a testament to dedication, innovation, and mastery in the vast domains of software engineering, data engineering, machine learning, and architectural design. His illustrious career, spanning over 14 years, has seen him wear multiple hats, transitioning seamlessly between roles and consistently pushing the boundaries of technological advancement. He has a degree in electrical and electronics engineering. His work history includes the likes of JP Morgan Chase, American Express, 3M Company, Alaska Airlines, and Cigna Healthcare. He is currently working as a principal software engineer at Automatic Data Processing Inc. (ADP). Originally from India, he resides in Parsippany, New Jersey, with his wife and daughter.
Read more about Brij Kishore Pandey

author image
Emily Ro Schoof

Emily Ro Schoof is a dedicated data specialist with a global perspective, showcasing her expertise as a data scientist and data engineer on both national and international platforms. Drawing from a background rooted in healthcare and experimental design, she brings a unique perspective of expertise to her data analytic roles. Emily's multifaceted career ranges from working with UNICEF to design automated forecasting algorithms to identify conflict anomalies using near real-time media monitoring to serving as a subject matter expert for General Assembly's Data Engineering course content and design. Her mission is to empower individuals to leverage data for positive impact. Emily holds the strong belief that providing easy access to resources that merge theory and real-world applications is the essential first step in this process.
Read more about Emily Ro Schoof