You're reading from Data Engineering with Python

Product type: Book
Published in: Oct 2020
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781839214189
Edition: 1st Edition
Author (1)
Paul Crickard

Paul Crickard authored a book on the Leaflet JavaScript module. He has been programming for over 15 years and has focused on GIS and geospatial programming for 7 years. He spent 3 years working as a planner at an architecture firm, where he combined GIS with Building Information Modeling (BIM) and CAD. Currently, he is the CIO at the 2nd Judicial District Attorney's Office in New Mexico.

Summary

In this chapter, you learned how to use Python to query relational and NoSQL databases and to insert data into them. You also learned how to use both Airflow and NiFi to create data pipelines. Database skills are some of the most important for a data engineer, because very few data pipelines do not touch a database in some way. The skills you learned in this chapter provide the foundation for the other skills you will need to learn, primarily SQL. Combining strong SQL skills with the data pipeline skills you learned in this chapter will allow you to accomplish most of the data engineering tasks you will encounter.

In the examples, the data pipelines were not idempotent. An idempotent pipeline leaves the same result no matter how many times it runs; these pipelines produced new results on every run, including results you did not want. We will fix that in Section 2, Deploying Pipelines into Production. But before you get to that, you will need to learn how to handle common data issues and how to enrich and transform your data.
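
To make the idea of idempotence concrete, here is a minimal sketch, not the fix the book applies in Section 2. It assumes Python's built-in sqlite3 module with an in-memory database in place of the databases used in the chapter, and the table names, sample records, and function names are all hypothetical. One load appends rows on every run; the other relies on a primary key so that a rerun overwrites existing rows instead of duplicating them.

```python
import sqlite3

# Hypothetical sample records; the ids and names are illustrative only.
RECORDS = [(1, "Ada"), (2, "Grace"), (3, "Linus")]


def load_append(conn, records):
    """Non-idempotent load: every run appends the same records again."""
    conn.executemany(
        "INSERT INTO users_append (id, name) VALUES (?, ?)", records
    )
    conn.commit()


def load_upsert(conn, records):
    """Idempotent load: the primary key makes a rerun replace, not duplicate."""
    conn.executemany(
        "INSERT OR REPLACE INTO users_upsert (id, name) VALUES (?, ?)", records
    )
    conn.commit()


def count_rows(conn, table):
    # Table names are fixed inside this script, never taken from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def run_demo(runs=3):
    """Run both loads `runs` times and return (append_count, upsert_count)."""
    conn = sqlite3.connect(":memory:")
    # No key on the first table, so nothing stops duplicate rows.
    conn.execute("CREATE TABLE users_append (id INTEGER, name TEXT)")
    # A primary key on the second table gives the upsert something to match on.
    conn.execute("CREATE TABLE users_upsert (id INTEGER PRIMARY KEY, name TEXT)")
    for _ in range(runs):
        load_append(conn, RECORDS)
        load_upsert(conn, RECORDS)
    counts = (count_rows(conn, "users_append"), count_rows(conn, "users_upsert"))
    conn.close()
    return counts


if __name__ == "__main__":
    appended, upserted = run_demo(runs=3)
    print(f"append-only rows after 3 runs: {appended}")  # 9
    print(f"upsert rows after 3 runs: {upserted}")       # 3
```

After three runs the append-only table holds nine rows while the keyed table still holds three, which is the property a pipeline needs if it may be rerun after a failure.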

The next chapter will...
