Data Ingestion with Python Cookbook

Product type: Book
Published: May 2023
Publisher: Packt
ISBN-13: 9781837632602
Pages: 414
Edition: 1st Edition
Author: Gláucia Esppenchutz

Table of Contents (17)

Preface
Part 1: Fundamentals of Data Ingestion
Chapter 1: Introduction to Data Ingestion
Chapter 2: Principals of Data Access – Accessing Your Data
Chapter 3: Data Discovery – Understanding Our Data before Ingesting It
Chapter 4: Reading CSV and JSON Files and Solving Problems
Chapter 5: Ingesting Data from Structured and Unstructured Databases
Chapter 6: Using PySpark with Defined and Non-Defined Schemas
Chapter 7: Ingesting Analytical Data
Part 2: Structuring the Ingestion Pipeline
Chapter 8: Designing Monitored Data Workflows
Chapter 9: Putting Everything Together with Airflow
Chapter 10: Logging and Monitoring Your Data Ingest in Airflow
Chapter 11: Automating Your Data Ingestion Pipelines
Chapter 12: Using Data Observability for Debugging, Error Handling, and Preventing Downtime
Index
Other Books You May Enjoy

Storing log files in a remote location

By default, Airflow stores and organizes its logs in a local folder that developers can reach easily, which simplifies debugging when something does not go as expected. On larger projects or teams, however, giving everyone access to the Airflow instance or server is rarely practical. Besides the DAG console output, there are other ways to expose the logging folder without granting access to Airflow’s server.

One of the most straightforward solutions is to export the logs to external storage, such as Amazon S3 or Google Cloud Storage. The good news is that Airflow already has native support for exporting logs to cloud resources.

In this recipe, we will set a configuration in our airflow.cfg file that enables the remote logging feature, and then test it using an example DAG.
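As a rough preview of the kind of change involved, the sketch below uses Python's standard configparser module to set the remote logging options on a small sample of airflow.cfg text. It assumes Airflow 2.x, where remote_logging, remote_base_log_folder, and remote_log_conn_id live in the [logging] section. The bucket path, the connection ID, and the enable_remote_logging helper are hypothetical placeholders, not values from this recipe.

```python
import configparser
import io

# Hypothetical placeholder values: replace the bucket path and the
# connection ID with the ones that exist in your own environment.
REMOTE_LOGGING_SETTINGS = {
    "remote_logging": "True",
    "remote_base_log_folder": "s3://my-example-bucket/airflow/logs",
    "remote_log_conn_id": "my_aws_conn",
}


def enable_remote_logging(cfg_text: str) -> str:
    """Return airflow.cfg-style text with the [logging] remote options set."""
    # interpolation=None keeps literal '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("logging"):
        parser.add_section("logging")
    for key, value in REMOTE_LOGGING_SETTINGS.items():
        parser.set("logging", key, value)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# A tiny stand-in for a real airflow.cfg file.
sample_cfg = (
    "[logging]\n"
    "base_log_folder = /opt/airflow/logs\n"
    "remote_logging = False\n"
)

updated_cfg = enable_remote_logging(sample_cfg)
print(updated_cfg)
```

Note that configparser discards comments when it writes a file back, so for a real airflow.cfg it is usually safer to edit these three keys by hand and use a script like this only to check the result.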

Getting ready

Refer to the Technical requirements section for this recipe.

AWS S3

To complete this exercise,...
