The Definitive Guide to Data Integration


Product type Book
Published in Mar 2024
Publisher Packt
ISBN-13 9781837631919
Pages 490 pages
Edition 1st Edition
Authors (4):
Pierre-Yves BONNEFOY
Emeric CHAIZE
Raphaël MANSUY
Mehdi TAZI

Table of Contents

Preface
Chapter 1: Introduction to Our Data Integration Journey
Chapter 2: Introducing Data Integration
Chapter 3: Architecture and History of Data Integration
Chapter 4: Data Sources and Types
Chapter 5: Columnar Data Formats and Comparisons
Chapter 6: Data Storage Technologies and Architectures
Chapter 7: Data Ingestion and Storage Strategies
Chapter 8: Data Integration Techniques
Chapter 9: Data Transformation and Processing
Chapter 10: Transformation Patterns, Cleansing, and Normalization
Chapter 11: Data Exposition and APIs
Chapter 12: Data Preparation and Analysis
Chapter 13: Workflow Management, Monitoring, and Data Quality
Chapter 14: Lineage, Governance, and Compliance
Chapter 15: Various Architecture Use Cases
Chapter 16: Prospects and Challenges
Index
Other Books You May Enjoy

What this book covers

Chapter 1, Introduction to Our Data Integration Journey, explores data integration’s evolution and significance, discussing the proliferation of data sources and the evolving landscape. It tackles the complexities and opportunities in modern data integration and outlines the book’s purpose and vision.

Chapter 2, Introducing Data Integration, covers the definition of data integration, the modern data stack, and strategies in data integration. It details the role of data in businesses and examines the techniques, tools, and technologies used in data integration processes.

Chapter 3, Architecture and History of Data Integration, traces the history of data integration, the impact of open source technologies, and various architectures. It discusses the future of data integration, highlighting trends such as real-time and AI-driven integrations.

Chapter 4, Data Sources and Types, discusses the variety of data sources including relational and NoSQL databases, flat files, and APIs. It also explores different data types and formats, emphasizing their importance and challenges in data integration processes.

Chapter 5, Columnar Data Formats and Comparisons, focuses on columnar data formats, contrasting them with traditional row-based formats and emphasizing their advantages in analytics. It explores the challenges of working with different data formats and the necessity of data format conversion.

Chapter 6, Data Storage Technologies and Architectures, delves into data storage technologies such as data warehouses, lakes, and object storage, discussing their strengths and weaknesses. It also covers various data architectures and their impact on data integration, including physical and logical layers, data modeling, and partitioning.

Chapter 7, Data Ingestion and Storage Strategies, covers the goals and strategies of data ingestion, outlining efficient, scalable, and adaptable methods for diverse data sources. It also discusses data storage and modeling techniques, as well as ways to optimize storage performance and adapt storage strategies to specific needs.

Chapter 8, Data Integration Techniques, explores different data integration models and architectures, covering point-to-point integration, middleware, batch, micro-batching, and real-time approaches. It also discusses common data integration patterns such as ETL and ELT, as well as organizational models for data management.

Chapter 9, Data Transformation and Processing, introduces various data transformation techniques including filters, aggregations, and joins. It delves into SQL’s role in data transformation and massively parallel processing systems, discussing their applications and challenges in data processing.

Chapter 10, Transformation Patterns, Cleansing, and Normalization, explores transformation patterns such as lambda and kappa architectures, their pros and cons, and their applications in data pipelines. It delves into data cleansing and normalization, which are crucial for good data quality and consistency in integration.

Chapter 11, Data Exposition and APIs, covers strategic motives for data exposure in analytics, seamless data exchange, and the role of various data exposition technologies. It focuses on APIs and strategies for data exposure, and compares different data exposure solutions.

Chapter 12, Data Preparation and Analysis, discusses the importance of data preparation, strategies for selecting data transformations, and key concepts in reporting and self-analysis, all of which are crucial for effective decision-making and business insights.

Chapter 13, Workflow Management, Monitoring, and Data Quality, examines workflow and event management, monitoring in data stacks, the significance of data quality and observability, and data governance and compliance in managing data assets.

Chapter 14, Lineage, Governance, and Compliance, explores the significance of data lineage in decision-making and compliance, techniques for visualizing data journeys, and the importance of adhering to regulations with robust governance frameworks.

Chapter 15, Various Architecture Use Cases, discusses data integration in scenarios such as real-time, cloud-based, geospatial, and IoT data analysis, covering the specific challenges, tools, and techniques for each use case.

Chapter 16, Prospects and Challenges, focuses on the future of data integration within the modern data stack, highlighting emerging trends, challenges, and opportunities, and provides guidance for further learning in data integration.
