You're reading from Modern Data Architecture on AWS

Product type: Book
Published in: Aug 2023
Publisher: Packt
ISBN-13: 9781801813396
Edition: 1st Edition
Author (1)
Behram Irani

Behram Irani is currently a technology leader with Amazon Web Services (AWS) specializing in data, analytics and AI/ML. He has spent over 18 years in the tech industry helping organizations, from start-ups to large-scale enterprises, modernize their data platforms. In the last 6 years working at AWS, Behram has been a thought leader in the data, analytics and AI/ML space, publishing multiple papers and leading the digital transformation efforts for many organizations across the globe. Behram completed his Bachelor of Engineering in Computer Science at the University of Pune and has an MBA degree from the University of Florida.

Preface

Many IT leaders and professionals know how to get data in a particular type of database and derive value from it. But when it comes to creating an enterprise-wide holistic data platform with purpose-built data services, all seamlessly working in tandem with the least amount of manual intervention, it is always challenging to design and implement such a platform.

This book covers end-to-end solutions of many of the common data, analytics and AI/ML use-cases that organizations want to solve using AWS services. The book systematically lays out all the building blocks of a modern data platform including data lake, data warehouse, data ingestion patterns, data consumption patterns, data governance and AI/ML patterns. Using real world use-cases, each chapter highlights the features and functionalities of many of the AWS services to create a scalable, flexible, performant and cost-effective modern data platform.

By the end of this book, readers will be equipped with all the necessary architecture patterns and will be able to apply this knowledge to build a modern data platform for their organization using AWS services.

Who this book is for

This book is specifically geared towards helping data architects, data engineers and those professionals involved with building data platforms. The use-case driven approach in this book helps them conceptualize possible solutions to specific use-cases and provides them with design patterns to build data platforms for any organization.

Technical leaders and decision makers would also benefit from this book as they will get a perspective of what the overall data architecture looks like for their organization and how each component of the platform helps with their business needs.

What this book covers

Prologue, Data and Analytics Journey so far, provides a historical context around what a data platform looks like in the on-prem world. In this prologue we will discuss the traditional platform components and talk about their benefits, then pivot towards their shortcomings in meeting new business objectives. This will provide context for the need to build a modern data architecture.

Chapter 1, Modern Data Architecture on AWS, describes what it means to create a modern data architecture. We will also look at how AWS services help materialize this concept and why it is important to create this foundation for current and future business needs.

Chapter 2, Scalable Data Lakes, lays down the foundation of the modern data architecture by establishing a data lake on AWS. We will also look at different layers of the data lake and how each layer has a specific purpose.

Chapter 3, Batch Data Ingestion, provides options to move data in batches from multiple source systems into AWS. We will explore different AWS services that assist in migrating data in bulk from a variety of source systems.

Chapter 4, Streaming Data Ingestion, provides an overview of the need for a real-time streaming architecture pattern and how AWS services assist in solving use-cases that require streaming data to be ingested and consumed in the modern data platform.

Chapter 5, Data Processing, provides options to process and transform data, so that it can eventually be consumed for analytics. We will look at some AWS services that help provide scalable, performant and cost-effective big data processing, especially for running Apache Spark based workloads.

Chapter 6, Interactive Analytics, provides insights around ad-hoc analytics use-cases along with AWS services that help solve them.

Chapter 7, Data Warehousing, covers a wide range of use-cases that can be solved using a modern cloud data warehouse on AWS. We will look at multiple design patterns, including data ingestion, data transformation and data consumption using the data warehouse on AWS.

Chapter 8, Data Sharing, provides context around how data can be shared within a modern data platform, without creating complete ETL pipelines and without duplicating data at multiple places.

Chapter 9, Data Federation, provides mechanisms of data federation and the types of use-cases that can be solved using federated queries.

Chapter 10, Predictive Analytics, covers a whole range of use-cases along with services, features and tools provided by AWS to solve AI, ML and deep learning-based business problems, with the common goal of achieving predictive analytics.

Chapter 11, Generative AI, provides a variety of use-cases across multiple industries that can be solved using GenAI and how AWS provides services and tools to help fast-track building GenAI based applications.

Chapter 12, Operational Analytics, introduces the need for operational analytics, especially log analytics, and how AWS helps with this aspect of the data platform.

Chapter 13, Business Intelligence, provides context around the need for a modern business intelligence tool for creating business-friendly reports and dashboards that support rich visualizations. We will look at how AWS helps with such use-cases.

Chapter 14, Data Governance, lays the groundwork for unified data governance and covers many dimensions of data governance along with AWS services that assist in solving those use-cases.

Chapter 15, Data Mesh, introduces the concept of a data mesh along with its importance in the modern data platform. We will look at the pillars of data mesh and provide AWS services that help solve use-cases that require a data mesh pattern.

Chapter 16, Performant and Cost-Effective Data Platform, covers a wide range of options to ensure the data platform built using AWS services is cost-effective as well as performant.

Chapter 17, Automate, Operationalize and Monetize, wraps up the book with concepts around automating the data platform using DevOps, DataOps and MLOps mechanisms. Finally, we will look at options to monetize the modern data platform built on AWS.

To get the most out of this book

The book is geared towards data professionals who are eager to build a modern data platform using many of the AWS data and analytics services. A basic understanding of data and analytics architectures and systems is desirable, along with a beginner-level understanding of AWS Cloud.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system.”

A block of code is set as follows:

INSERT INTO processed_cloudtrail_table
SELECT *
FROM raw_cloudtrail_table
WHERE conditions;

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Select System info from the Administration panel.”

Use-cases

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packtpub.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Modern Data Architecture on AWS, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there. You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

  1. Scan the QR code or visit the link below

https://packt.link/free-ebook/9781801813396

  2. Submit your proof of purchase
  3. That’s it! We’ll send your free PDF and other benefits to your email directly