Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Database Design and Modeling with Google Cloud

You're reading from  Database Design and Modeling with Google Cloud

Product type Book
Published in Dec 2023
Publisher Packt
ISBN-13 9781804611456
Pages 234 pages
Edition 1st Edition
Languages
Concepts
Author (1):
Abirami Sukumaran Abirami Sukumaran
Profile icon Abirami Sukumaran

Table of Contents (18) Chapters

Preface Part 1:Database Model: Business and Technical Design Considerations
Chapter 1: Data, Databases, and Design Chapter 2: Handling Data on the Cloud Part 2:Structured Data
Chapter 3: Database Modeling for Structured Data Chapter 4: Setting Up a Fully Managed RDBMS Chapter 5: Designing an Analytical Data Warehouse Part 3:Semi-Structured, Unstructured Data, and NoSQL Design
Chapter 6: Designing for Semi-Structured Data Chapter 7: Unstructured Data Management Part 4:DevOps and Databases
Chapter 8: DevOps and Databases Part 5:Data to AI
Chapter 9: Data to AI – Modeling Your Databases for Analytics and ML Chapter 10: Looking Ahead – Designing for LLM Applications Index Other Books You May Enjoy

Designing an Analytical Data Warehouse

In the previous chapter, we discussed hands-on database design for structured data and design considerations. Then, we learned how to set up, configure, and create database objects, connect to the database, and access data programmatically. In this chapter, we will move on to designing for analytical data and start hands-on learning with a fully managed cloud data warehouse. In doing so, you will learn how to set up and configure databases, create datasets and objects, and query and perform sample analytics on data.

In this chapter, we’ll cover the following topics:

  • Understanding how data warehouses are different from databases
  • The significance of ETL in data warehouses
  • BigQuery hands-on
  • Summary of operational aspects and design considerations

Understanding how data warehouses are different from databases

Data warehousing is the process of storing and managing data from multiple sources in a centralized location. This makes it easier to analyze the data and gain insights that can be used to improve business operations. Data warehouse applications are designed to support mainly, but not limited to, online analytical processing (OLAP) data models, which are centered around the dimensions of data that the business organizes. Such modes of data warehouse tend to facilitate data aggregation for summary, drilling down into details, slicing and dicing across dimensions, querying for analysis, and making business decisions.

In contrast to the database applications that support transactional models (OLTP), which are intended for a specific functional application or product of the business, data warehouses support models that are meant for the entire business and organization, not just for a specific group of users (access control...

Significance of ETL in data warehouse

ETL is a process in data warehousing that represents extract, transform, and load. This process involves extracting data from multiple sources, transforming and performing computations, cleansing for data quality, and loading the data into a target system. ETL is important for data warehouses to collect, read, process, transform, migrate, and analyze data from several disparate sources into one target database or warehouse. ETL eliminates silos in sources and integrates data for easy access and BI.

The ETL process typically consists of the following steps:

  • Extract/ingest: Data is extracted from the source systems. This can be done using a variety of methods, such as database queries, file transfers, or APIs.
  • Transform: The data is transformed into a format that is compatible with the target system. This may involve cleaning the data, converting data types, or merging data from multiple sources.
  • Load: The data is loaded into the...

Learning about BigQuery

BigQuery is a fully managed, completely serverless (no need to think about the underlying infrastructure), cost-effective, petabyte-scale enterprise data warehouse with built-in machine learning and BI analytics that works across clouds and sources and scales with your data. It is an important service for the transform (process, analyze) stage in the life cycle of your data.

An entire book can be dedicated to BigQuery and it still wouldn’t be sufficient to cover everything that BigQuery has to offer for your data and businesses in its fullest sense. We will take a look at the key features in the next section and try some of these hands-on in this chapter.

Features of BigQuery

In this section, I have tried to address the features that database practitioners and developers associate with the most:

  • With BigQuery, you can unify structured, unstructured, and semi-structured data and perform queries on them together. This means you can also...

Summary of operational aspects and design considerations

When you are evaluating a cloud data warehouse or analytical storage for your application, make sure you design for the following key operational aspects:

  • Data warehouse migration: Ensure the choice and configuration for your data warehouse supports easy migration of your data to the cloud data warehouse in case your application requires it.
  • Transferring data: Consider the need to efficiently move and synchronize data between different systems and platforms. The format and way you ingest data into BigQuery is an important aspect, as we discussed earlier.
  • Data governance and security options: Make sure your choice of data warehouse or analytics system allows you to implement robust security measures and governance practices to protect your data.
  • Real-time and predictive analytics: While designing an analytics solution, also consider enabling real-time data processing and predictive analytics for actionable...

Summary

In this chapter, we covered the topic of designing for analytical data, understanding the differences between data warehouses and databases, and the importance of data warehouses with real-world use cases. We outlined the core characteristics of data warehouses, highlighting their analytical focus, data integration capabilities, and comprehensive data insights. We also discussed the significance of ETL in a data warehouse application and introduced a few cloud services for ETL, including Cloud Dataflow, Cloud Data Fusion, Cloud Data Catalog, Cloud Dataproc, and BigQuery, all of which aim to provide efficient data movement and transformation capabilities.

We focused on BigQuery, a fully managed serverless data warehouse that provides advanced features such as data unification, built-in machine learning, AI collaboration, real-time analytics, and robust security as it is a powerful tool for handling analytical workloads. We discussed it with a hands-on guide on setting up...

lock icon The rest of the chapter is locked
You have been reading a chapter from
Database Design and Modeling with Google Cloud
Published in: Dec 2023 Publisher: Packt ISBN-13: 9781804611456
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}