Multi-Cloud Strategy for Cloud Architects - Second Edition

Product type Book
Published in Apr 2023
Publisher Packt
ISBN-13 9781804616734
Pages 470 pages
Edition 2nd Edition
Author: Jeroen Mulder

Choosing the right platform for data

It is a cliché, but nonetheless very true: data is the new gold. It is for good reason that enterprise architecture frameworks name data first: the first thing a business must do is analyze what data it should use and how to gain optimal benefit from that data. No business can operate without data: it needs data to gain insights into markets and the demands of its customers. It needs data to drive the business.

You will find the term data-driven in almost every cloud assessment study. What does data-driven mean? It means that a company makes decisions based on the analysis of data. Intuition and decisions based purely on previous experience are ruled out. Every action is supported by the analysis of data.

To enable a data-driven business, we need one thing: the data itself, typically in vast amounts and preferably in (near) real time. The collection of data is prerequisite number one. Prerequisite number two is that this data must...

Building and sizing a data platform

As with every service that we deploy in the cloud, we need a foundation to build the platform on. Hence, building a landing zone that can hold raw data is the first step. This landing zone should be an environment that serves only one purpose: to capture raw data. It is recommended to build this landing zone separately from core IT systems. It should be scalable, but low-cost, since it will hold a lot of data. The issue with keeping data is that it can drive up the cloud bill quickly. Data storage comes at a very low price per unit of data, but the catch is that we need a lot of these small units.
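To make the point about unit price versus volume concrete, here is a minimal sketch of the arithmetic. The price per GB-month and the ingest rate are hypothetical values chosen for illustration, not figures from any cloud provider, and the function names are our own.

```python
def monthly_storage_cost(stored_tb: float, price_per_gb_month: float) -> float:
    """Return the monthly storage bill for a volume (in TB) at a given unit price."""
    return stored_tb * 1000 * price_per_gb_month  # 1 TB = 1000 GB (decimal units)


def projected_costs(start_tb: float, monthly_ingest_tb: float,
                    price_per_gb_month: float, months: int) -> list:
    """Project the monthly bill while raw data keeps accumulating and nothing is deleted."""
    costs = []
    stored = start_tb
    for _ in range(months):
        stored += monthly_ingest_tb
        costs.append(round(monthly_storage_cost(stored, price_per_gb_month), 2))
    return costs


# Hypothetical numbers: 0.02 per GB-month, 50 TB of raw data ingested every month.
bill = projected_costs(start_tb=0, monthly_ingest_tb=50,
                       price_per_gb_month=0.02, months=12)
print(bill[0], bill[-1])
```

Under these assumptions the unit price never changes, yet the monthly bill after a year is twelve times the bill of the first month, simply because nothing is ever removed from the landing zone.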

It is important to implement governance from the start. This includes defining and implementing guardrails for the classification and tagging of data.
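As an illustration of what such a guardrail could look like, the sketch below checks the tags on a dataset before it is accepted into the landing zone. The tag names and classification levels are assumptions made up for this example; a real organization would define its own, and would typically enforce them with the policy tooling of its cloud platform.

```python
# Hypothetical guardrail: every dataset must carry these tags.
REQUIRED_TAGS = {"owner", "source_system", "classification"}
# Hypothetical classification scheme.
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential", "restricted"}


def check_tags(tags: dict) -> list:
    """Return a list of guardrail violations; an empty list means compliant."""
    violations = []
    for key in sorted(REQUIRED_TAGS - set(tags)):
        violations.append(f"missing tag: {key}")
    classification = tags.get("classification")
    if classification is not None and classification not in ALLOWED_CLASSIFICATIONS:
        violations.append(f"unknown classification: {classification}")
    return violations


print(check_tags({"owner": "sales", "classification": "secret"}))
```

Running the check at ingestion time, rather than afterwards, keeps untagged or unclassified data from accumulating in the data lake.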

Once the landing zone has been established, data analysts can start using the data lake as a sandbox environment. This is the second stage. Analysts can start building prototypes of data models and...

Designing for interoperability and portability

Portability and interoperability should be driven by use cases and business cases, not pursued for their own sake. In IT systems there are four levels that define the portability of systems: data, applications, platforms, and infrastructure, following the Architecture Development Method (ADM) of TOGAF.

Data represents information in a form that can be processed by computers. Data is stored in storage that is accessible to computers.

Applications are software that performs actions triggered by business requests.

Platforms support the applications.

Infrastructure is a collection of computation, storage, and network resources. Computation can also refer to cloud computing including virtual machines, containers and serverless functions.

One important note that we have to make at this point is that cloud computing blurs the demarcation of infrastructure, platforms...

Overcoming challenges of data gravity

Applications don’t just hold data; they also produce a lot of data that they share with other applications. Data will attract new data and services in other applications. As data accumulates, more and more applications and services will use it. Data and applications are attracted to each other, as in the law of gravity. To put it simply: the amount of data will grow, either autonomously or, more likely, because data sources will be connected to other data sources.

In addition to a strategic advantage of having access to this data, this also presents a major challenge. Databases are becoming so large that it becomes almost impossible to move the data. This can lead to the situation that companies are tied to a certain location to hold that data. In addition, companies that use each other's data and services must stay close to each other in order to provide good service. By keeping data physically close together, it can be exchanged...
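A rough back-of-the-envelope calculation shows why large datasets become hard to move. The sketch below estimates the transfer time for a given volume over a network link; the link speeds and the 80% efficiency factor are assumptions for illustration only, and the function name is our own.

```python
def transfer_days(data_tb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to move data_tb terabytes over a link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually available for the transfer (protocol overhead, shared usage).
    """
    bits = data_tb * 1e12 * 8                   # decimal terabytes to bits
    effective_bps = link_gbps * 1e9 * efficiency
    return bits / effective_bps / 86400         # seconds to days


# Hypothetical case: 1 PB (1000 TB) over a 1 Gbps and a 10 Gbps link.
for gbps in (1, 10):
    print(f"{gbps} Gbps: {transfer_days(1000, gbps):.1f} days")
```

Under these assumptions, moving one petabyte takes months on a 1 Gbps link and more than a week even at 10 Gbps, during which the source data usually keeps changing.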

Managing the foundation for data lakes

Data engineers design, build, and manage the data pipelines, but the foundation of the data lake and data warehouse is the specific landing zone for the data platform. Typically, landing zones in the cloud are operated by cloud engineers who take care of the compute, storage, and network resources.

Looking at management of data platforms, we can distinguish various roles:

  • Data architect or engineer: the architect and data engineer are often combined in one role. The role is responsible for the design, development, and deployment of the data pipelines. The engineer must have extensive knowledge of ETL or ELT principles and technologies, making sure that data from sources gets collected and transformed into usable datasets in data warehouses or other data products where the data can be further analyzed. Data also needs to be validated, which is a required skill of the engineer too. In essence, the engineer makes sure that data that is ingested into warehouses...
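The collect, transform, and validate responsibilities described above can be sketched as a tiny pipeline. This is a minimal illustration using only the Python standard library; the sample records, field names, and validation rules are invented for the example, and an in-memory list stands in for the warehouse.

```python
import csv
import io

# Hypothetical raw extract from a source system.
RAW = """customer_id,country,amount
1,nl,10.50
2,DE,not-a-number
3,us,7.25
"""


def extract(text: str) -> list:
    """Extract: read raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(row: dict) -> dict:
    """Transform and validate: cast types and normalize values.

    Raises ValueError (or KeyError) when a record is not usable.
    """
    return {
        "customer_id": int(row["customer_id"]),
        "country": row["country"].upper(),
        "amount": float(row["amount"]),
    }


def run_pipeline(text: str) -> tuple:
    """Load valid records into a list; set invalid ones aside for review."""
    loaded, rejected = [], []
    for row in extract(text):
        try:
            loaded.append(transform(row))
        except (KeyError, ValueError):
            rejected.append(row)
    return loaded, rejected


loaded, rejected = run_pipeline(RAW)
print(len(loaded), "loaded,", len(rejected), "rejected")
```

Keeping rejected records rather than silently dropping them lets the engineer trace data quality problems back to the source.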

Summary

In this chapter, we discussed the basic architecture principles for building and managing a data platform. We looked at data lakes that can hold vast amounts of raw data and how we can build these lakes on top of cloud storage. The next step is to fetch the right data that is usable in data models. We must extract, transform, and load (ETL, or ELT when the data is loaded before it is transformed) accurate datasets into environments where data analysts can work with this data. Typically, data warehouses are used for this.

We studied the various propositions for data operations of the major cloud providers AWS, Azure, Google Cloud, Alibaba, and Oracle. Next, we discussed the challenges that come with building and operating data platforms. There will be challenges with respect to access to data and accuracy, but also privacy and compliance. Data gravity is another problem that we must solve. It’s not easy to move huge amounts of data across platforms, hence we must find other solutions to work with data in different...

Questions

  1. What does the term ETL mean?
  2. What would be the first step in building a data platform?
  3. True or false: data lakes are typically built on the common storage layers of major cloud providers, such as Azure Blob Storage and Amazon S3.
  4. What does Oracle’s GoldenGate do?

Further reading

  • Data Lake for Enterprises, by Tomcy John and Pankaj Misra, Packt Publishing