You're reading from  Fundamentals of Analytics Engineering

Product type: Book
Published in: Mar 2024
Publisher: Packt
ISBN-13: 9781837636457
Edition: 1st Edition
Authors (7):
Dumky De Wilde

Dumky is an award-winning analytics engineer with close to 10 years of experience in setting up data pipelines, data models and cloud infrastructure. Dumky has worked with a multitude of clients from government to fintech and retail. His background is in marketing analytics and web tracking implementations, but he has since branched out to include other areas and deliver value from data and analytics across the entire organization.

Fanny Kassapian

Fanny has a multidisciplinary background across various industries, giving her a unique perspective on analytics workflows, from engineering pipelines to driving value for the business. As a consultant, Fanny helps companies translate opportunities and business needs into technical solutions, implement analytics engineering best practices to streamline their pipelines, and treat data as a product. She is an avid promoter of data democratization through technology and literacy.

Jovan Gligorevic

Jovan, an Analytics Engineer, specializes in data modeling and building analytical dashboards. Passionate about delivering end-to-end analytics solutions and enabling self-service analytics, he has a background in business and data science. With skills ranging from machine learning to dashboarding, Jovan has democratized data across diverse industries. Proficient in various tools and programming languages, he has extensive experience with the modern data stack. Jovan enjoys providing training in dbt and Power BI, sharing his knowledge generously.

Juan Manuel Perafan

Juan Manuel Perafan has 8 years of experience in the realm of analytics (5 years as a consultant). Juan was the first analytics engineer hired by Xebia back in 2020, making him one of the earliest adopters of this way of working. Besides helping his clients realize the value of their data, Juan is also very active in the data community. He has spoken at dozens of conferences and meetups around the world (including Coalesce 2023). Additionally, he is the founder of the Analytics Engineering meetup in the Netherlands as well as the Dutch dbt meetup.

Lasse Benninga

Lasse has been working in the data space since 2018, starting out as a Data Engineer at a large airline, then switching to Cloud Engineering for a consultancy and working for different clients in the retail and healthcare space. Since 2021, he has been an Analytics Engineer at Xebia Data, merging software/platform engineering with a passion for analytics. As a consultant, Lasse has worked with many different clients, ranging from retail and healthcare to ridesharing and trading companies. He has implemented multiple data platforms and worked in all three major clouds, leveraging his knowledge of data and analytics to provide value.

Ricardo Angel Granados Lopez

Ricardo, an Analytics Engineer with a strong background in data engineering and analysis, is a quick learner and tech enthusiast. With a Master's in IT Management specializing in Data Science, he excels in using various programming languages and tools to deliver valuable insights. Ricardo, experienced in diverse industries like energy, transport, and fintech, is adept at finding alternative solutions for optimal results. As an Analytics Engineer, he focuses on driving value from data through efficient data modeling, using best practices, automating tasks, and improving data quality.

Taís Laurindo Pereira

Taís is a versatile data professional with experience in a diverse range of organizations, from big corporations to scale-ups. Before her move to Xebia, she had the chance to develop distinct data products, such as dashboards and machine learning implementations. Currently, she focuses on end-to-end analytics as an Analytics Engineer. With a mixed background in engineering and business, her mission is to contribute to data democratization in organizations by helping them overcome challenges when working with data at scale.


Hands-On Analytics Engineering

Now that we’ve discussed quite a lot of theory in the previous chapters, your hands might be itching to get to work with some technology! In this chapter, you will put theory into practice by helping an old friend gain analytical insights into their fledgling company. Along the way, you will use several tools for the heavy lifting and get acquainted with different parts of the analytics engineering stack. The book’s GitHub repository contains step-by-step instructions for setting up the tooling used in the chapter.

In this chapter, we will discuss the following topics:

  • Understanding the Stroopwafelshop use case
  • Preparing Google Cloud
  • ELT using Airbyte Cloud
  • Modeling data using data build tool (dbt) Cloud
  • Visualizing data with Tableau

Technical requirements

For this chapter, you will need the following:

Understanding the Stroopwafelshop use case

A few months ago, your friend Jan founded a business in Amsterdam. He lucked out on a prime location near the center of Amsterdam. There, Jan decided to start selling the Dutch delicacy of Stroopwafels, a cake consisting of two layers of waffles with caramel-like syrup in the middle. Tourists and Dutch residents love these traditional treats. You decide to visit him in the store and try one of his now-famous Stroopwafels over a cup of coffee. Delicious!

While the two of you talk, Jan tells you that they have gained a lot of customers in a short time – so many that they have been feeling overwhelmed lately. As the number of customers grows, so does the information load. Jan would love to get better insight into the business’s performance, but since he is so busy managing the store, he finds little time to explore the data. Besides, his data is scattered over multiple systems, and he only has a little experience working...

Preparing Google Cloud

Since we will use Google BigQuery as our data warehouse, we will need a Google account to prepare BigQuery for loading the Stroopwafelshop data. For this, you will need the following:

  • A new Google Cloud Trial account with $300 in free credits. Alternatively, you can reuse your existing Google Cloud account (in that case, any costs incurred, although likely minimal, are at your own risk).
  • A Google Cloud project named stroopwafelshop.
  • A BigQuery dataset named stroopwafelshopdata.
  • Three service accounts, assigned with IAM roles:
    • A service account named airbyte that’s been assigned the BigQuery Data Editor and BigQuery User roles.
    • A service account named dbt-cloud that’s also been assigned the BigQuery Data Editor and BigQuery User roles.
    • A service account for Tableau that’s been assigned the BigQuery Data Viewer role for read-only access.
  • For each service account, you will need to create and download a key in JSON format. These will...
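
The checklist above can also be written down as data. The following is a minimal sketch, not the book's own setup procedure: it only builds the `gcloud` and `bq` command lines that would create these resources, without running them. It assumes the project ID is the same as the project name (project IDs must be globally unique, so yours will likely differ), and the Tableau service account name `tableau` and the `<name>-key.json` key filenames are hypothetical, since the text does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: the project ID equals the project name from the checklist.
PROJECT_ID = "stroopwafelshop"
DATASET = "stroopwafelshopdata"


@dataclass(frozen=True)
class ServiceAccount:
    name: str
    roles: Tuple[str, ...]

    def email(self, project_id: str) -> str:
        # Standard address format for user-managed service accounts.
        return f"{self.name}@{project_id}.iam.gserviceaccount.com"


SERVICE_ACCOUNTS = (
    ServiceAccount("airbyte", ("roles/bigquery.dataEditor", "roles/bigquery.user")),
    ServiceAccount("dbt-cloud", ("roles/bigquery.dataEditor", "roles/bigquery.user")),
    # Hypothetical name: the text only says "a service account for Tableau".
    ServiceAccount("tableau", ("roles/bigquery.dataViewer",)),
)


def setup_commands(project_id: str = PROJECT_ID, dataset: str = DATASET) -> List[List[str]]:
    """Return the CLI invocations for the checklist as argument lists."""
    commands = [
        ["gcloud", "projects", "create", project_id],
        ["bq", "mk", "--dataset", f"{project_id}:{dataset}"],
    ]
    for account in SERVICE_ACCOUNTS:
        email = account.email(project_id)
        commands.append(
            ["gcloud", "iam", "service-accounts", "create", account.name,
             f"--project={project_id}"]
        )
        # One project-level IAM binding per role.
        for role in account.roles:
            commands.append(
                ["gcloud", "projects", "add-iam-policy-binding", project_id,
                 f"--member=serviceAccount:{email}", f"--role={role}"]
            )
        # Hypothetical key filename; store the downloaded JSON key securely.
        commands.append(
            ["gcloud", "iam", "service-accounts", "keys", "create",
             f"{account.name}-key.json", f"--iam-account={email}"]
        )
    return commands


if __name__ == "__main__":
    for command in setup_commands():
        print(" ".join(command))
```

Printing the commands rather than executing them lets you review each step, or compare it against what you click through in the Google Cloud console, before anything is created or billed.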

Modeling data using dbt Cloud

With the raw data in BigQuery, you might want to jump right in and start uncovering those key business insights. But before you start building those queries, take a moment to think about the tools and strategies available. Remember, there is often more than one way to tackle data analysis.

The shortcomings of conventional analytics

BigQuery and platforms of its kind are built to manage massive data volumes at impressive speeds. Still, this does not iron out every hurdle for analysts. They frequently receive ad hoc requests for data queries, such as dissecting sudden changes in sales patterns, forecasting next week’s inventory requirements, or explaining yesterday’s customer behavior. Analysts address these tasks diligently, aiming to enhance business insights. Often, they might pull SQL from one of their past analyses and repurpose it for a new one. This new analysis could be stored away in a Word document on a shared network drive or...

Visualizing data with Tableau

Once the data has been cleaned and transformed and is available in your data warehouse, it is time to use this data to generate insights. In this section, we will walk you through the basics of using Tableau and how you can create dashboards. Additionally, we will expand on how the Stroopwafelshop can use dashboards effectively to answer different business questions.

Why Tableau?

Tableau stands out in the business and analytics world for its straightforward but powerful data visualization features. Its ease of use appeals to a wide range of users, not just tech experts, enabling them to create engaging and interactive charts and graphs. It is versatile, working seamlessly with various data sources and offering advanced functions like trend analysis and forecasting. Tableau’s active online community is a big plus, offering learning and networking opportunities. Furthermore, constant updates keep Tableau at the forefront of data visualization...

Selecting the KPIs

You proudly tell Jan that all the tooling is now in place to create a dashboard! The most important thing now is to get his input on what to start measuring. As stated earlier in this chapter, Jan values healthy financial growth and customer satisfaction. He can also see the importance of monitoring these objectives as they evolve. After careful consideration, Jan has decided which KPIs are the most important for his business and how he is going to monitor them in a dashboard.

Here’s an overview of the KPIs that have been selected:

  • Sales revenue: The total revenue generated during a given period. This is a fundamental KPI for any retail business, providing a basic measure of its financial performance.
  • Sales volume: The total number of waffles sold during a given period. It should be tracked in total and broken down by different waffle types.
  • Gross profit: This is calculated as sales revenue minus the product’s unit price (cost of...
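
To make the first three KPIs concrete, here is a small sketch of how they could be computed from order lines. The record layout, field names, and sample figures are all hypothetical and not taken from the Stroopwafelshop data. Because the gross profit definition above is cut off, the sketch assumes it means sales revenue minus the cost of the waffles sold. Amounts are kept in whole cents to avoid floating-point rounding.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class OrderLine:
    # Hypothetical record layout for one line of a sales order.
    waffle_type: str
    quantity: int
    unit_price_cents: int  # selling price per waffle
    unit_cost_cents: int   # assumed cost to produce one waffle


def sales_revenue_cents(lines: Iterable[OrderLine]) -> int:
    """Total revenue: quantity times selling price, summed over all lines."""
    return sum(line.quantity * line.unit_price_cents for line in lines)


def sales_volume(lines: Iterable[OrderLine]) -> int:
    """Total number of waffles sold."""
    return sum(line.quantity for line in lines)


def sales_volume_by_type(lines: Iterable[OrderLine]) -> Dict[str, int]:
    """Number of waffles sold, broken down by waffle type."""
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        totals[line.waffle_type] += line.quantity
    return dict(totals)


def gross_profit_cents(lines: Iterable[OrderLine]) -> int:
    """Assumed definition: revenue minus the cost of the waffles sold."""
    return sum(
        line.quantity * (line.unit_price_cents - line.unit_cost_cents)
        for line in lines
    )


# Made-up sample data for one period.
SAMPLE_LINES = [
    OrderLine("classic", 3, 250, 100),
    OrderLine("chocolate", 2, 300, 150),
    OrderLine("classic", 1, 250, 100),
]

if __name__ == "__main__":
    print(sales_revenue_cents(SAMPLE_LINES))   # 1600
    print(sales_volume(SAMPLE_LINES))          # 6
    print(sales_volume_by_type(SAMPLE_LINES))  # {'classic': 4, 'chocolate': 2}
    print(gross_profit_cents(SAMPLE_LINES))    # 900
```

In the chapter's stack, these aggregations would more naturally live in the warehouse or the dashboard; the Python version is only meant to pin down the arithmetic behind each KPI.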

Summary

In this chapter, we transitioned from theory to practice by focusing on analytics engineering in a real-world context. The Stroopwafelshop case study guided you in assisting the store’s owner, Jan, with understanding and analyzing his business data. This case study served as a practical example, demonstrating how to apply analytics engineering techniques and tools in a business context.

Powerful tools such as Airbyte Cloud, Google BigQuery, dbt Cloud, and Tableau were introduced and used. This hands-on approach has not only equipped you with the necessary skills to tackle real-life analytics engineering challenges but also culminated in the creation of an insightful dashboard for the Stroopwafelshop. This combination of theory and practice has offered a glimpse into the daily activities of an analytics engineer, underscoring the importance of comprehending metrics and KPIs to add value to the business.

In the next chapter, you will delve into the importance of...

