Build scalable data pipelines using Apache Spark and Delta Lake
Automate workflows and manage data governance with Unity Catalog
Learn real-time processing and structured streaming with practical use cases
Implement CI/CD, DevOps, and security for production-ready data solutions
Explore Databricks-native ML, AutoML, and Generative AI integration
Description
"Data Engineering with Azure Databricks" is your essential guide to building scalable, secure, and high-performing data pipelines using the powerful Databricks platform on Azure. Designed for data engineers, architects, and developers, this book demystifies the complexities of Spark-based workloads, Delta Lake, Unity Catalog, and real-time data processing.
Beginning with the foundational role of Azure Databricks in modern data engineering, you’ll explore how to set up robust environments, manage data ingestion with Auto Loader, optimize Spark performance, and orchestrate complex workflows using tools like Azure Data Factory and Airflow.
The book offers deep dives into structured streaming, Delta Live Tables, and Delta Lake’s ACID features for data reliability and schema evolution. You’ll also learn how to manage security, compliance, and access controls using Unity Catalog, and gain insights into managing CI/CD pipelines with Azure DevOps and Terraform.
With a special focus on machine learning and generative AI, the final chapters guide you in automating model workflows, leveraging MLflow, and fine-tuning large language models on Databricks. Whether you're building a modern data lakehouse or operationalizing analytics at scale, this book provides the tools and insights you need.
Who is this book for?
This book is for data engineers, solution architects, cloud professionals, and software engineers seeking to build robust and scalable data pipelines using Azure Databricks. Whether you're migrating legacy systems, implementing a modern lakehouse architecture, or optimizing data workflows for performance, this guide will help you leverage the full power of Databricks on Azure. A basic understanding of Python, Spark, and cloud infrastructure is recommended.
What you will learn
Set up a full-featured Azure Databricks environment
Implement batch and streaming ingestion using Auto Loader
Optimize Spark jobs with partitioning and caching
Build real-time pipelines with structured streaming and DLT
Manage data governance using Unity Catalog
Orchestrate production workflows with jobs and ADF
Apply CI/CD best practices with Azure DevOps and Git
Secure data with RBAC, encryption, and compliance standards
Dmitry Foshin is a Business Intelligence team leader focused on delivering business insights to the management team through data engineering, analytics, and visualization. He has led and executed complex full-stack BI solutions (from ETL processes to building DWHs and reporting) using Azure technologies, Data Lake, Data Factory, Data Bricks, MS Office 365, Power BI, and Tableau. He has also successfully launched numerous data analytics projects – both on-premises and in the cloud – that help achieve corporate goals for international FMCG companies, banks, and manufacturing companies.
Dmitry Anoshin is a data-centric technologist and a recognized expert in building and implementing big data and analytics solutions. He has a successful track record of implementing business and digital intelligence projects across retail, finance, marketing, and e-commerce. Dmitry possesses in-depth knowledge of digital/business intelligence, ETL, data warehousing, and big data technologies. He has extensive experience in data integration and is proficient in various data warehousing methodologies. Dmitry has consistently exceeded project expectations across the financial, machine tool, and retail industries. He has completed a number of multinational full BI/DI solution life cycle implementation projects. With expertise in data modeling, Dmitry also has a background and business experience in multiple relational databases, OLAP systems, and NoSQL databases. He is also an active speaker at data conferences and helps people to adopt cloud analytics.
Tonya Chernyshova is an experienced Data Engineer with over 10 years in the field, including time at Amazon. Specializing in Data Modeling, Automation, Cloud Computing (AWS and Azure), and Data Visualization, she has a strong track record of delivering scalable, maintainable data products. Her expertise drives data-driven insights and business growth, showcasing her proficiency in leveraging cloud technologies to enhance data capabilities.
Sergii Volodarskyi is a Data Engineer working daily on the Databricks ecosystem, across both platform and product data engineering. His expertise spans the full spectrum of modern data platform delivery, from designing lakehouse architectures and building CI/CD pipelines to extracting data from APIs and shipping analytical products that drive business decisions. This book is a reflection of experience and best practices built up across real projects. He actively shares his knowledge with the engineering community and is a builder with a deep interest in the intersection of data, AI, and software engineering.
What is the digital copy I get with my Print order?
When you buy any Print edition of our Books, you can redeem (for free) the eBook edition of the Print Book you’ve purchased. This gives you instant access to your book when you make an order via PDF, EPUB or our online Reader experience.
What is the delivery time and cost of print book?
Shipping Details
USA:
'
Economy: Delivery to most addresses in the US within 10-15 business days
Premium: Trackable Delivery to most addresses in the US within 3-8 business days
UK:
Economy: Delivery to most addresses in the U.K. within 7-9 business days. Shipments are not trackable
Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days! Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands
EU:
Premium: Trackable delivery to most EU destinations within 4-9 business days.
Australia:
Economy: Can deliver to P. O. Boxes and private residences. Trackable service with delivery to addresses in Australia only. Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro Delivery time is up to 15 business days for remote areas of WA, NT & QLD.
Premium: Delivery to addresses in Australia only Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.
India:
Premium: Delivery to most Indian addresses within 5-6 business days
Rest of the World:
Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days
Asia:
Premium: Delivery to most Asian addresses within 5-9 business days
Disclaimer: All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.
Unfortunately, due to several restrictions, we are unable to ship to the following countries:
Afghanistan
American Samoa
Belarus
Brunei Darussalam
Central African Republic
The Democratic Republic of Congo
Eritrea
Guinea-bissau
Iran
Lebanon
Libiya Arab Jamahriya
Somalia
Sudan
Russian Federation
Syrian Arab Republic
Ukraine
Venezuela
What is custom duty/charge?
Customs duty are charges levied on goods when they cross international borders. It is a tax that is imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.
Do I have to pay customs charges for the print book order?
The orders shipped to the countries that are listed under EU27 will not bear custom charges. They are paid by Packt as part of the order.
A custom duty or localized taxes may be applicable on the shipment and would be charged by the recipient country outside of the EU27 which should be paid by the customer and these duties are not included in the shipping charges been charged on the order.
How do I know my custom duty charges?
The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.
For example:
If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order?
Cancellation Policy for Published Printed Books:
You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.
Please understand that Packt Publishing cannot provide refunds or cancel any order except for the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or material defect in book), Packt Publishing will not accept returns.
What is your returns and refunds policy?
Return Policy:
We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:
If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
You will have a choice of replacement or refund of the problem items.(damaged, defective or incorrect)
Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.
On the off chance your printed book arrives damaged, with book material defect, contact our Customer Relation Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner which is on a print-on-demand basis.
What tax is charged?
Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.
What payment methods can I use?
You can pay with the following card types:
Visa Debit
Visa Credit
MasterCard
PayPal
What is the delivery time and cost of print books?
Shipping Details
USA:
'
Economy: Delivery to most addresses in the US within 10-15 business days
Premium: Trackable Delivery to most addresses in the US within 3-8 business days
UK:
Economy: Delivery to most addresses in the U.K. within 7-9 business days. Shipments are not trackable
Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days! Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands
EU:
Premium: Trackable delivery to most EU destinations within 4-9 business days.
Australia:
Economy: Can deliver to P. O. Boxes and private residences. Trackable service with delivery to addresses in Australia only. Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro Delivery time is up to 15 business days for remote areas of WA, NT & QLD.
Premium: Delivery to addresses in Australia only Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.
India:
Premium: Delivery to most Indian addresses within 5-6 business days
Rest of the World:
Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days
Asia:
Premium: Delivery to most Asian addresses within 5-9 business days
Disclaimer: All orders received before 5 PM U.K time would start printing from the next business day. So the estimated delivery times start from the next day as well. Orders received after 5 PM U.K time (in our internal systems) on a business day or anytime on the weekend will begin printing the second to next business day. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.
Unfortunately, due to several restrictions, we are unable to ship to the following countries: