About this book
Apache Beam is an open source, unified programming model for defining and executing data processing pipelines, including extract, transform, and load (ETL), batch, and stream processing.
This book will help you confidently build data processing pipelines with Apache Beam. You'll start with an overview of Apache Beam and learn how to implement basic pipelines with it. The book covers various techniques for loading data, performing transformations, and storing the results. You will also learn how to test and run pipelines effectively. As you progress, you will explore how to implement your own domain-specific language (DSL) and get to grips with the Euphoria DSL. Later chapters show you how to query your data using SQL before moving on to running a pipeline with a portable runner. Finally, you will learn advanced Apache Beam concepts such as IO connectors and R.
By the end of this book, you will be able to confidently implement batch and streaming data pipelines using Apache Beam.
- Publication date: November 2021