You're reading from Modern Data Architectures with Python

Product type: Book
Published in: Sep 2023
Reading Level: Expert
Publisher: Packt
ISBN-13: 9781801070492
Edition: 1st Edition
Author (1)
Brian Lipp

Brian Lipp is a Technology Polyglot, Engineer, and Solution Architect with a wide skillset in many technology domains. His programming background has ranged from R, Python, and Scala, to Go and Rust development. He has worked on Big Data systems, Data Lakes, data warehouses, and backend software engineering. Brian earned a Master of Science, CSIS from Pace University in 2009. He is currently a Sr. Data Engineer working with large Tech firms to build Data Ecosystems.

Kafka architecture

Kafka is an open-source distributed streaming platform designed to scale horizontally to very high data volumes. Kafka can retain as much data as you have storage for, but it shouldn’t be used as a database. Kafka’s core architecture is built on five main concepts – topics, brokers, partitions, producers, and consumers.
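To make the relationship between these five ideas concrete, here is a minimal in-memory sketch. It is not Kafka's API and does not talk to a real cluster; the class names (`ToyBroker`, `ToyProducer`, `ToyConsumer`) are hypothetical, and the CRC32-based key hashing is only a stand-in for the partitioner a real Kafka client uses.

```python
from collections import defaultdict
from zlib import crc32


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker: topic -> partitions."""

    def __init__(self, partitions_per_topic=2):
        self.partitions_per_topic = partitions_per_topic
        # Each partition is an append-only list of (key, value) events.
        self.topics = defaultdict(
            lambda: [[] for _ in range(self.partitions_per_topic)]
        )

    def append(self, topic, key, value):
        partitions = self.topics[topic]
        # Assumption: a deterministic hash of the key picks the partition,
        # so events with the same key land in the same partition.
        index = crc32(key.encode("utf-8")) % len(partitions)
        partitions[index].append((key, value))
        return index, len(partitions[index]) - 1  # (partition, offset)

    def read(self, topic, partition, offset):
        # Reading never removes events; it returns everything from offset on.
        return self.topics[topic][partition][offset:]


class ToyProducer:
    """Writes events to a topic through the broker."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, key, value):
        return self.broker.append(topic, key, value)


class ToyConsumer:
    """Reads one partition of a topic and tracks its own position."""

    def __init__(self, broker, topic, partition):
        self.broker, self.topic, self.partition = broker, topic, partition
        self.offset = 0  # position of the next event to read

    def poll(self):
        events = self.broker.read(self.topic, self.partition, self.offset)
        self.offset += len(events)
        return events


broker = ToyBroker()
producer = ToyProducer(broker)
partition, _ = producer.send("orders", "customer-1", "created")
producer.send("orders", "customer-1", "paid")

consumer = ToyConsumer(broker, "orders", partition)
print(consumer.poll())  # both events for customer-1, in write order
```

The point of the sketch is the division of labor: the broker owns the stored events, a topic is split into partitions, producers only append, and consumers read without deleting anything, remembering how far they have read.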

Topics

For a developer, a Kafka topic is the most important concept to understand. Topics are where data is “stored.” The records a topic holds are often called events, and each event has a key and a value. Unlike a database key, a Kafka key does not enforce uniqueness; it is used for organizational purposes, such as grouping related events. The value is the data itself, which can be in one of several formats, such as a string, JSON, Avro, or Protobuf. When your data is written to Kafka, it also carries metadata, the most important of which is the timestamp.
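The shape of an event can be sketched in plain Python. This is an illustration only: the `Event` class, its field names, and the `sensor-7` data are hypothetical, and JSON is chosen simply because it is one of the value formats mentioned above.

```python
import json
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """Hypothetical model of an event: key, value, and timestamp metadata."""

    key: str
    value: dict
    # Metadata: creation time in milliseconds since the Unix epoch.
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))

    def key_bytes(self) -> bytes:
        return self.key.encode("utf-8")

    def value_bytes(self) -> bytes:
        # Serialize the value as JSON; Avro or Protobuf would replace this step.
        return json.dumps(self.value, sort_keys=True).encode("utf-8")


# A plain list stands in for a topic here.
topic_log = []
topic_log.append(Event("sensor-7", {"temp_c": 21.5}))
topic_log.append(Event("sensor-7", {"temp_c": 22.1}))

# The key is not a uniqueness constraint: both events are kept.
print(len(topic_log))
print(topic_log[0].value_bytes())
```

Note that appending a second event with the same key does not overwrite the first one, which is the main difference from a database primary key.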

Working with Kafka can be confusing because the data isn’t stored in a database...
