
Chapter 5. The Confluent Platform

This chapter covers the following recipes:

  • Installing the Confluent Platform
  • Using Kafka operations
  • Monitoring with the Confluent Control Center
  • Using the Schema Registry
  • Using the Kafka REST Proxy
  • Using Kafka Connect

Introduction


The Confluent Platform is a full stream data system. It enables you to organize and manage data from several sources in one high-performance and reliable system. As mentioned in the first few chapters, the goal of an enterprise service bus is not only to transport messages and data but also to provide all the tools required to connect data origins (data sources), applications, and data destinations (data sinks) to the platform.

The Confluent Platform has these parts:

  • Confluent Platform open source
  • Confluent Platform enterprise
  • Confluent Cloud

The Confluent Platform open source has the following components:

  • Apache Kafka core
  • Kafka Streams
  • Kafka Connect
  • Kafka clients
  • Kafka REST Proxy
  • Kafka Schema Registry

The Confluent Platform enterprise has the following components:

  • Confluent Control Center
  • Confluent support, professional services, and consulting

All the components are open source except the Confluent Control Center, which is proprietary to Confluent...

Installing the Confluent Platform


To use the REST Proxy and the Schema Registry, we need to install the Confluent Platform. The Confluent Platform also provides administration, operation, and monitoring features that are fundamental for modern Kafka production systems.

Getting ready

At the time of writing this book, the Confluent Platform Version is 4.0.0.

Currently, the supported operating systems are:

  • Debian 8
  • Red Hat Enterprise Linux
  • CentOS 6.8 or 7.2
  • Ubuntu 14.04 LTS and 16.04 LTS

macOS is currently supported for testing and development purposes only, not for production environments. Windows is not yet supported. Oracle Java 1.7 or higher is required.

The default ports for the components are:

  • 2181: Apache ZooKeeper
  • 8081: Schema Registry (REST API)
  • 8082: Kafka REST Proxy
  • 8083: Kafka Connect (REST API)
  • 9021: Confluent Control Center
  • 9092: Apache Kafka brokers

It is important that these ports (or whichever ports the components are configured to run on) are open; a quick check is sketched below.
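A quick way to verify this from a shell is to probe each default port before starting the services. This is a minimal sketch using nc (netcat), assuming a local single-host installation:

for port in 2181 8081 8082 8083 9021 9092; do
  # a port that answers before the services are started is taken by another process
  nc -z localhost "$port" && echo "port $port is already in use" || echo "port $port is free"
done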

How to do it...

There are two ways to install...

Using Kafka operations


With the Confluent Platform installed, the administration, operation, and monitoring of Kafka become very simple. Let's review how to operate Kafka with the Confluent Platform.

Getting ready

For this recipe, Confluent should be installed, up, and running.

How to do it...

The commands in this section should be executed from the directory where the Confluent Platform is installed:

  1. To start ZooKeeper, Kafka, and the Schema Registry with one command, run:
$ confluent start schema-registry

The output of this command should be:

Starting zookeeper
zookeeper is [UP]
Starting kafka
kafka is [UP]
Starting schema-registry
schema-registry is [UP]

Note

To execute the commands outside the installation directory, add Confluent's bin directory to PATH:

export PATH=<path_to_confluent>/bin:$PATH

  2. To manually start each service with its own command, run:
$ ./bin/zookeeper-server-start ./etc/kafka/zookeeper.properties
$ ./bin/kafka-server-start ./etc/kafka/server.properties
$ ./bin/schema-registry-start...
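At any point, you can check which services are running and stop them all when finished. This is a minimal sketch using the same confluent CLI:

$ confluent status
$ confluent stop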

Monitoring with the Confluent Control Center


This recipe shows you how to use the metrics reporter of the Confluent Control Center.

Getting ready

The execution of the previous recipe is needed.

Before starting the Control Center, configure the metrics reporter:

  1. Back up the server.properties file located at:
<confluent_path>/etc/kafka/server.properties
  2. In the server.properties file, uncomment the following lines:
metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter 
confluent.metrics.reporter.bootstrap.servers=localhost:9092 
confluent.metrics.reporter.topic.replicas=1 
  3. Back up the Kafka Connect configuration located in:
<confluent_path>/etc/schema-registry/connect-avro-distributed.properties
  4. Add the following lines at the end of the connect-avro-distributed.properties file:
consumer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor 
producer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor...
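After these configuration changes, restart the platform and start the Control Center. This is a minimal sketch; the paths assume the default Confluent 4.0.0 layout, and the exact properties file name may differ in your installation:

$ confluent stop && confluent start
$ ./bin/control-center-start ./etc/confluent-control-center/control-center.properties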

Using the Schema Registry


The Schema Registry is a repository: a metadata-serving layer for schemas. It provides a REST interface for storing and retrieving Avro schemas, keeps a versioned history of every schema, and performs compatibility checks so that schemas can evolve safely.

Remember that the Schema Registry exposes a REST interface; in this recipe we use Java to make the HTTP requests, but the interface is REST precisely to promote language and platform neutrality.

Getting ready

The Confluent Platform should be up and running:

$ confluent start schema-registry

How to do it...

Recall Doubloon's customer sees BTC price Avro schema:

{ "name": "customer_sees_btcprice", 
  "namespace": "doubloon.avro", 
  "type": "record", 
  "fields": [ 
    { "name": "event", "type": "string" }, 
    { "name": "customer",  
      "type": { 
          "name": "id", "type": "long", 
          "name": "name", "type": "string", 
          "name": "ipAddress", "type": "string...

Using the Kafka REST Proxy


What happens if we want to use Kafka in an environment that is not yet supported? Think of languages such as JavaScript or PHP.

For this and other programming challenges, the Kafka REST Proxy provides a RESTful interface to a Kafka cluster.

From a REST interface, one can produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients.

The example use cases are:

  • Sending data to Kafka from a frontend app built in a non-supported language (yes, think of JavaScript and PHP frontends, for example).
  • The need to communicate with Kafka from an environment that doesn't support Kafka (think in terms of mainframes and legacy systems).
  • Scripting administrative actions. Think of a DevOps team in charge of a Kafka system and a sysadmin who doesn't know the supported languages (Java, Scala, Python, Go, or C/C++).

Getting ready

The Confluent Platform should be up and running:

$ confluent...
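To give a flavor of the interface before the full recipe, here is a minimal sketch that produces a JSON message through the REST Proxy with curl; the topic name btc-prices is a hypothetical example:

$ curl -X POST -H "Content-Type: application/vnd.kafka.json.v2+json" \
    --data '{"records": [{"value": {"event": "customer_sees_btcprice"}}]}' \
    http://localhost:8082/topics/btc-prices

The embedded format in the Content-Type header (json.v2 here) tells the proxy how to interpret the message payload; Avro and binary payloads use their own content types.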

Using Kafka Connect


As mentioned, Kafka Connect is a framework used to connect Kafka with external systems such as key-value stores (think of Riak, Coherence, and Dynamo), databases (Cassandra), search indexes (Elastic), and filesystems (HDFS).

This book has a whole chapter about Kafka connectors, but this recipe covers Kafka Connect as part of the Confluent Platform.

Getting ready

The Confluent Platform should be up and running. You can check the Kafka Connect log with:

$ confluent log connect

How to do it...

To read a data file with Kafka Connect:

  1. To list the installed connectors:
$ confluent list connectors 
Bundled Predefined Connectors (edit configuration under etc/): 
   elasticsearch-sink 
   file-source 
   file-sink 
   jdbc-source 
   jdbc-sink 
   hdfs-sink 
   s3-sink
  2. The configuration file is located at ./etc/kafka/connect-file-source.properties. It has these values:
    • The instance name:
name=file_source 
    • The implementer class:
connector.class=FileStreamSource 
    • The number of tasks of this connector instance:
tasks.max=1
    • The input file:
file=continuous...
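Once the properties are reviewed, the connector can be loaded and verified with the confluent CLI. This is a minimal sketch; the exact output may differ between versions:

$ confluent load file-source
$ confluent status connectors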