You're reading from Mastering Mesos

Product type: Book
Published in: May 2016
Publisher: Packt
ISBN-13: 9781785886249
Edition: 1st Edition
Authors (2):
Dipa Dubhashi

Dipa Dubhashi is an alumnus of the prestigious Indian Institute of Technology and heads product management at Sigmoid. His prior experience includes consulting with ZS Associates besides founding his own start-up. Dipa specializes in envisioning enterprise big data products, developing their roadmaps, and managing their development to solve customer use cases across multiple industries. He advises several leading start-ups as well as Fortune 500 companies about architecting and implementing their next-generation big data solutions. Dipa has also developed a course on Apache Spark for a leading online education portal and is a regular speaker at big data meetups and conferences.

Akhil Das

Akhil Das is a senior software developer at Sigmoid primarily focusing on distributed computing, real-time analytics, performance optimization, and application scaling problems using a wide variety of technologies such as Apache Spark and Mesos, among others. He contributes actively to the Apache Spark project and is a regular speaker at big data conferences and meetups, MesosCon 2015 being the most recent one.

Chapter 9. Mesos Big Data Frameworks 2

This chapter is a guide to deploying important big data storage frameworks, such as Cassandra, the Elasticsearch-Logstash-Kibana (ELK) stack, and Kafka, on Mesos.

Cassandra on Mesos


This section will introduce Cassandra and explain how to set up Cassandra on Mesos while also discussing the problems commonly encountered during the setup process.

Introduction to Cassandra

Cassandra is an open source, scalable NoSQL database that is fully distributed, has no single point of failure, and performs well for most standard use cases. It scales both horizontally and vertically. Horizontal scaling (scale-out) adds more nodes built from commodity hardware to the existing cluster, while vertical scaling (scale-up) adds more CPU and memory resources to a node, typically with specialized hardware.

Cassandra was developed by Facebook engineers to address the inbox search use case. It was inspired by Google Bigtable, which served as the foundation for its storage model, and Amazon Dynamo, which was the foundation of its distribution model. It was open sourced in 2008 and became an Apache top-level project in early 2010...

The Elasticsearch-Logstash-Kibana (ELK) stack on Mesos


This section will introduce the Elasticsearch-Logstash-Kibana (ELK) stack and explain how to set it up on Mesos while also discussing the problems commonly encountered during the setup process.

Introduction to Elasticsearch, Logstash, and Kibana

The ELK stack, a combination of Elasticsearch, Logstash, and Kibana, is an end-to-end solution for log analytics. Elasticsearch provides search capabilities, Logstash handles log management, and Kibana serves as the visualization layer. The stack is commercially backed by a company called Elastic.

Elasticsearch

Elasticsearch is a Lucene-based open source distributed search engine designed for high scalability and fast search query response time. It simplifies the usage of Lucene, a highly performant search engine library, by providing a powerful REST API on top. Some of the important concepts in Elasticsearch are highlighted as follows:

  • Document: This is a JSON object stored in an index...
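To make the idea of a JSON document stored in an index concrete, here is a minimal sketch that builds (but does not send) a REST request for storing one document. The host `http://localhost:9200`, the index name `logs`, the document ID, the document fields, and the helper `build_index_request` are all illustrative assumptions. The `/_doc/` path segment reflects recent Elasticsearch releases; older releases used a mapping type name in that position.

```python
import json
from urllib.request import Request


def build_index_request(base_url, index, doc_id, document):
    """Build a PUT request that would store `document` in `index` under `doc_id`.

    Hypothetical helper for illustration; it only constructs the request.
    """
    body = json.dumps(document).encode("utf-8")
    return Request(
        url=f"{base_url}/{index}/_doc/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed values: a local node, an index called "logs", and a sample log entry.
request = build_index_request(
    "http://localhost:9200",
    "logs",
    "1",
    {"level": "ERROR", "message": "disk full"},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it, which only makes sense against a running Elasticsearch node.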

Kafka on Mesos


This section will introduce Kafka and explain how to set it up on Mesos while also discussing the problems commonly encountered during the setup process.

Introduction to Kafka

Kafka is a distributed publish-subscribe messaging system designed for speed, scalability, reliability, and durability. Some of the key terms used in Kafka are given as follows:

  • Topics: These are the categories where message feeds are maintained by Kafka

  • Producers: These are the upstream processes that send messages to a particular Kafka topic

  • Consumers: These are the downstream processes that listen to the incoming messages in a topic and process them as per requirements

  • Broker: Each node in a Kafka cluster is called a broker
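The terms above can be sketched as a toy in-memory model. This is not the Kafka client API: the `Broker` class, its `append` and `read` methods, the CRC32-based partitioning rule, and the topic and key names are all assumptions made for illustration. The sketch relies on one property of Kafka's design, namely that each topic is split into partitions and each partition is an append-only log in which a message is identified by its offset. A real cluster spreads partitions across several brokers; a single object stands in for the cluster here.

```python
from collections import defaultdict
from zlib import crc32


class Broker:
    """Toy stand-in for a Kafka broker holding partitioned topic logs."""

    def __init__(self, partitions_per_topic=3):
        self.partitions_per_topic = partitions_per_topic
        # topic name -> list of partitions; each partition is an append-only list
        self.topics = defaultdict(
            lambda: [[] for _ in range(self.partitions_per_topic)]
        )

    def append(self, topic, key, value):
        """Producer side: append a message and return (partition, offset)."""
        partitions = self.topics[topic]
        # Assumed partitioning rule: a stable hash of the message key.
        partition = crc32(key.encode("utf-8")) % len(partitions)
        partitions[partition].append(value)
        return partition, len(partitions[partition]) - 1

    def read(self, topic, partition, offset):
        """Consumer side: return every message at or after `offset`."""
        return self.topics[topic][partition][offset:]


broker = Broker()

# A producer sends two messages with the same key to the "clicks" topic.
p1, o1 = broker.append("clicks", "user-42", "page=/home")
p2, o2 = broker.append("clicks", "user-42", "page=/cart")

# A consumer reads that partition from the beginning.
messages = broker.read("clicks", p1, 0)
```

Because both messages share a key, they land in the same partition and are read back in the order they were written. A consumer that remembers the last offset it processed can resume from that point by passing the next offset to `read`.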

Take a look at the following high-level diagram of Kafka (source: http://kafka.apache.org/documentation.html#introduction):

A partitioned log is maintained by the Kafka cluster for every topic, which looks similar to the following (source: http://kafka.apache.org/documentation.html...

Summary


This chapter introduced some important big data storage frameworks, such as Cassandra, the ELK stack, and Kafka, and covered the setup, configuration, and management of these frameworks on a distributed infrastructure using Mesos.

I hope that this book has armed you with the resources that you need to effectively manage the complexities of the modern datacenter. By following the detailed step-by-step guides to deploy a Mesos cluster using the DevOps tool of your choice, you should now be in a position to handle the system administration requirements of your organization smoothly.
