You're reading from  Fast Data Processing Systems with SMACK Stack

Product type: Book
Published in: Dec 2016
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781786467201
Edition: 1st Edition
Author (1)
Raúl Estrada

Raúl Estrada has been a programmer since 1996 and a Java developer since 2001. He loves all topics related to computer science. With more than 15 years of experience in high-availability and enterprise software, he has been designing and implementing architectures since 2003. His specialization is in systems integration, and he mainly participates in projects related to the financial sector. He has been an enterprise architect for BEA Systems and Oracle Inc., but he also enjoys web, mobile, and game programming. Raúl is a supporter of free software and enjoys experimenting with new technologies, frameworks, languages, and methods. Raúl is the author of other Packt Publishing titles, such as Fast Data Processing Systems with SMACK and Apache Kafka Cookbook.

Chapter 9. Study Case 3 - Mesos and Docker

In this chapter, we'll analyze how to develop Mesos frameworks and we'll study how to use Mesos containerizers and Docker containerizers.

This chapter has the following parts:

  • Mesos frameworks API
  • Mesos containerizers
  • Docker containerizers

Mesos frameworks API


As we saw in Chapter 6, The Manager - Apache Mesos, a Mesos framework is a layer between Mesos and the application, and is used for managing task scheduling and execution. Because the framework implementation is specific to the application, the term is often used to refer to the application itself.

Initially, a Mesos framework could communicate with the Mesos API only through libmesos, which is written in C++. To work from Java, Scala, Python, or Go, one had to develop language bindings on top of libmesos. Since Mesos version 0.19.0, an HTTP-based protocol has been included that enables developers to build frameworks in the language of their choice without having to work with a C++ wrapper.

A Mesos framework has two components:

  • Scheduler: Responsible for making decisions on resource offers and tracking the current cluster state. The SchedulerDriver module has these responsibilities: handling communication with the Mesos master, registering frameworks with the master...
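
As a rough illustration of the scheduler's decision-making role, the following Python sketch matches pending tasks against resource offers. It does not use the Mesos API: the Offer and Task classes and the decide function are hypothetical names invented for this example, and a real scheduler would receive its offers from the master rather than from a local list.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Hypothetical stand-in for a resource offer from the master."""
    offer_id: str
    cpus: float
    mem_mb: float


@dataclass
class Task:
    """Hypothetical description of a task waiting to be launched."""
    name: str
    cpus: float
    mem_mb: float


def decide(offers, pending):
    """Greedily fit pending tasks into offers, in the order given.

    Returns (launches, declined): launches maps an offer ID to the names
    of the tasks placed on it, and declined lists the unused offer IDs.
    """
    launches = {}
    declined = []
    queue = list(pending)
    for offer in offers:
        cpus, mem = offer.cpus, offer.mem_mb
        placed, remaining = [], []
        for task in queue:
            if task.cpus <= cpus and task.mem_mb <= mem:
                cpus -= task.cpus
                mem -= task.mem_mb
                placed.append(task.name)
            else:
                remaining.append(task)
        queue = remaining
        if placed:
            launches[offer.offer_id] = placed
        else:
            declined.append(offer.offer_id)
    return launches, declined


if __name__ == "__main__":
    offers = [Offer("o1", 2.0, 2048.0), Offer("o2", 0.5, 256.0)]
    tasks = [Task("a", 1.0, 1024.0), Task("b", 1.0, 1024.0), Task("c", 1.0, 512.0)]
    print(decide(offers, tasks))
```

The greedy first-fit policy here is only one possible choice. The point is that the framework's scheduler, not Mesos, decides which offers to use and which to decline.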

Spark Mesos run modes


Spark can run over Mesos in two modes: coarse-grained (default) and fine-grained (deprecated).

Coarse-grained

In coarse-grained mode, each Spark executor runs as a single Mesos task. Spark executors are sized according to the following configuration variables:

  • Executor memory: spark.executor.memory
  • Executor cores: spark.executor.cores
  • Number of executors: spark.cores.max/spark.executor.cores

Executors are brought up when the application starts, until spark.cores.max is reached. If spark.cores.max is not set, the Spark application will reserve all resources offered to it by Mesos, so it is highly recommended to set this variable in any sort of multi-tenant cluster, including those running multiple concurrent Spark applications.
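
The sizing rule above can be checked with a little arithmetic. The following Python sketch is only an illustration: the property names come from the list above, while the expected_executors helper and the sample values are assumptions made for this example. It also assumes the division rounds down when spark.cores.max is not an exact multiple of spark.executor.cores.

```python
def expected_executors(conf):
    """Estimate the executor count as spark.cores.max / spark.executor.cores.

    Returns None when spark.cores.max is not set, because in that case
    the application is not capped and takes whatever Mesos offers it.
    """
    if "spark.cores.max" not in conf:
        return None
    cores_max = int(conf["spark.cores.max"])
    cores_per_executor = int(conf["spark.executor.cores"])
    # Assumption: a partial executor is never started, so round down.
    return cores_max // cores_per_executor


# Sample values chosen for illustration only.
conf = {
    "spark.executor.memory": "4g",
    "spark.executor.cores": "2",
    "spark.cores.max": "8",
}

print(expected_executors(conf))  # 8 cores in total / 2 cores each = 4
```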

The scheduler will start the executors round-robin on the offers Mesos gives it, but there are no spread guarantees, as Mesos does not provide such guarantees on the offer stream.

The benefit of the coarse-grained mode is a much lower startup overhead...

Apache Mesos API


Mesos exposes an API so that developers can build custom frameworks that run on top of the infrastructure. Internally, Mesos uses the actor model for message passing, because non-blocking communication keeps the complexity of a distributed system manageable. Messages are serialized with protocol buffers.

Scheduler HTTP API

Support for the new HTTP API was introduced in Mesos version 0.24. The Mesos master exposes the /api/v1/scheduler endpoint, through which the scheduler communicates with it.

For detailed information about the scheduler HTTP API, go to:

http://mesos.apache.org/documentation/latest/scheduler-http-api/

Requests

The master accepts the following request calls.

SUBSCRIBE

To enable communication with the master, the scheduler sends a SUBSCRIBE message via HTTP POST. The response contains the subscription confirmation and the framework ID to continue the conversation.

The SUBSCRIBE JSON request structure is:

POST /api/v1/scheduler HTTP/1.1

{
  "type" : "SUBSCRIBE",
  "force" : true,
  ...
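
To make the SUBSCRIBE call more concrete, here is a minimal Python sketch that builds such a request using only the standard library. It is not the book's code: the build_subscribe_request helper, the master address, and the user and framework names are assumptions for illustration, and the body shows only the framework_info fields commonly used with the v1 scheduler API. The request is constructed but not sent, because sending it needs a running master and code to read the streamed response.

```python
import json
import urllib.request


def build_subscribe_request(master_url, user, name):
    """Build (but do not send) a SUBSCRIBE call for the scheduler endpoint."""
    body = {
        "type": "SUBSCRIBE",
        "subscribe": {
            "framework_info": {"user": user, "name": name},
        },
    }
    return urllib.request.Request(
        url=master_url.rstrip("/") + "/api/v1/scheduler",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Hypothetical master address and framework identity.
request = build_subscribe_request(
    "http://localhost:5050", "root", "example-framework"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Against a live master, the request could be passed to urllib.request.urlopen. The connection then stays open and the master streams events back over it, so a real scheduler would read the response incrementally rather than all at once.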

Mesos containerizers


In this section, we'll learn about containers and Docker. We'll give an overview of the options for Mesos containerizers, and we'll also discuss advanced topics such as networking and cache.

Containers

A Linux container, or simply a container, is an environment for running applications that share a common set of resources. Because all containers share the host machine's operating system, they are very fast to create.

Container technology based on operating system virtualization has been around since 2004. In OS-level virtualization, the OS kernel creates many user containers.

Containers are individually encapsulated components that run as isolated instances on the same kernel. This makes it easy for a developer to package an application and its dependencies into a container and deploy it in another environment.

In today's DevOps era, our portable applications are easy to update and move between environments. If we have a desktop development environment, the move to the production...

Docker containerizers


We can define Docker as an open source platform that automates application deployment. We define a container as a portable, lightweight, self-sufficient module that can be deployed on a container platform.

Docker is based on the Linux Container (LXC) model. Docker is not just an underlying technology; it acts as an abstraction layer to package and containerize applications and dependencies. Imagine Docker containers as shipping containers for applications and dependencies.

Docker has three main benefits:

  • Agility: Applications are developed quickly and efficiently, because the IT Ops department can also interact flexibly with the development process
  • Control: We have version control, which keeps code authorship traceable
  • Portability: We can move from a single developer environment to a cloud environment immediately

Docker is a DevOps tool, reconciling the Development team and the IT Operations team:

  • Development team: Concerned with what is shipped on the container: code, data, apps, libraries
  • IT Operations:...

Summary


In this study case, we've covered the Mesos API for framework development. We've also studied how to use Mesos and Docker containerizers.

Mesos version 1.0.0 was released on 07/29/2016, so it's a very new technology. We could walk through the creation of a complete Mesos framework, but that is beyond the scope of this book.

The first release of Docker was on 03/13/2013 and version 1.12.0 was released on 07/14/2016. Both technologies are still new and promise good things.
