You're reading from Fast Data Processing Systems with SMACK Stack
As we saw in Chapter 6, The Manager - Apache Mesos, a Mesos framework is a layer between Mesos and the application that manages task scheduling and execution. Because the framework implementation is specific to the application, the term is often used to refer to the application itself.
Initially, a Mesos framework could communicate with the Mesos API only through libmesos, a library written in C++. To work from Java, Scala, Python, or Go, one had to develop language bindings on top of libmesos. Since Mesos version 0.19.0, an HTTP-based protocol has been included so that developers can build frameworks in the language of their choice without having to work with a C++ wrapper.
A Mesos framework has two components:
- Scheduler: Responsible for making decisions on resource offers and tracking the current cluster state. The SchedulerDriver module has these responsibilities: handling communication with the Mesos master, registering frameworks with the master...
Spark can run over Mesos in two modes: coarse-grained (default) and fine-grained (deprecated).
In coarse-grained mode, each Spark executor runs as a single Mesos task. Spark executors are sized according to the following configuration variables:
- Executor memory: spark.executor.memory
- Executor cores: spark.executor.cores
- Number of executors: spark.cores.max / spark.executor.cores
Executors are brought up when the application starts, until spark.cores.max is reached. If spark.cores.max is not set, the Spark application will reserve all resources offered to it by Mesos, so it is highly recommended to set this variable in any sort of multi-tenant cluster, including those running multiple concurrent Spark applications.
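The sizing rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Spark code: the function name and the plain dictionary standing in for the Spark configuration are hypothetical, and we assume integer division, so leftover cores that cannot fill a whole executor are ignored.

```python
def executor_count(conf):
    """Estimate how many coarse-grained executors a Spark application gets.

    `conf` is a plain dict standing in for the Spark configuration
    (a hypothetical stand-in, not a Spark API object).
    Returns None when spark.cores.max is unset, because the application
    then accepts whatever Mesos offers and no upper bound exists.
    """
    cores_max = conf.get("spark.cores.max")
    if cores_max is None:
        return None  # unbounded: the application takes every offer it receives
    executor_cores = int(conf["spark.executor.cores"])
    # Assumption: cores that cannot fill a whole executor are left unused.
    return int(cores_max) // executor_cores


# Example values chosen for illustration only.
example_conf = {
    "spark.executor.memory": "4g",
    "spark.executor.cores": "2",
    "spark.cores.max": "10",
}
bounded = executor_count(example_conf)
unbounded = executor_count({"spark.executor.cores": "2"})
```

With the example values, the application is capped at 10 cores and each executor takes 2, so five executors are started. Dropping spark.cores.max removes the cap, which is why setting it matters on a shared cluster.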
The scheduler will start the executors round-robin on the offers Mesos gives it, but there are no spread guarantees, as Mesos does not provide such guarantees on the offer stream.
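To see why round-robin placement gives no spread guarantee, consider the following toy model. It is not Spark's scheduler: the function, the offer dictionary, and the agent names are all hypothetical, and we assume each pass places at most one executor per offer until the core limit is reached.

```python
def place_executors(offers, executor_cores, cores_max):
    """Toy round-robin placement of executors over resource offers.

    `offers` maps a hypothetical agent name to the CPU cores it offers.
    Each pass over the offers places at most one executor per offer,
    and placement stops once `cores_max` would be exceeded.
    """
    remaining = dict(offers)                # free cores left in each offer
    placement = {agent: 0 for agent in offers}
    used = 0
    progress = True
    while progress and used + executor_cores <= cores_max:
        progress = False
        for agent in remaining:
            if used + executor_cores > cores_max:
                break                        # core limit reached mid-pass
            if remaining[agent] >= executor_cores:
                remaining[agent] -= executor_cores
                placement[agent] += 1
                used += executor_cores
                progress = True
    return placement


# Example offers chosen for illustration only.
placement = place_executors(
    {"agent-a": 8, "agent-b": 4, "agent-c": 2},
    executor_cores=2,
    cores_max=10,
)
```

In this model the result depends entirely on which offers happen to be available: if only agent-a had made an offer, every executor would land on that single machine.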
The benefit of the coarse-grained mode is a much lower startup overhead...
Mesos exposes an API so that developers can build custom frameworks that run on top of the infrastructure. Mesos implements the actor model for message passing, because without non-blocking communication the complexity of coordinating components grows quickly. Messages are serialized with protocol buffers.
Support for the new HTTP API was introduced in Mesos version 0.24. The Mesos master exposes the /api/v1/scheduler endpoint, with which the scheduler communicates.
For detailed information about the scheduler HTTP API, go to:
http://mesos.apache.org/documentation/latest/scheduler-http-api/
The master accepts the following request calls.
To enable communication with the master, the scheduler sends a SUBSCRIBE message via HTTP POST. The response contains the subscription confirmation and the framework ID to continue the conversation.
The SUBSCRIBE JSON request structure is:
POST /api/v1/scheduler HTTP/1.1 { "type" : "SUBSCRIBE", "force" : true, ...
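As a rough sketch of how a scheduler might prepare such a call, the following Python builds the POST request without sending it. The master address, user, and framework name are placeholders, and the body layout (a framework_info object nested under subscribe) follows our reading of the scheduler HTTP API documentation linked above rather than the truncated request shown here, so check it against the Mesos version in use.

```python
import json
import urllib.request


def build_subscribe_request(master_url, user, name):
    """Build, but do not send, a SUBSCRIBE call for the scheduler endpoint.

    Assumption: the body nests framework_info under "subscribe", as in the
    scheduler HTTP API documentation; verify against your Mesos version.
    """
    body = {
        "type": "SUBSCRIBE",
        "subscribe": {
            "framework_info": {"user": user, "name": name},
        },
    }
    return urllib.request.Request(
        url=master_url.rstrip("/") + "/api/v1/scheduler",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Placeholder master address and framework identity, for illustration only.
request = build_subscribe_request(
    "http://mesos-master.example:5050", "foo", "Example HTTP Framework"
)
```

Passing the request to urllib.request.urlopen would send it to a live master. Building it separately keeps the sketch runnable without a cluster and makes the payload easy to inspect.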
In this section, we'll learn about containers and Docker. We'll give an overview of the Mesos containerizer options and also discuss advanced topics such as networking and cache.
A Linux container, or simply a container, is an environment for running applications that share a common set of resources. Because all containers share the host machine's operating system, they are very fast to create.
Container technology based on operating system virtualization has been around since 2004. In OS-level virtualization, the OS kernel creates many user containers.
The containers are individually encapsulated components that run as isolated instances on the same kernel. A developer can simply package the application and its dependencies into a container and deploy it in another environment.
In today's DevOps era, our portable applications are easy to update and move between environments. If we have a desktop development environment, the move to the production...
We can define Docker as an open source platform that automates application deployment. We define a container as a portable, lightweight, self-sufficient module to be deployed on a container platform.
Docker is based on the Linux Container (LXC) model. Docker is not just an underlying technology; it acts as an abstraction layer to package and containerize applications and dependencies. Imagine Docker containers as shipping containers for applications and dependencies.
Docker has three main benefits:
- Agility: Applications are developed quickly and efficiently, because the IT Ops department can also interact flexibly with the development process
- Control: Version control is built in, so code authorship can be tracked
- Portability: We can move from a single developer environment to a cloud environment immediately
Docker is a DevOps tool, reconciling the Development team and the IT Operations team:
- Development team: Concerned with what is shipped on the container: code, data, apps, libraries
- IT Operations:...
In this case study, we've covered the Mesos API for framework development. We've also studied how to use Mesos and Docker containerizers.
Mesos version 1.0.0 was released on 07/29/2016, so it's a very new technology. We could walk through the creation of a complete Mesos framework, but that is beyond the scope of this book.
The first release of Docker was on 03/13/2013 and version 1.12.0 was released on 07/14/2016. Both technologies are still new and promise good things.