In my last post I talked about how using a resource management platform can allow your Big Data workloads to be more efficient with less resources. In this post I want to continue the discussion with a specific resource management platform, which is Mesos.
Introduction to Mesos
Mesos is an Apache top-level project that provides an abstraction to your datacenter resources and an API to program against these resources to launch and manage your workloads. Mesos is able to manage your CPU, memory, disk, ports and other resources that the user can custom defines. Every application that wants to use resources in the datacenter to run tasks talks with Mesos is called a scheduler. It uses the scheduler API to receive resource offers and each scheduler can decide to use the offer, decline the offer to wait for future ones, or hold on the offer for a period of time to combine the resources. Mesos will ensure to provide fairness amongst multiple schedulers so no one scheduler can overtake all the resources.
So how does your Big data frameworks benefit specifically by using Mesos in your datacenter?
Autopilot your Big data frameworks
The first benefit of running your Big data frameworks on top of Mesos, which by abstracting away resources and providing an API to program against your datacenter, is that it allows each Big data framework to self-manage itself without minimal human intervention.
How does the Mesos scheduler API provide self management to frameworks? First we should understand a little bit more what does the scheduler API allows you to do.
The Mesos scheduler API provides a set of callbacks whenever the following events occurs: New resources available, task status changed, slave lost, executor lost, scheduler registered/disconnected, etc. By reacting to each event with the Big data framework's specific logic it allows frameworks to deploy, handle failures, scale and more.
Using Spark as an example, when a new Spark job is launched it launches a new scheduler waiting for resources from Mesos. When new resources are available it deploys Spark executors to these nodes automatically and provide Spark task information to these executors and communicate the results back to the scheduler. When some reason the task is terminated unexpectedly, the Spark scheduler receives the notification and can automatically relaunch that task on another node and attempt to resume the job. When the machine crashes, the Spark scheduler is also notified and can relaunch all the executors on that node to other available resources. Moreover, since the Spark scheduler can choose where to launch the tasks it can also choose the nodes that provides the most data locality to the data it is going to process. It can also choose to deploy the Spark executors in different racks to have more higher availability if it's a long running Spark streaming job.
As you can see, by programming against an API allows lots of flexibility and self-managment for the Big data frameworks, and saves a lot of manually scripting and automation that needs to happen.
Manage your resources among frameworks and users
When there are multiple Big data frameworks sharing the same cluster, and each framework is shared with multiple users, providing a good policy around ensuring the important users and jobs gets executed becomes very important.
Mesos allows you to specify roles, where multiple frameworks can belong to a role. Mesos then allows operators to specify weights among these roles, so that the fair share is enforced by Mesos to provide the resources according to the weight specified. For example, one might provide 70% resources to Spark and 30% resources to general tasks with the weighted roles in Mesos. Mesos also allows reserving a fixed amount of resources per agent to a specific role. This ensures that your important workload is guaranteed to have enough resources to complete its workload.
There are more features coming to Mesos that also helps multi-tenancy. One feature is called Quota where it ensures over the whole cluster that a certain amount of resources is reserved instead of per agent. Another feature is called dynamic reservation, which allows frameworks and operators to reserve a certain amount of resources at runtime and can unreserve them once it's no longer necessary.
Optimize your resources among frameworks
Using Mesos also boosts your utilization, by allowing multiple tasks from different frameworks to use the same cluster and boosts utilization without having separate clusters.
There are a number of features that are currently being worked on that will even boost the utilization even further.
The first feature is called oversubscription, which uses the tasks runtime statistics to estimate the amount of resources that is not being used by these tasks, and offers these resources to other schedulers so more resources is actually being utilized. The oversubscription controller also monitors the tasks to make sure when the task is being affected by sharing resources, it will kill these tasks so it's no longer being affected.
Another feature is called optimistic offers, which allows multiple frameworks to compete for resources. This helps utilization by allowing faster scheduling and allows the Mesos scheduler to have more inputs to choose how to best schedule its resources in the future.
As you can see Mesos allows your Big data frameworks to be self-managed, more efficient and allows optimizations that are only possible by sharing the same resource management. If you're curious how to get started you can follow at the Mesos website or Mesosphere website that provides even simpler tools to use your Mesos cluster.
About the author
Timothy Chen is a distributed systems engineer and entrepreneur. He works at Mesosphere and can be found on Github @tnachen.