You're reading from  Securing Hadoop

Product type: Book
Published in: Nov 2013
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781783285259
Edition: 1st Edition
Author: Sudheesh Narayanan

Sudheesh Narayanan is a Technology Strategist and Big Data Practitioner with expertise in technology consulting and implementing Big Data solutions. With over 15 years of IT experience in Information Management, Business Intelligence, Big Data & Analytics, and Cloud & J2EE application development, he has provided expertise in architecting, designing, and developing Big Data products, Cloud management platforms, and highly scalable platform services. His expertise in Big Data includes Hadoop and its ecosystem components, NoSQL databases (MongoDB, Cassandra, and HBase), Text Analytics (GATE and OpenNLP), Machine Learning (Mahout, Weka, and R), and Complex Event Processing. Sudheesh is currently working with Genpact as the Assistant Vice President and Chief Architect – Big Data, with a focus on driving innovation and building Intellectual Property assets, frameworks, and solutions. Prior to Genpact, he was the co-inventor and Chief Architect of the Infosys BigDataEdge product.

Automation of a secured Hadoop cluster deployment


Let us have a look at some of the most important tools.

Cloudera Manager

Cloudera Manager is another of the most popular Hadoop management and deployment tools. Some of the key features of Cloudera Manager with respect to securing a Hadoop cluster are:

  • Cloudera Manager automates the entire Hadoop cluster setup and enables an automated setup of a secure Hadoop cluster with Kerberos. It automatically sets up the Keytab file on all the slave nodes and updates the Hadoop configuration with the required Keytab locations and service principal details. It updates the configuration files (core-site.xml, hdfs-site.xml, mapred-site.xml, oozie-site.xml, hue.ini, and taskcontroller.cfg) without any manual intervention.

  • It supports role-based administration, in which read-only administrators monitor the cluster while other administrators can change the deployments.

  • It enables administrators to configure alerts specific to user activity and access. This can be leveraged for security incident and event monitoring.

  • Cloudera Manager can send events about security incidents in Hadoop to enterprise SIEM tools using SNMP.

  • It can integrate user credentials with Active Directory using LDAP.

    Note

    More details on Cloudera Manager are available at the following URL: http://www.cloudera.com/content/cloudera/en/products/cloudera-manager.html.
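To make the first feature above more concrete, the following is a minimal sketch of the kind of Kerberos-related entries that end up in core-site.xml and hdfs-site.xml. It is not Cloudera Manager's own mechanism; it only renders a small set of properties in the Hadoop *-site.xml layout. The property names are standard Hadoop settings, while the realm (EXAMPLE.COM), the Keytab path, and the helper name render_site_xml are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Standard Hadoop property names. The values for the keytab path and the
# principal (realm EXAMPLE.COM) are hypothetical placeholders.
CORE_SITE = {
    "hadoop.security.authentication": "kerberos",
    "hadoop.security.authorization": "true",
}

HDFS_SITE = {
    "dfs.block.access.token.enable": "true",
    "dfs.namenode.keytab.file": "/etc/hadoop/conf/hdfs.keytab",
    "dfs.namenode.kerberos.principal": "hdfs/_HOST@EXAMPLE.COM",
    "dfs.datanode.keytab.file": "/etc/hadoop/conf/hdfs.keytab",
    "dfs.datanode.kerberos.principal": "hdfs/_HOST@EXAMPLE.COM",
}


def render_site_xml(properties):
    """Render a dict of properties in the Hadoop *-site.xml layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_xml(CORE_SITE))
    print(render_site_xml(HDFS_SITE))
```

The _HOST token in a principal is substituted by Hadoop with the fully qualified host name of each node, which is why the same configuration can be pushed to every node unchanged.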

Zettaset

Zettaset (http://www.zettaset.com/) offers Zettaset Orchestrator, a product that provides seamless, secured Hadoop deployment and management. Zettaset doesn't provide a Hadoop distribution of its own, but works with all distributions, such as Cloudera, Hortonworks, and Apache Hadoop. Some of the key features of Zettaset Orchestrator are:

  • It provides an automated deployment of a secured Hadoop cluster

  • It hardens the entire Hadoop deployment from an enterprise perspective to address policy, compliance, access control, and risk management within the Hadoop cluster environment

  • It integrates seamlessly with an existing enterprise security policy framework using LDAP and Active Directory (AD)

  • It provides centralized configuration management, logging, and auditing

  • It provides role-based access controls (RBACs) and enables Kerberos to be seamlessly integrated with the rest of the ecosystem

All other platform management tools, such as Ambari and Greenplum Hadoop Deployment Manager, need manual setup to establish a secured Hadoop cluster. The Keytab files, service principals, and configuration files have to be deployed manually on all nodes.
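As a rough illustration of what this manual work involves, the sketch below builds (but does not run) the per-node commands an administrator might issue with MIT Kerberos: create a service principal, export its key to a Keytab file, and copy that file to the node. The host names, realm, directories, and the function name kerberos_setup_commands are hypothetical; the exact steps depend on the KDC and distribution in use.

```python
def kerberos_setup_commands(hosts, service, realm, keytab_dir):
    """Return the shell commands an administrator would run by hand.

    Nothing is executed; the commands are only built as strings.
    """
    commands = []
    for host in hosts:
        principal = f"{service}/{host}@{realm}"
        keytab = f"{keytab_dir}/{service}.{host}.keytab"
        # Create the service principal with a random key.
        commands.append(f'kadmin -q "addprinc -randkey {principal}"')
        # Export the key into a keytab file for this host.
        commands.append(f'kadmin -q "xst -k {keytab} {principal}"')
        # Copy the keytab to the node that runs the service
        # (the destination path is an assumed location).
        commands.append(f"scp {keytab} {host}:/etc/hadoop/conf/{service}.keytab")
    return commands


if __name__ == "__main__":
    # Hypothetical host names and realm, for illustration only.
    nodes = ["node1.example.com", "node2.example.com"]
    for command in kerberos_setup_commands(nodes, "hdfs", "EXAMPLE.COM", "/tmp/keytabs"):
        print(command)
```

With three commands per service per node, even a small cluster running several services needs dozens of such steps, which is the effort the automated tools described above are meant to remove.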
