Securing Hadoop

Product type Book
Published in Nov 2013
Publisher Packt
ISBN-13 9781783285259
Pages 116 pages
Edition 1st Edition
Author Sudheesh Narayan

Mapping of security technologies with the reference architecture


We have looked at the various commercial and open source tools that help secure the Big Data platform. This section maps these technologies to the overall reference architecture and shows where each one fits.

Infrastructure security

Physical security needs to be enforced manually. Unauthorized access to a distributed cluster, however, is prevented by deploying Kerberos security in the cluster. Kerberos ensures that services and users confirm their identity with the Key Distribution Center (KDC) before they are given access to the infrastructure services. Project Rhino aims to extend this further by providing a token-based authentication framework.
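As a rough illustration of the Kerberos point above, a cluster's core-site.xml normally carries the hadoop.security.authentication property, which is simple by default and kerberos on a secured cluster. The Python sketch below parses a sample configuration and reports whether Kerberos is switched on. The sample XML and the helper names read_properties and kerberos_enabled are assumptions made for illustration, not part of any Hadoop tooling.

```python
import xml.etree.ElementTree as ET

# Sample core-site.xml content; the values are illustrative only.
CORE_SITE = """<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return the name/value pairs of a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def kerberos_enabled(props):
    """True when authentication is set to kerberos (a missing value counts as simple)."""
    return props.get("hadoop.security.authentication", "simple").lower() == "kerberos"


props = read_properties(CORE_SITE)
print("Kerberos enabled:", kerberos_enabled(props))
```

A check like this only confirms what the configuration file says; it does not prove that the KDC, principals, and keytabs behind it are set up correctly.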

OS and filesystem security

Filesystem security is enforced by providing a secured virtualization layer on top of the existing OS filesystem using file encryption. Files written to disk are encrypted, and files read from disk are decrypted on the fly. These features are provided by tools such as eCryptfs and zNcrypt. SELinux also provides significant protection by hardening the OS.

Application security

Tools such as Sentry and HUE provide a platform for secure access to Hadoop. They integrate with LDAP, so they fit into an existing enterprise directory.

Network perimeter security

One of the common techniques to ensure perimeter security in Hadoop is to isolate the Hadoop cluster from the rest of the enterprise. However, users still need to access the cluster. Tools such as Knox and HttpFS provide a proxy layer through which end users can remotely connect to the Hadoop cluster, submit jobs, and access the filesystem.
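HttpFS exposes the WebHDFS REST API over HTTP, so a client outside the cluster only needs to reach the gateway host rather than the individual nodes. A minimal Python sketch of how such a request URL is put together follows. The host name, the user, and the helper webhdfs_url are hypothetical, port 14000 is assumed as the usual HttpFS default, and the user.name parameter applies to pseudo authentication only (a Kerberized gateway would expect SPNEGO credentials instead).

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, user, port=14000, scheme="http"):
    """Build a WebHDFS-style REST URL as served by an HttpFS gateway."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # op selects the filesystem operation, for example LISTSTATUS or OPEN.
    query = urlencode({"op": op, "user.name": user})
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical gateway host and user, for illustration only.
print(webhdfs_url("httpfs.example.com", "/user/alice/data", "LISTSTATUS", "alice"))
```

Because every request goes through the one gateway address, the firewall around the cluster only has to admit traffic to that host and port.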

Data masking and encryption

To protect data in motion and at rest, encryption and masking techniques are deployed. Tools such as IBM Optim and Dataguise provide large-scale data masking for enterprise data. To protect data at rest in Hadoop, we deploy block-level encryption. Intel's distribution supports the encryption and compression of files. Project Rhino enables block-level encryption similar to Dataguise and Gazzang.
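To make the idea of masking concrete, the Python sketch below applies two common approaches to a single record: partial masking that keeps only the last few characters, and keyed pseudonymization that replaces a value with a deterministic token. It is not how IBM Optim or Dataguise work internally; the record, the key, and the helper names mask_keep_last and pseudonymize are all assumptions made for illustration.

```python
import hashlib
import hmac


def mask_keep_last(value, visible=4, fill="*"):
    """Hide all but the last `visible` characters of a sensitive string."""
    if visible <= 0 or len(value) <= visible:
        # Too short to reveal anything safely, so mask the whole value.
        return fill * len(value)
    hidden = len(value) - visible
    return fill * hidden + value[hidden:]


def pseudonymize(value, key):
    """Replace a value with a keyed, deterministic token (truncated HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


# Hypothetical record and demo key; a real key would come from a key manager.
record = {"name": "Alice Smith", "card": "4111111111111111", "city": "Pune"}
key = b"demo-key-not-for-production"

masked = dict(record)
masked["card"] = mask_keep_last(record["card"])
masked["name"] = pseudonymize(record["name"], key)
print(masked)
```

The pseudonymized name stays the same across records for a given key, so joins and counts on the masked data still work, while the original value cannot be read back without the key.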

Authentication and authorization

Authentication and authorization have matured significantly, and tools such as Zettaset Orchestrator and Project Rhino enable integration with enterprise systems for both.

Audit logging, security policies, and procedures

Common security audit logging for user access to the Hadoop cluster is enabled by tools such as Cloudera Manager. Cloudera Manager can also generate alerts and events based on the configured organizational policies. Similarly, Intel's manager and Zettaset Orchestrator enforce security policies in the cluster in line with organizational policies.
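The kind of policy-driven alerting described above can be pictured with a small Python sketch that flags users who rack up too many denied requests. This is not Cloudera Manager's API or event format; the event fields, the threshold, and the helper denied_access_alerts are invented for the example.

```python
from collections import Counter

# Hypothetical audit events; the field names are invented for this sketch.
EVENTS = [
    {"user": "alice", "cmd": "open", "src": "/data/sales", "allowed": True},
    {"user": "bob", "cmd": "open", "src": "/data/hr", "allowed": False},
    {"user": "bob", "cmd": "delete", "src": "/data/hr", "allowed": False},
    {"user": "bob", "cmd": "open", "src": "/data/hr/pay", "allowed": False},
]


def denied_access_alerts(events, threshold=3):
    """Return the users whose denied requests reach the policy threshold."""
    denied = Counter(e["user"] for e in events if not e["allowed"])
    return sorted(user for user, count in denied.items() if count >= threshold)


print(denied_access_alerts(EVENTS))
```

In practice the threshold and the set of watched operations would come from the organization's security policy rather than being hard-coded.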

Security Incident and Event Monitoring

Detecting security incidents and monitoring events in a Big Data platform is essential. Tools such as the open source OSSEC and IBM Guardium enable a secured Hadoop cluster to detect security incidents and integrate easily with enterprise SIEM tools.
