Index
A
- ACLs
- configuring / Job queue ACLs, How to do it..., How it works...
- Apache HBase
- about / Introduction
- Apache Hive / Introduction, Hive server modes and setup
- Application Master (AM) / How to do it...
- Application Master (AM) / Running a simple MapReduce program
- application master launcher (AM Launcher) / Running a simple MapReduce program
- ApplicationMasters (AM) / Preserving ResourceManager states
- ApplicationsManager (AsM) / Configuring ResourceManager components
- auditing, Hadoop
- configuration / Configuring auditing, How to do it..., How it works...
- authentication server (AS) / Configuring Kerberos server
B
- Bigtop
- URL / Installation methods
- block reports
- bucketing
C
- Capacity Scheduler
- configuring / Configuring Capacity Scheduler, How to do it...
- about / Getting ready
- job queue, mappings / Queuing mappings in Capacity Scheduler, Getting ready, How to do it...
- cluster
- nodes, adding / Adding nodes to the cluster, How to do it..., There's more...
- nodes, required / Nodes needed in the cluster, How to do it..., How it works...
- sizing, as per SLA / Sizing the cluster as per SLA, How to do it...
- Cryptographic Extension (JCE) / How it works...
D
- data
- loading, in HDFS / Loading data in HDFS, How it works...
- Datanode
- tuning / Tuning Datanode, Getting ready, How to do it..., How it works...
- troubleshooting / Datanode troubleshooting, How to do it...
- Datanode heartbeat
- configuring / Configuring Datanode heartbeat, How it works...
- Datanodes
- recovering, when disk is full / Datanode recovery – disk full, How it works...
- deleted files
- recovering / Recovering deleted files, How to do it...
- Derby database / How it works...
- disk
- encrypting, LUKS used / Encrypting disk using LUKS, How to do it..., How it works...
- disk drives
- tuning / Tuning the disk, How to do it...
- disk space
- calculation / Disk space calculations, How to do it..., How it works...
- distcp
- usage / Distcp usage, How it works...
E
- edge node
- Edge nodes
- about / Configuring HA for Edge nodes
- HA, configuring / Configuring HA for Edge nodes, How it works...
- errors
- logs, parsing / Parse logs for errors, How to do it..., How it works...
F
- Fair Scheduler
- configuring / Getting ready, How to do it...
- configuring, with pools / Fair Scheduler pools, How to do it...
- filer
- about / Namenode HA using shared storage
- Flume
- configuring / Configuring Flume, How to do it..., How it works...
- FSCK / HDFS health and FSCK, Getting ready, How to do it...
G
- Google File System (GFS)
- about / Overview of HDFS
- groups
- configuring / Configuring users and groups, How to do it...
H
- Hadoop
- about / Introduction, Overview of Hadoop Architecture
- building / Building and compiling Hadoop, Getting ready, How it works...
- compiling / Building and compiling Hadoop, Getting ready, How it works...
- reference link / Building and compiling Hadoop
- installation methods / Installation methods, How it works...
- best practices / Hadoop best practices, How it works...
- users, configuring / Configuring Hadoop users, How to do it...
- SSL, configuring / Configuring SSL in Hadoop, How to do it..., How it works...
- Kerberos, enabling / Configuring and enabling Kerberos for Hadoop, How to do it..., How it works...
- Kerberos, configuring / Configuring and enabling Kerberos for Hadoop, How to do it..., How it works...
- Hadoop Architecture
- about / Overview of Hadoop Architecture
- Hadoop cluster
- reference link / Introduction
- benchmarking / Benchmarking Hadoop cluster, There's more..., How it works...
- cost, estimating / Estimating the cost of the Hadoop cluster, How to do it...
- hardware options / Hardware and software options
- software options / Hardware and software options
- Hadoop directory, permissions
- reference link / Introduction
- Hadoop distributed file system (HDFS)
- about / Overview of Hadoop Architecture, Overview of HDFS
- data, loading / Loading data in HDFS, How it works...
- replication, configuring / Configuring HDFS replication, How to do it..., How it works...
- health, verifying / HDFS health and FSCK, Getting ready, How to do it...
- Hadoop Gateway node
- configuring / Configuring the Hadoop Gateway node, How to do it...
- about / Configuring the Hadoop Gateway node
- Hadoop streaming / Hadoop streaming, How to do it...
- HBase
- single node cluster, setting up / Setting up single node HBase cluster, How to do it...
- components, setting up / Setting up single node HBase cluster, How to do it..., How it works...
- multi-node cluster, setting up / Setting up multi-node HBase cluster, How to do it..., How it works...
- data, inserting into / Inserting data into HBase, How to do it...
- Hive, integrating with / Integration with Hive, How to do it..., How it works...
- administration commands / HBase administration commands, How to do it...
- backup / HBase backup and restore, How to do it..., How it works...
- restoring / HBase backup and restore, How to do it..., How it works...
- tuning / Tuning HBase, How to do it..., How it works...
- upgrading / HBase upgrade, How to do it...
- data, migrating from MySQL, Sqoop used / Migrating data from MySQL to HBase using Sqoop, How to do it...
- troubleshooting / HBase troubleshooting, How to do it...
- HDFS
- serving, via NFS gateway / Configuring NFS gateway to serve HDFS, How to do it...
- tuning / Tuning HDFS, How to do it...
- about / How it works...
- testing, with TestDFSIO / Benchmark 1--Testing HDFS with TestDFSIO
- encryption, at rest / HDFS encryption at Rest, How to do it..., How it works...
- HDFS balancer
- about / HDFS balancer, How it works...
- HDFS block size
- configuring / Getting ready, How it works...
- HDFS cache
- configuring / Configure HDFS cache, How to do it..., How it works...
- HDFS Image Viewer
- HDFS logs
- configuring / Configuring HDFS and YARN logs, How to do it..., How it works...
- HDFS snapshots
- about / HDFS snapshots, How to do it..., How it works...
- High Availability (HA)
- configuring, for Edge nodes / Configuring HA for Edge nodes, How it works...
- high availability (HA)
- upgrading / Rolling upgrade with HA, Getting ready, How it works...
- Hive
- operating, with ZooKeeper / Operating Hive with ZooKeeper, How to do it..., How it works...
- data, loading / Loading data into Hive, How to do it...
- partitioning / Partitioning and Bucketing in Hive, How to do it...
- bucketing / Partitioning and Bucketing in Hive, How to do it...
- metastore database / Hive metastore database, How to do it..., How it works...
- designing, with credential store / Designing Hive with credential store, How to do it...
- tuning, for performance / Hive performance tuning, How to do it..., There's more..., How it works...
- integrating, with HBase / Integration with Hive, How to do it..., How it works...
- troubleshooting / Hive troubleshooting, How to do it..., How it works...
- Hive metastore
- MySQL, using / Using MySQL for Hive metastore, How to do it..., How it works...
- Hive server
- host resolution
I
- in-transit encryption
- configuring / In-transit encryption, How to do it...
- insert with overwrite operation / How to do it...
- installation methods, Hadoop / Installation methods, How it works...
J
- JMX metrics / ResourceManager Web UI and JMX metrics, How to do it..., How it works...
- job history
- exploring, Web UI used / Job history web interface and metrics, How to do it..., How it works...
- job queues
- configuring / Configuring job queues, How to do it..., How it works...
- mappings, in Capacity Scheduler / Queuing mappings in Capacity Scheduler, Getting ready, How to do it...
- Journal node
- used, for Namenode High Availability (HA) / Namenode HA using Journal node, How to do it...
- Just a bunch of disks (JBOD) / Tuning the operating system
K
- Kerberos
- configuring, for Hadoop / Configuring and enabling Kerberos for Hadoop, Getting ready, How to do it..., How it works...
- enabling, for Hadoop / Configuring and enabling Kerberos for Hadoop, How to do it..., How it works...
- Kerberos server
- configuring / Configuring Kerberos server, Getting ready, How to do it...
- Key distribution center (KDC, subcomponent KGS) / Configuring Kerberos server
- Key Management Server (KMS) / HDFS encryption at Rest, How it works...
L
- local_policy.jar
- reference link / How it works...
- logs
- parsing, for errors / Parse logs for errors, How to do it..., How it works...
- LUKS
- used, for encrypting disk / Encrypting disk using LUKS, How to do it..., How it works...
M
- Mapred commands / YARN and Mapred commands, How to do it...
- MapReduce
- configuring, for performance / Configuring MapReduce for performance, How to do it...
- testing, by generation of small files / Benchmark 3--MapReduce testing by generating small files
- MapReduce program
- executing / Running a simple MapReduce program, Getting ready, How to do it...
- map_scripts
- reference link / How it works...
- memory
- requisites / Memory requirements, How to do it...
- requisites, per Datanode / How to do it...
- Memstore Flush / How it works...
- modes, Hive server
- standalone / Hive server modes and setup
- local metastore / Hive server modes and setup
- remote metastore / Hive server modes and setup
- multi-node cluster
- installing / Installing a multi-node cluster, Getting ready, How to do it..., How it works...
- MySQL
- using, for Hive metastore / Using MySQL for Hive metastore, How to do it..., How it works...
- URL, for downloading / How to do it...
- data, migrating to HBase, Sqoop used / Migrating data from MySQL to HBase using Sqoop, How to do it...
N
- Namenode
- saveNamespace, initiating / Initiating Namenode saveNamespace, How to do it...
- recovering / Backing up and recovering Namenode, How to do it..., How it works..., Namenode recovery, How to do it...
- backing up / Backing up and recovering Namenode, How to do it..., How it works...
- roll edits in Online mode / Namenode roll edits – online mode, How to do it...
- roll edits in Offline mode / Namenode roll edits – offline mode, How to do it...
- tuning / Tuning Namenode, Getting ready, How to do it...
- stress testing / Benchmark 2--Stress testing Namenode
- troubleshooting / Namenode troubleshooting, How to do it...
- Namenode High Availability (HA)
- shared storage, used / Namenode HA using shared storage, Getting ready, How to do it..., How it works...
- about / Namenode HA using shared storage
- Journal node, used / Namenode HA using Journal node, How to do it...
- Namenode metadata location
- Namespace identifier (namespaceID) / How it works...
- network
- tuning / Tuning the network, How to do it...
- network design
- for Hadoop cluster / Network design, How to do it...
- NFS gateway
- configuring, to serve HDFS / Configuring NFS gateway to serve HDFS, How to do it...
- NodeManager
- setting up / Setting up ResourceManager and NodeManager
- NodeManagers / Running a simple MapReduce program
- nodes
- decommissioning / Decommissioning nodes, How to do it..., How it works...
- adding, to cluster / Adding nodes to the cluster, How to do it..., There's more...
- required, in cluster / Nodes needed in the cluster, How to do it..., How it works...
- communication issues, troubleshooting / Diagnose communication issues, How to do it..., How it works...
O
- Oozie
- workflow engine, configuring / Configure Oozie and workflows, How to do it..., How it works...
- about / Configure Oozie and workflows
- operating system
P
- PAM / Designing Hive with credential store
- parameters
- in effect, obtaining / Fetching parameters which are in-effect, How to do it...
- partitioning
- Primary Namenode
- Secondary Namenode, promoting to / Promoting Secondary Namenode to Primary, Getting ready, How it works...
- principal / Configuring Kerberos server
- pseudo-distributed cluster / How it works...
- Puppet / Getting ready
Q
- Quorum Journal Manager (QJM)
- about / How it works...
- reference link / How it works...
R
- rack awareness
- configuring / Configuring rack awareness, How to do it..., How it works...
- recycle bin
- configuring / Recycle or trash bin configuration, How it works...
- resource allocations / YARN containers and resource allocations, How to do it..., How it works..., There's more...
- ResourceManager
- setting up / Setting up ResourceManager and NodeManager
- Resourcemanager
- troubleshooting / Resourcemanager troubleshooting, How it works...
- ResourceManager (RM)
- about / Running a simple MapReduce program
- components, configuring / Configuring ResourceManager components, How to do it..., How it works...
- states, preserving / Preserving ResourceManager states, Getting ready, How it works...
- Resourcemanager (RM)
- Resourcemanager HA
- ZooKeeper, used / Resourcemanager HA using ZooKeeper, How to do it...
- ResourceManager Web UI / ResourceManager Web UI and JMX metrics, How to do it..., How it works...
S
- Safe mode / How to do it...
- Secondary Namenode
- configuring / Configuring Secondary Namenode, How to do it..., How it works...
- about / Configuring Secondary Namenode
- promoting, to Primary Namenode / Promoting Secondary Namenode to Primary, Getting ready, How it works...
- service level authorization
- shared cache manager
- configuring / Configure shared cache manager, How to do it...
- shared storage
- used, for Namenode High Availability (HA) / Namenode HA using shared storage, Getting ready, How to do it..., How it works...
- single-node cluster, HDFS components
- single-node cluster, YARN components
- SLA
- cluster, sizing / Sizing the cluster as per SLA, How to do it...
- Sqoop
- used, for migrating data from MySQL to HBase / Migrating data from MySQL to HBase using Sqoop, How to do it...
- SSL
- configuring, in Hadoop / Configuring SSL in Hadoop, How to do it..., How it works...
- storage based policies
- configuring / Configuring storage based policies, How to do it...
T
- TCP/IP connection / How it works...
- TeraGen benchmarks / Benchmark 4--TeraGen, TeraSort, and TeraValidate benchmarks
- TeraSort benchmarks / Benchmark 4--TeraGen, TeraSort, and TeraValidate benchmarks
- TeraValidate benchmarks / Benchmark 4--TeraGen, TeraSort, and TeraValidate benchmarks
- TestDFSIO
- HDFS, testing / Benchmark 1--Testing HDFS with TestDFSIO
- ticket (TGT) / Configuring Kerberos server
- Timeline server / There's more...
- total cost of ownership (TCO) / Hardware and software options
- transparent huge pages (THP) / How to do it...
- trash bin
- configuring / Recycle or trash bin configuration, How it works...
U
- users
- configuring / Configuring users and groups, How to do it...
- US_export_policy.jar
- reference link / How it works...
W
- WAL size / How it works...
- Web UI
- used, for exploring YARN metrics / Job history web interface and metrics, How to do it..., How it works...
- used, for exploring job history / Job history web interface and metrics, How to do it..., How it works...
Y
- YARN
- configuring, for performance / Configuring YARN for performance, Getting ready, How to do it..., How it works...
- YARN commands / YARN and Mapred commands, How to do it...
- YARN components / There's more...
- YARN containers
- YARN history server
- configuring / Configuring YARN history server, How it works...
- YARN label-based scheduling
- configuring / YARN label-based scheduling, How to do it..., How it works...
- YARN logs
- configuring / Configuring HDFS and YARN logs, How to do it...
- YARN metrics
- exploring, Web UI used / Job history web interface and metrics, How to do it..., How it works...
- YARN Scheduler Load Simulator (SLS) / YARN SLS, How to do it..., How it works...
- Yet Another Resource Negotiator (YARN)
- about / Overview of Hadoop Architecture
- Yet Another Resource Negotiator (YARN) / Running a simple MapReduce program
Z
- ZooKeeper
- configuration / ZooKeeper configuration, How to do it...
- reference link / How to do it...
- used, for Resourcemanager HA / Resourcemanager HA using ZooKeeper, How to do it...
- Hive, operating / Operating Hive with ZooKeeper, How to do it..., How it works...
- securing / Securing ZooKeeper, How to do it..., How it works...
- ZooKeeper failover controller (ZKFC) / Namenode HA using Journal node