Scaling Big Data with Hadoop and Solr, Second Edition

Product type: Book
Published in: Apr 2015
ISBN-13: 9781783553396
Pages: 166
Author: Hrishikesh Vijay Karambelkar

Common problems and their solutions


The following is a list of common problems and their solutions:

  • When I try to format the HDFS namenode, I get the exception java.io.IOException: Incompatible clusterIDs between the namenode and datanode. What should I do?

    This issue usually appears when you have an older cluster and you format a new namenode while the datanodes still point to the old cluster ID. It can be handled in one of the following ways:

    1. Delete the DFS data folder (you can find its location in hdfs-site.xml) and restart the cluster.

    2. Modify the VERSION file of HDFS, usually located at <HDFS-STORAGE-PATH>/hdfs/datanode/current/, so that the datanode's clusterID matches the namenode's.

    3. Format the namenode with the problematic datanode's cluster ID:

        $ hdfs namenode -format -clusterId <cluster-id>
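    Before picking one of these fixes, it helps to confirm the mismatch. The following sketch (not from the book; the storage paths are assumptions, so substitute the real dfs.namenode.name.dir and dfs.datanode.data.dir values from your hdfs-site.xml) compares the clusterID recorded on each side:

    ```shell
    # Read the clusterID property out of an HDFS storage directory's
    # VERSION file (a Java properties file with a clusterID=CID-... line).
    cluster_id() {
      grep '^clusterID=' "$1/current/VERSION" | cut -d= -f2
    }

    # Compare the namenode's and a datanode's clusterIDs.
    # $1: namenode storage dir, $2: datanode storage dir (both assumed paths)
    check_cluster_ids() {
      nn_id=$(cluster_id "$1")
      dn_id=$(cluster_id "$2")
      if [ "$nn_id" = "$dn_id" ]; then
        echo "clusterIDs match: $nn_id"
      else
        echo "clusterID mismatch: namenode=$nn_id datanode=$dn_id"
      fi
    }

    # Example: check_cluster_ids /data/hdfs/namenode /data/hdfs/datanode
    ```

    If the two IDs differ, either edit the datanode's VERSION file to the namenode's value, or reformat the namenode with the datanode's value as shown in option 3.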
      
  • My Hadoop instance does not start with the ./start-all.sh script, and when I try to access the web application, it shows a "page not found" error. What is wrong?

    This can happen for a number of reasons. To understand the issue, look at the Hadoop logs first. Typically, the logs are in the /var/log folder if the precompiled binaries were installed as the root user; otherwise, they are inside the Hadoop installation folder.
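    A quick way to surface the relevant log lines is to grep the daemon logs for errors. This is only a sketch; the default directory below is an assumption, so substitute /var/log/hadoop or $HADOOP_HOME/logs as appropriate for your install:

    ```shell
    # Print the 20 most recent ERROR/FATAL lines from all Hadoop daemon
    # logs under the given directory.
    # $1: Hadoop log directory (assumed path; adjust for your install)
    recent_errors() {
      grep -rh --include='*.log' -e ' ERROR ' -e ' FATAL ' "$1" | tail -n 20
    }

    # Example: recent_errors /var/log/hadoop
    ```

    A bind exception here usually means a port conflict, while a storage-directory exception points back to the clusterID problem above.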

  • I have set up an N-node cluster and started it with ./start-all.sh, but I do not see all the nodes in the YARN/NameNode web application. Why?

    This, again, can happen for multiple reasons. Verify the following:

    1. Can you reach (connect to) each of the cluster nodes from the namenode by IP address or machine name? If not, add an entry for each node to the /etc/hosts file.

    2. Does ssh login work without a password? If not, put the authorized keys in place to enable passwordless logins.

    3. Is the datanode/nodemanager running on each of the nodes, and can each node connect to the namenode/AM? You can validate this over ssh from the node running the namenode/AM.

    4. If all of these check out, look at the logs for exceptions, as explained in the previous question.

    5. Take specific action based on the errors or exceptions found in the logs.
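    Checks 1 through 3 can be folded into one probe per worker. This is a sketch with assumed host and daemon names, not the book's procedure; BatchMode makes ssh fail fast instead of prompting for a password, so a failure here also flags check 2:

    ```shell
    # Probe one worker: resolve/connect over passwordless ssh, then see
    # whether the HDFS/YARN worker daemons show up in jps output.
    # $1: worker hostname (assumed to be listed in your slaves file)
    check_worker() {
      ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" jps 2>/dev/null \
        | grep -Eq 'DataNode|NodeManager' \
          && echo "$1: worker daemons up" \
          || echo "$1: check failed (hosts entry, ssh keys, or daemons)"
    }

    # Example: for h in worker1 worker2; do check_worker "$h"; done
    ```

    Run this from the namenode host; any worker that reports a failure is the place to start reading logs.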
