You're reading from  Scaling Big Data with Hadoop and Solr, Second Edition

Product type: Book
Published in: Apr 2015
ISBN-13: 9781783553396
Edition: 1st Edition
Author (1)
Hrishikesh Vijay Karambelkar

Hrishikesh Vijay Karambelkar is an innovator and an enterprise architect with 16 years of software design and development experience, specifically in the areas of big data, enterprise search, data analytics, text mining, and databases. He is passionate about architecting new software implementations for the next generation of software solutions for various industries, including oil and gas, chemicals, manufacturing, utilities, healthcare, and government infrastructure. In the past, he has authored three books for Packt Publishing: two editions of Scaling Big Data with Hadoop and Solr and one of Scaling Apache Solr. He has also worked with graph databases, and some of his work has been published at international conferences such as VLDB and ICDE.
Using Solr 1045 Patch – map-side indexing


The Apache Solr 1045 patch (SOLR-1045) gives Solr users a way to build Solr indexes using the MapReduce framework of Apache Hadoop. Once built, the index can be pushed to Solr storage. The following diagram depicts the Mapper and Reducer in Hadoop:

Each Apache Hadoop mapper transforms its input records into a set of (key, value) pairs, which are then converted into SolrInputDocument objects. The Mapper task then builds an index from these documents.
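The map-side flow described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the patch's code: the tab-separated input format, the function names `map_records`, `to_document`, and `build_shard`, and the dictionary-based "index" are all hypothetical stand-ins for Hadoop's Mapper, SolrInputDocument, and the real Lucene index that the patch produces.

```python
from collections import defaultdict


def map_records(records):
    """Map step: turn raw input lines into (key, value) pairs.

    Assumes a hypothetical "id<TAB>text" line format.
    """
    for line in records:
        key, _, value = line.partition("\t")
        yield key.strip(), value.strip()


def to_document(key, value):
    """Stand-in for a SolrInputDocument: a plain dict of field values."""
    return {"id": key, "text": value}


def build_shard(records):
    """One mapper's work: convert its records to documents and index them.

    The "index" here is a toy inverted index (term -> set of document ids),
    used only to illustrate that each mapper produces its own index.
    """
    docs = {}
    terms = defaultdict(set)
    for key, value in map_records(records):
        doc = to_document(key, value)
        docs[doc["id"]] = doc
        for term in doc["text"].lower().split():
            terms[term].add(doc["id"])
    return {"docs": docs, "terms": dict(terms)}


# Each input split would be handled by a separate mapper.
split_a = ["1\tHadoop scales storage", "2\tSolr serves search"]
shard_a = build_shard(split_a)
```

In this sketch every input split yields an independent shard, which mirrors the idea that indexing work happens inside the map tasks rather than on the Solr server.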

The Reducer de-duplicates the indexes produced by the mappers and merges them if needed. Once the indexes are created, you can load them into your Solr instance and use them for searching. You can read more about this patch at https://issues.apache.org/jira/browse/SOLR-1045.
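The reduce-side de-duplication and merge can be sketched the same way. Again, this is a minimal illustration and not the patch's implementation: the shard layout (a dict holding documents keyed by id), the function name `merge_shards`, and the "first occurrence wins" de-duplication rule are assumptions made for the example.

```python
def merge_shards(shards):
    """Reduce step: merge per-mapper shards into one index.

    Documents are de-duplicated by id; the first copy seen is kept
    (an assumed policy, chosen only for this sketch). The toy inverted
    index (term -> set of document ids) is rebuilt from the kept documents.
    """
    merged_docs = {}
    merged_terms = {}
    for shard in shards:
        for doc_id, doc in shard["docs"].items():
            if doc_id in merged_docs:
                continue  # duplicate id: skip, first occurrence wins
            merged_docs[doc_id] = doc
            for term in doc["text"].lower().split():
                merged_terms.setdefault(term, set()).add(doc_id)
    return {"docs": merged_docs, "terms": merged_terms}


# Two hypothetical mapper outputs that overlap on document "2".
shard_a = {"docs": {
    "1": {"id": "1", "text": "Hadoop scales storage"},
    "2": {"id": "2", "text": "Solr serves search"},
}}
shard_b = {"docs": {
    "2": {"id": "2", "text": "Solr serves search"},
    "3": {"id": "3", "text": "Hadoop feeds Solr"},
}}

merged = merge_shards([shard_a, shard_b])
```

After the merge, document "2" appears once, and a lookup such as `merged["terms"]["solr"]` returns the ids of every document containing that term across both shards.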

The patch is applied using the standard svn (Subversion) patching process. To apply a patch to your Solr instance, you first need to build Solr from source. The instance should be supported by Solr...
