Instant Apache Solr for Indexing Data How-to [Instant]


This title is available as an eBook only
eBook: $19.99
Formats: PDF, PacktLib, ePub and Mobi formats
  • Learn something new in an Instant! A short, fast, focused guide delivering immediate results
  • Take the most basic schema and extend it to support multi-lingual, multi-field searches
  • Make Solr pull data from a variety of existing sources
  • Discover different pathways to acquire and normalize data and content

Book Details

Language : English
eBook : 90 pages
Release Date : June 2013
ISBN : 1782164847
ISBN 13 : 9781782164845
Author(s) : Alexandre Rafalovitch
Topics and Technologies : All Books, Big Data and Business Intelligence, Instant, Open Source

Table of Contents

Preface
  • Instant Apache Solr for Indexing Data How-to
    • Creating your first collection (Simple)
    • Running several collections at once (Simple)
    • Importing multivalued fields (Simple)
    • Using Solr's XML format (Simple)
    • Indexing text (Intermediate)
    • Indexing text – in depth (Advanced)
    • Indexing binary content on the server (Intermediate)
    • Pulling data from XML with DataImportHandler (Intermediate)
    • Pulling data from the database with DIH (Intermediate)
    • Commits and near real-time optimizations (Advanced)
    • Using the UpdateRequestProcessor plugins (Intermediate)
    • Client indexing with Java (Intermediate)
    • Atomic updates (Intermediate)
    • Indexing multiple languages (Advanced)

Alexandre Rafalovitch

Alexandre Rafalovitch is an IT professional with more than 20 years of experience. Throughout his career, he has worked as a software developer, as a QA engineer, in a senior tech support role, and as a webmaster. Alexandre has worked with Java, C#, Python, and even XQuery, building software and websites (both the backend and frontend components). He is familiar with the issues of processing and presenting multilingual content in many languages, including Russian, Chinese, and Arabic. Alexandre has developed several small open source projects of his own and has contributed to several more, including W3C Jigsaw and Apache Solr. He has authored several industry and academic publications and has presented at JavaOne twice. Alexandre is currently working for the United Nations; however, the views expressed herein are those of the author and do not necessarily reflect the views of the United Nations.


Errata

- 2 submitted: last submission 09 Oct 2013

Errata type: Code | Page: 48 | Date: October 08, 2013

This code snippet

<entity name="vacancy-alerts"
        dataSource="dbDS"
        query="select * from ALERTS"
        transformer="TemplateTransformer,DateFormatTransformer"
        preImportDeleteQuery="type:vacancy-alert">
  <field name="id" column="ID" />
  <field name="addr_from" column="FROM" />
  <field name="addr_to" column="TO" />
  <field name="subject" column="SUBJECT" />
  <field name="message" column="MESSAGE" />
  <field name="date" column="DATE" dateTimeFormat="dd MMM yyyy" />
  <field column="type" template="vacancy-alert" />
</entity>

Should be

<entity name="vacancy-alerts"
        dataSource="dbDS"
        query="select * from ALERTS"
        transformer="TemplateTransformer,DateFormatTransformer"
        preImportDeleteQuery="type:vacancy-alerts">
  <field name="id" column="ID" />
  <field name="addr_from" column="FROM" />
  <field name="addr_to" column="TO" />
  <field name="subject" column="SUBJECT" />
  <field name="message" column="MESSAGE" />
  <field name="date" column="DATE" dateTimeFormat="dd MMM yyyy" />
  <field column="type" template="vacancy-alerts" />
</entity>
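
In this entity, the value used in preImportDeleteQuery and the value written by the type template have to agree. Otherwise a clean import would delete a different set of documents than the one it re-creates. A minimal Python sketch of that consistency check follows. It assumes a trimmed copy of the corrected entity held in a string; the function name delete_query_matches_template is hypothetical and is not part of Solr or the book's code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the corrected entity from the erratum (illustrative only).
ENTITY_XML = """
<entity name="vacancy-alerts"
        dataSource="dbDS"
        query="select * from ALERTS"
        transformer="TemplateTransformer,DateFormatTransformer"
        preImportDeleteQuery="type:vacancy-alerts">
  <field name="id" column="ID" />
  <field name="date" column="DATE" dateTimeFormat="dd MMM yyyy" />
  <field column="type" template="vacancy-alerts" />
</entity>
"""


def delete_query_matches_template(entity_xml: str) -> bool:
    """Return True if preImportDeleteQuery ("field:value") targets the same
    value that a templated <field> writes into that field.

    Hypothetical helper: it only handles the simple "field:value" query form.
    """
    entity = ET.fromstring(entity_xml)
    # Split "type:vacancy-alerts" into the field name and the expected value.
    field_name, _, value = entity.get("preImportDeleteQuery", "").partition(":")
    for field in entity.findall("field"):
        if field.get("column") == field_name and field.get("template") is not None:
            return field.get("template") == value
    # No templated field found for the field named in the delete query.
    return False


if __name__ == "__main__":
    print(delete_query_matches_template(ENTITY_XML))
```

Running the check against a copy in which only one of the two values was changed should return False, which is the kind of mismatch to watch for when applying the correction by hand.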

Errata type: Code | Page: 48 | Date: October 08, 2013

This line in Step 5.1

Run the Execute command from Admin WebUI, and make sure to check the Clean and Commit checkboxes and select vacancy-alert from the Entity dropdown

Should be

Run the Execute command from Admin WebUI, and make sure to check the Clean and Commit checkboxes and select vacancy-alerts from the Entity dropdown


What you will learn from this book

  • Produce a basic Solr schema ready for experimentation and exploration
  • Run several collections on one Solr server
  • Import, search, and facet simple and multi-valued fields
  • Create your own field type analyzer chains for ultimate indexing flexibility
  • Detect, index, and partition multi-lingual content
  • Use CSV, XML, JSON, and binary formats to get data into Solr
  • Pull data from external files and databases using DataImportHandler
  • Write a Java client using the SolrJ library in both remote and embedded mode
  • Change data already indexed using atomic updates
  • Reshape incoming data with UpdateRequestProcessors
  • Control the visibility of data with soft and hard commits

In Detail

Content and data searching is a very important part of the modern user experience, and before something can be searched, it has to be indexed. Indexing is a hidden part of the process that has a surprisingly strong impact on the overall user experience. From speed, to faceting, to multilingual support, everything depends on correct indexing.

Instant Apache Solr for Indexing Data How-to is an example-driven guide that will take you on a journey from the basic collection of data to a multi-lingual, multi-field, multi-type schema. By the end of the book, you will know how to get your data ready for searches and how to tune the process to achieve the required search use-cases.

Instant Apache Solr for Indexing Data How-to is a friendly, practical guide that will show you how to index your data with Solr. It explains how Solr’s basic building blocks work and fit together. You will then explore additional settings, pipelines, and configuration changes to achieve increasingly complex goals, and learn when to push data into Solr and when to have Solr pull it. Finally, you will master indexing textual and binary content before enabling multilingual content to be searched.

Approach

This book is filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. It is written in a friendly, practical manner, with recipes covering important indexing techniques and methods using Apache Solr.

Who this book is for

This book is for developers who want to dive deeper into Solr. Regardless of whether you are just starting with Solr or have already built your first collection by copying and modifying examples, this book will take you through the complicated steps of indexing your data with Solr.
