Apache Solr Enterprise Search Server - Third Edition

Enhance your searches with faceted navigation, result highlighting, relevancy-ranked sorting, and much more with this comprehensive guide to Apache Solr 4

David Smiley et al.

Book Details

ISBN 13: 9781782161363
Paperback: 432 pages

About This Book

  • An update to the popular second edition of Apache Solr 3 Enterprise Search Server, covering Solr 4’s most important new features such as SolrCloud for scaling and real-time search
  • Contains integration examples with databases, web crawlers, Hadoop, XSLT, Java and embedded Solr, PHP and Drupal, JavaScript, and Ruby frameworks
  • Learn about deployment considerations including security, logging, monitoring, running ZooKeeper, and measuring performance

Who This Book Is For

This book is for developers who want to learn how to get the most out of Solr in their applications, whether they are new to the field, have used Solr but don't know everything, or simply want a good reference. Some familiarity with basic programming concepts is helpful, but no prior experience is required.

Table of Contents

Chapter 1: Quick Starting Solr
An introduction to Solr
A few differences between Solr 4 and Solr 5
Getting started
A quick tour of Solr
Configuration files
What's next?
Resources outside this book
Summary
Chapter 2: Schema Design
Is Solr schemaless?
MusicBrainz.org
One combined index or separate indices
Schema design
The schema.xml file
Summary
Chapter 3: Text Analysis
Configuring field types
Character filters
Tokenization
Filtering
The multilingual search
Summary
Chapter 4: Indexing Data
Communicating with Solr
Solr's Update-XML format
Commit, optimize, and rollback the transaction log
Atomic updates and optimistic concurrency
Sending CSV-formatted data to Solr
The DataImportHandler framework
Indexing documents with Solr Cell
Update request processors
Summary
Chapter 5: Searching
Your first search – a walk-through
Solr's generic XML structured data representation
Solr's XML response format
Understanding request handlers
Query parameters
Query parsers and local-params
Query syntax (the lucene query parser)
The DisMax query parser – part 1
Filtering
Sorting
Joining
Spatial search
Summary
Chapter 6: Search Relevancy
Scoring
The DisMax query parser – part 2
Functions and function queries
Summary
Chapter 7: Faceting
A quick example – faceting release types
Field requirements
Types of faceting
Faceting field values
Faceting numeric and date ranges
Facet queries
Building a filter query from a facet
Pivot faceting
Excluding filters – multiselect faceting
Summary
Chapter 8: Search Components
About components
The highlight component
The SpellCheck component
Query complete/suggest
The QueryElevation component
The MoreLikeThis component
The Stats component
The Clustering component
Collapsing and expanding
The TermVector component
Summary
Chapter 9: Integrating Solr
Working with the included examples
Solritas – the integrated search UI
SolrJ – Solr's Java client API
Using JavaScript/AJAX with Solr
Using XSLT to transform XML search results
Accessing Solr from PHP applications
Ruby on Rails integrations
Nutch for crawling web pages
Solr and Hadoop
ManifoldCF – a connector framework
Document-level security
Summary
Chapter 10: Scaling Solr
Tuning complex systems is hard
Use SolrMeter to test Solr performance
Optimizing a single Solr server – scale up
Configuring Solr for near real-time search
Use SolrCloud to go big – scale wide
Summary
Chapter 11: Deployment
Deployment methodology for Solr
Installing Solr into a Servlet container
Configuring logging
A RequestHandler per search interface
Leveraging Solr cores
Setting up ZooKeeper for SolrCloud
Monitoring Solr performance
Securing Solr from prying eyes
Summary

What You Will Learn

  • Design a schema to include text indexing details such as tokenization, stemming, and synonyms
  • Import data from databases and from formats such as CSV and XML, and extract text from different document formats
  • Search using Solr's rich query syntax, perform geospatial searches, "join" relationally, and influence relevancy order
  • Build a query auto-complete/suggester capability with knowledge of the fundamental types of suggestion and ways to implement them
  • Enhance standard searches with faceting for navigation or analytics
  • Deploy Solr to production taking into account logging, security, and monitoring
  • Integrate a host of technologies with Solr including web crawlers, Hadoop, Java, JavaScript, Ruby, PHP, Drupal, and others
  • Tune Solr and use SolrCloud for horizontal scalability

In Detail

Solr is a widely popular open source enterprise search server that delivers powerful search and faceted navigation features—features that are elusive with databases. Solr supports complex search criteria, faceting, result highlighting, query-completion, query spell-checking, relevancy tuning, geospatial searches, and much more.

This book is a comprehensive resource for just about everything Solr has to offer, and it will take you from first exposure to development and deployment in no time. Even if you wish to use Solr 5, you should find the information to be just as applicable due to Solr's high regard for backward compatibility. The book includes some useful information specific to Solr 5.
