Apache Solr 3 Enterprise Search Server

Enhance your search with faceted navigation, result highlighting, relevancy ranked sorting, and more with this book and ebook

David Smiley, Eric Pugh


Book Details

ISBN 13: 9781849516068
Paperback: 418 pages

About This Book

  • Comprehensive information on Apache Solr 3 with examples and tips so you can focus on the important parts
  • Integration examples with databases, web-crawlers, XSLT, Java & embedded-Solr, PHP & Drupal, JavaScript, Ruby frameworks
  • Advice on data modeling, on deployment considerations including security, logging, and monitoring, and on scaling Solr and measuring performance
  • An update of the best-selling title on Solr 1.4

Who This Book Is For

This book is for developers who want to learn how to use Apache Solr in their applications. Only basic programming skills are needed.

Table of Contents

Chapter 1: Quick Starting Solr
  • An introduction to Solr
  • Getting started
  • A quick tour of Solr
  • Configuration files
  • Resources outside this book
  • Summary

Chapter 2: Schema and Text Analysis
  • MusicBrainz.org
  • One combined index or separate indices
  • Schema design
  • The schema.xml file
  • Text analysis
  • Summary

Chapter 3: Indexing Data
  • Communicating with Solr
  • Solr's Update-XML format
  • Commit, optimize, and rollback
  • Sending CSV formatted data to Solr
  • The Data Import Handler Framework
  • Indexing documents with Solr Cell
  • Update request processors
  • Summary

Chapter 4: Searching
  • Your first search, a walk-through
  • Solr's generic XML structured data representation
  • Solr's XML response format
  • Request handlers
  • Query parameters
  • Query parsers and local-params
  • Query syntax (the lucene query parser)
  • The Dismax query parser (part 1)
  • Filtering
  • Sorting
  • Geospatial search
  • Summary

Chapter 5: Search Relevancy
  • Scoring
  • Dismax query parser (part 2)
  • Function queries
  • Summary

Chapter 6: Faceting
  • A quick example: Faceting release types
  • Field requirements
  • Types of faceting
  • Faceting field values
  • Faceting numeric and date ranges
  • Facet queries
  • Building a filter query from a facet
  • Excluding filters (multi-select faceting)
  • Hierarchical faceting
  • Summary

Chapter 7: Search Components
  • About components
  • The Highlight component
  • The SpellCheck component
  • Query complete / suggest
  • The QueryElevation component
  • The MoreLikeThis component
  • The Stats component
  • The Clustering component
  • Result grouping/Field collapsing
  • The TermVector component
  • Summary

Chapter 8: Deployment
  • Deployment methodology for Solr
  • Installing Solr into a Servlet container
  • Logging
  • A SearchHandler per search interface?
  • Leveraging Solr cores
  • Monitoring Solr performance
  • Securing Solr from prying eyes
  • Summary

Chapter 9: Integrating Solr
  • Working with included examples
  • Solritas, the integrated search UI
  • SolrJ: Simple Java interface
  • Using JavaScript with Solr
  • Using XSLT to expose Solr via OpenSearch
  • Accessing Solr from PHP applications
  • Ruby on Rails integrations
  • Nutch for crawling web pages
  • Maintaining document security with ManifoldCF
  • Summary

Chapter 10: Scaling Solr
  • Tuning complex systems
  • Testing Solr performance with SolrMeter
  • Optimizing a single Solr server (Scale up)
  • Moving to multiple Solr servers (Scale horizontally)
  • Combining replication and sharding (Scale deep)
  • Where next for scaling Solr?
  • Summary

What You Will Learn

  • Design a schema to include text indexing details like tokenization, stemming, and synonyms
  • Import data using various formats like CSV, XML, and from databases, and extract text from common document formats
  • Search using Solr’s rich query syntax, perform geospatial searches, and influence relevancy order
  • Enhance search results with faceting, query spell-checking, auto-completing queries, highlighted search results, and more
  • Integrate a host of technologies with Solr from the server side to client-side JavaScript, to frameworks like Drupal
  • Scale Solr by learning how to tune it and how to use replication and sharding

In Detail

If you are a developer building an app today then you know how important a good search experience is. Apache Solr, built on Apache Lucene, is a wildly popular open source enterprise search server that easily delivers powerful search and faceted navigation features that are elusive with databases. Solr supports complex search criteria, faceting, result highlighting, query-completion, query spell-check, relevancy tuning, and more.

Apache Solr 3 Enterprise Search Server is a comprehensive reference guide for every feature Solr has to offer. It serves the reader right from initiation to development to deployment. It also comes with complete running examples to demonstrate its use and show how to integrate Solr with other languages and frameworks.

Using a large set of metadata about artists, releases, and tracks, courtesy of the MusicBrainz.org project, you will have a testing ground for Solr and will learn how to import this data in various ways. You will then learn how to search this data in different ways, including with Solr's rich query syntax and by "boosting" match scores based on record data.

Finally, we'll cover various deployment considerations, including indexing strategies and performance-oriented configuration, that will enable you to scale Solr to meet the needs of a high-volume site.
