Table of Contents
- Preface
- Chapter 1: Quick Starting Solr
- Chapter 2: Schema and Text Analysis
- Chapter 3: Indexing Data
- Chapter 4: Basic Searching
- Chapter 5: Enhanced Searching
- Chapter 6: Search Components
- Chapter 7: Deployment
- Chapter 8: Integrating Solr
- Chapter 9: Scaling Solr
- Index

- Chapter 1: Quick Starting Solr
- An introduction to Solr
- Lucene, the underlying engine
- Solr, the Server-ization of Lucene
- Comparison to database technology
- Getting started
- The last official release or fresh code from source control
- Testing and building Solr
- Solr's installation directory structure
- Solr's home directory
- How Solr finds its home
- Deploying and running Solr
- A quick tour of Solr!
- Loading sample data
- A simple query
- Some statistics
- The schema and configuration files
- Solr resources outside this book
- Summary
- Chapter 2: Schema and Text Analysis
- MusicBrainz.org
- One combined index or multiple indices
- Problems with using a single combined index
- Schema design
- Step 1: Determine which searches are going to be powered by Solr
- Step 2: Determine the entities returned from each search
- Step 3: Denormalize related data
- Denormalizing—"one-to-one" associated data
- Denormalizing—"one-to-many" associated data
- Step 4: (Optional) Omit the inclusion of fields only used in search results
- The schema.xml file
- Field types
- Field options
- Field definitions
- Sorting
- Dynamic fields
- Using copyField
- Remaining schema.xml settings
- Text analysis
- Configuration
- Experimenting with text analysis
- Tokenization
- WordDelimiterFilterFactory
- Stemming
- Synonyms
- Index-time versus Query-time, and to expand or not
- Stop words
- Phonetic sounds-like analysis
- Partial/Substring indexing
- Chapter 3: Indexing Data
- Communicating with Solr
- Direct HTTP or a convenient client API
- Data streamed remotely or from Solr's filesystem
- Data formats
- Using curl to interact with Solr
- Remote streaming
- Sending XML to Solr
- Deleting documents
- Commit, optimize, and rollback
- Direct database and XML import
- Getting started with DIH
- The DIH development console
- DIH documents, entities
- DIH fields and transformers
- Indexing documents with Solr Cell
- Extracting binary content
- Configuring Solr
- Extracting karaoke lyrics
- Indexing richer documents
- Chapter 4: Basic Searching
- Your first search, a walk-through
- Solr's generic XML structured data representation
- Solr's XML response format
- Query parameters
- Parameters affecting the query
- Result paging
- Output related parameters
- Diagnostic query parameters
- Query syntax
- Matching all the documents
- Mandatory, prohibited, and optional clauses
- Sub-expressions (aka sub-queries)
- Limitations of prohibited clauses in sub-expressions
- Field qualifier
- Phrase queries and term proximity
- Wildcard queries
- Score boosting
- Existence (and non-existence) queries
- Escaping special characters
- Filtering
- Sorting
- Request handlers
- Scoring
- Query-time and index-time boosting
- Troubleshooting scoring
- Chapter 5: Enhanced Searching
- Function queries
- An example: Scores influenced by a lookupcount
- Field references
- Function reference
- Mathematical primitives
- Miscellaneous math
- ord and rord
- An example with scale() and lookupcount
- Using logarithms
- Using inverse reciprocals
- Using reciprocals and rord with dates
- Dismax Solr request handler
- Lucene's DisjunctionMaxQuery
- Configuring queried fields and boosts
- Limited query syntax
- Boosting: Automatic phrase boosting
- Configuring automatic phrase boosting
- Phrase slop configuration
- Boosting: Boost queries
- Boosting: Boost functions
- Min-should-match
- Basic rules
- Multiple rules
- What to choose
- Faceting
- A quick example: Faceting release types
- MusicBrainz schema changes
- Field requirements
- Types of faceting
- Faceting text
- Alphabetic range bucketing (A-C, D-F, and so on)
- Faceting dates
- Faceting on arbitrary queries
- Excluding filters
- The solution: Local Params
- Facet prefixing (term suggest)
- Chapter 6: Search Components
- About components
- The highlighting component
- A highlighting example
- Highlighting configuration
- Spell checking
- Schema configuration
- Configuration in solrconfig.xml
- Configuring spellcheckers (dictionaries)
- Processing of the q parameter
- Processing of the spellcheck.q parameter
- Building the dictionary from its source
- Issuing spellcheck requests
- Example usage for a misspelled query
- The more-like-this search component
- Configuration parameters
- Parameters specific to the MLT search component
- Parameters specific to the MLT request handler
- Common MLT parameters
- Stats component
- Configuring the stats component
- Statistics on track durations
- Field collapsing
- Configuring field collapsing
- Other components
- Terms component
- termVector component
- LocalSolr component
- Chapter 7: Deployment
- Implementation methodology
- Installing into a Servlet container
- Differences between Servlet containers
- Defining solr.home property
- Logging
- HTTP server request access logs
- Solr application logging
- Configuring logging output
- Logging to Log4j
- Jetty startup integration
- Managing log levels at runtime
- A SearchHandler per search interface
- Solr cores
- Configuring solr.xml
- Managing cores
- Why use multicore
- JMX
- Starting Solr with JMX
- Take a walk on the wild side! Use JRuby to extract JMX information
- Securing Solr
- Securing index data
- Controlling document access
- Other things to look at
- Chapter 8: Integrating Solr
- Structure of included examples
- SolrJ: Simple Java interface
- Using Heritrix to download artist pages
- Indexing HTML in Solr
- SolrJ client API
- When should I use Embedded Solr
- In-Process streaming
- Rich clients
- Upgrading from legacy Lucene
- Using JavaScript to integrate Solr
- Wait, what about security?
- Building a Solr powered artists autocomplete widget with jQuery and JSONP
- SolrJS: JavaScript interface to Solr
- Accessing Solr from PHP applications
- solr-php-client
- Drupal options
- Apache Solr Search integration module
- Hosted Solr by Acquia
- Ruby on Rails integrations
- acts_as_solr
- Setting up MyFaves project
- Populating MyFaves relational database from Solr
- Build Solr indexes from relational database
- Complete MyFaves web site
- Blacklight OPAC
- Indexing MusicBrainz data
- Customizing display
- solr-ruby versus rsolr
- Chapter 9: Scaling Solr
- Tuning complex systems
- Using Amazon EC2 to practice tuning
- Firing up Solr on Amazon EC2
- Optimizing a single Solr server (Scale High)
- JVM configuration
- HTTP caching
- Solr caching
- Schema design considerations
- Indexing strategies
- Disable unique document checking
- Commit/optimize factors
- Enhancing faceting performance
- Using term vectors
- Improving phrase search performance
- Moving to multiple Solr servers (Scale Wide)
- Script versus Java replication
- Starting multiple Solr servers
- Distributing searches across slaves
- Indexing into the master server
- Configuring slaves
- Distributing search queries across slaves
- Sharding indexes
- Assigning documents to shards
- Searching across shards
- Combining replication and sharding (Scale Deep)
- Summary