Lucene 4 Cookbook

Over 70 hands-on recipes to quickly and effectively integrate Lucene into your search application

Edwood Ng, Vineeth Mohan


Book Details

ISBN 13: 9781782162285
Paperback: 220 pages

Book Description

Lucene 4 Cookbook is a practical guide that shows you how to build a scalable search engine for your application, from an internal documentation search to a wide-scale web implementation with millions of records. It starts by helping you install Apache Lucene and then guides you through creating your first search application. The book then walks you through analyzing your text and indexing your data to improve the performance of your search application. As you progress through the chapters, you will learn to search your indexes effectively and to employ near real-time searching.

The chapters start off with simple concepts and build up to complex solutions that should help you on your way to becoming a search engine expert.

Table of Contents

Chapter 1: Introducing Lucene
Introduction
Installing Lucene
Setting up a simple Java Lucene project
Obtaining an IndexWriter
Creating an analyzer
Creating fields
Creating and writing documents to an index
Deleting documents
Obtaining an IndexSearcher
Creating queries with the Lucene QueryParser
Performing a search
Enumerating results
Chapter 2: Analyzing Your Text
Introduction
Obtaining a common analyzer
Obtaining a TokenStream
Obtaining TokenAttribute values
Using PositionIncrementAttribute
Using PerFieldAnalyzerWrapper
Defining custom TokenFilters
Defining custom analyzers
Defining custom tokenizers
Defining custom attributes
Chapter 3: Indexing Your Data
Introduction
Obtaining an IndexWriter
Creating a StringField
Creating a TextField
Creating a numeric field
Creating a DocValue Field
Transactional commits and index versioning
Reusing field and document objects per thread
Delving into field norms
Changing similarity implementation used during indexing
Chapter 4: Searching Your Indexes
Introduction
Obtaining IndexReaders
Un-inverting single-valued fields in memory with FieldCache
TermVectors
IndexSearcher
Constructing queries
Specifying sort logic
Forming a search result
Pagination
Using Collectors
Sorting with custom FieldComparator
Chapter 5: Near Real-time Searching
Introduction
Using the DirectoryReader to open index in Near Real-Time
Using the SearcherManager to refresh IndexSearcher
Generational indexing with TrackingIndexWriter
Maintaining search sessions with SearcherLifetimeManager
Performance tuning: latency and throughput
Chapter 6: Querying and Filtering Data
Introduction
Performing advanced filtering
Creating a custom filter
Searching with QueryParser
TermQuery and TermRangeQuery
BooleanQuery
PrefixQuery and WildcardQuery
PhraseQuery and MultiPhraseQuery
FuzzyQuery
NumericRangeQuery
DisjunctionMaxQuery
RegexpQuery
SpanQuery
CustomScoreQuery
Chapter 7: Flexible Scoring
Introduction
Overriding similarity
Implementing the BM25 model
Implementing the language model
Implementing the divergence from randomness model
Implementing the information-based model
Chapter 8: Introducing Elasticsearch
Introduction
Getting Elasticsearch
Creating a new index
Predefine field mappings
Adding a document
Deleting a document
Updating a document
Performing bulk indexing
Searching the index
Scaling Elasticsearch
Chapter 9: Extending Lucene with Modules
Introduction
Exploring spatial search
Implementing joins
Performing faceting
Implementing grouping
Employing autosuggest
Implementing highlighting

What You Will Learn

  • Explore the best practices to make the most of your search application
  • Create and write documents in an index
  • Customize scoring and boosting in your application to influence search results
  • Expand Lucene's functionality with add-on modules for features such as spatial searching and faceting
  • Load and initialize the library and build a search index of data
  • Understand the trade-off between near real-time (NRT) latency and throughput
  • Optimize your search applications by employing features such as near real-time (NRT) search
