
Mastering ElasticSearch

Rafał Kuć, Marek Rogoziński

Written for intermediate users, this tutorial helps you use the power of Apache Lucene and ElasticSearch to optimize your information retrieval. From design to implementation to management, it’s the all-inclusive guide.
eBook: $32.99
Print + eBook: $54.99


Book Details

ISBN 13: 9781783281435
Paperback: 386 pages

About This Book

  • Learn about Apache Lucene and ElasticSearch design and architecture to fully understand how this great search engine works
  • Design, configure, and distribute your index, coupled with a deep understanding of the workings behind it
  • Learn about the advanced features in an easy-to-read book with detailed examples that will help you understand and use the sophisticated features of ElasticSearch

Who This Book Is For

Mastering ElasticSearch is aimed at intermediate users who want to extend their knowledge of ElasticSearch. The topics described in the book are detailed, but we assume that you already know the basics, such as the query DSL or data indexing. Advanced users will also find this book useful, as the examples go deep into the internals where needed.

Table of Contents

Chapter 1: Introduction to ElasticSearch
Introducing Apache Lucene
Introducing ElasticSearch
Summary
Chapter 2: Power User Query DSL
Default Apache Lucene scoring explained
Query rewrite explained
Rescore
Bulk Operations
Sorting data
Update API
Using filters to optimize your queries
Filter and scopes in ElasticSearch faceting mechanism
Summary
Chapter 3: Low-level Index Control
Altering Apache Lucene scoring
Similarity model configuration
Using codecs
NRT, flush, refresh, and transaction log
Looking deeper into data handling
Segment merging under control
Summary
Chapter 4: Index Distribution Architecture
Choosing the right amount of shards and replicas
Routing explained
Altering the default shard allocation behavior
Adjusting shard allocation
Query execution preference
Using our knowledge
Summary
Chapter 5: ElasticSearch Administration
Choosing the right directory implementation – the store module
Discovery configuration
Segments statistics
Understanding ElasticSearch caching
Summary
Chapter 6: Fighting with Fire
Knowing the garbage collector
When it is too much for I/O – throttling explained
Speeding up queries using warmers
Very hot threads
Real-life scenarios
Summary
Chapter 7: Improving the User Search Experience
Correcting user spelling mistakes
Improving query relevance
Summary
Chapter 8: ElasticSearch Java APIs
Introducing the ElasticSearch Java API
The code
Connecting to your cluster
Anatomy of the API
CRUD operations
Querying ElasticSearch
Performing multiple actions
Percolator
The explain API
Building JSON queries and documents
The administration API
Summary
Chapter 9: Developing ElasticSearch Plugins
Creating the Apache Maven project structure
Creating a custom river plugin
Creating custom analysis plugin
Summary

What You Will Learn

  • Understand how Apache Lucene works
  • Use and configure different scoring models to alter the default scoring mechanism
  • Exploit query rescore to recalculate the score of top N documents
  • Choose the right amount of shards and replicas for your deployment
  • Use shard allocation wisely and understand its internals
  • Alter the index format by using different postings formats
  • Use your knowledge to create scalable, efficient, and fault-tolerant clusters
  • Monitor your cluster by using and understanding the ElasticSearch API
  • Learn to control segment merging and why ElasticSearch uses merging at all
  • Overcome problems with garbage collection, threading, and I/O
  • Improve the user search experience by using ElasticSearch functionality
  • Develop an application using the ElasticSearch Java API and develop custom ElasticSearch plugins

In Detail

ElasticSearch is a fast, distributed, and scalable search engine written in Java. It leverages Apache Lucene's capabilities to provide a new level of control over how you index and search even the largest sets of data.

"Mastering ElasticSearch" covers the intermediate and advanced functionality of ElasticSearch. It not only explains how ElasticSearch works, but also guides you through its internals, such as caches, the Apache Lucene library, monitoring capabilities, and the Java API. In addition, you'll see the practical usage of ElasticSearch configuration parameters and the monitoring API, along with easy-to-use examples showing how to extend ElasticSearch by writing your own plugins.

"Mastering ElasticSearch" starts by showing you how Apache Lucene works and what the ElasticSearch architecture looks like. It covers advanced querying capabilities, index configuration control, index distribution, ElasticSearch administration, and troubleshooting. Finally, you'll see how to improve the user’s search experience, use the provided Java API, and develop your own custom plugins.

It will help you learn how Apache Lucene works in terms of both querying and indexing. You'll also learn how to use different scoring models, rescore documents using other queries, alter how the index is written by using custom postings formats, and understand what segment merging is and how to configure it to your needs. You'll optimize your queries by modifying them to use filters, and you'll see why that is important. The book describes in detail how to use the shard allocation mechanisms present in ElasticSearch, such as forced awareness.

"Mastering ElasticSearch" will open your eyes to the practical use of the statistics and information APIs available at the index, node, and cluster levels, so you are not surprised by what your ElasticSearch cluster does while you are not looking. You'll also see how to troubleshoot by understanding how the Java garbage collector works, how to control I/O throttling, and how to see which threads are being executed at any given moment. If user spelling mistakes are making you lose sleep at night, don't worry anymore: the book will show you how to configure and use the ElasticSearch spell checker and improve the relevance of your queries. Last but not least, you'll see how to use the ElasticSearch Java API to work with an ElasticSearch cluster from your JVM-based application, and you'll extend ElasticSearch by writing your own custom plugins.

If you are looking for a book that will allow you to easily extend your basic knowledge of ElasticSearch, or you want to go deeper into the world of full-text search using ElasticSearch, then this book is for you.

 
