ElasticSearch Server


There is a newer version of this book available - Elasticsearch Server: Second Edition
  • Learn the basics of ElasticSearch, such as data indexing, analysis, and dynamic mapping
  • Query and filter ElasticSearch for more accurate and precise search results
  • Learn how to monitor and manage ElasticSearch clusters and troubleshoot any problems that arise

Book Details

Language : English
Paperback : 318 pages [ 235mm x 191mm ]
Release Date : February 2013
ISBN : 1849518440
ISBN 13 : 9781849518444
Author(s) : Rafał Kuć, Marek Rogoziński
Topics and Technologies : All Books, Big Data and Business Intelligence, Open Source

Table of Contents

Preface
Chapter 1: Getting Started with ElasticSearch Cluster
Chapter 2: Searching Your Data
Chapter 3: Extending Your Structure and Search
Chapter 4: Make Your Search Better
Chapter 5: Combining Indexing, Analysis, and Search
Chapter 6: Beyond Searching
Chapter 7: Administrating Your Cluster
Chapter 8: Dealing with Problems
Index
  • Chapter 1: Getting Started with ElasticSearch Cluster
    • What is ElasticSearch?
      • Index
      • Document
      • Document type
      • Node and cluster
      • Shard
      • Replica
    • Installing and configuring your cluster
    • Directory structure
    • Configuring ElasticSearch
    • Running ElasticSearch
    • Shutting down ElasticSearch
    • Running ElasticSearch as a system service
    • Data manipulation with REST API
      • What is REST?
      • Storing data in ElasticSearch
      • Creating a new document
      • Retrieving documents
      • Updating documents
      • Deleting documents
    • Manual index creation and mappings configuration
      • Index
      • Types
      • Index manipulation
      • Schema mapping
        • Type definition
        • Fields
        • Core types
        • Multi fields
        • Using analyzers
        • Storing a document source
        • All field
    • Dynamic mappings and templates
      • Type determining mechanism
      • Dynamic mappings
      • Templates
        • Storing templates in files
    • When routing does matter
      • How does indexing work?
      • How does searching work?
      • Routing
      • Routing parameters
      • Routing fields
    • Index aliasing and simplifying your everyday work using it
      • An alias
      • Creating an alias
      • Modifying aliases
      • Combining commands
      • Retrieving all aliases
      • Filtering aliases
      • Aliases and routing
    • Summary
  • Chapter 2: Searching Your Data
    • Understanding the querying and indexing process
    • Mappings
      • Data
    • Querying ElasticSearch
      • Simple query
      • Paging and results size
      • Returning the version
      • Limiting the score
      • Choosing the fields we want to return
        • Partial fields
      • Using script fields
        • Passing parameters to script fields
      • Choosing the right search type (advanced)
      • Search execution preference (advanced)
    • Basic queries
      • The term query
      • The terms query
      • The match query
        • The Boolean match query
        • The phrase match query
        • The match phrase prefix query
      • The multi match query
      • The query string query
        • Lucene query syntax
        • Explaining the query string
        • Running query string query against multiple fields
      • The field query
      • The identifiers query
      • The prefix query
      • The fuzzy like this query
      • The fuzzy like this field query
      • The fuzzy query
      • The match all query
      • The wildcard query
      • The more like this query
      • The more like this field query
      • The range query
      • Query rewrite
    • Filtering your results
      • Using filters
      • Range filters
      • Exists
      • Missing
      • Script
      • Type
      • Limit
      • IDs
      • If this is not enough
      • bool, and, or, not filters
      • Named filters
      • Caching filters
    • Compound queries
      • The bool query
      • The boosting query
      • The constant score query
      • The indices query
      • The custom filters score query
      • The custom boost factor query
      • The custom score query
    • Sorting data
      • Default sorting
      • Selecting fields used for sorting
      • Specifying behavior for missing fields
      • Dynamic criteria
      • Collation and national characters
    • Using scripts
      • Available objects
      • MVEL
      • Other languages
      • Script library
      • Native code
    • Summary
  • Chapter 3: Extending Your Structure and Search
    • Indexing data that is not flat
      • Data
      • Objects
      • Arrays
      • Mappings
        • Final mappings
      • To be or not to be dynamic
      • Sending the mappings to ElasticSearch
    • Extending your index structure with additional internal information
      • The identifier field
      • The _type field
      • The _all field
      • The _source field
      • The _boost field
      • The _index field
      • The _size field
      • The _timestamp field
      • The _ttl field
    • Highlighting
      • Getting started with highlighting
      • Field configuration
      • Under the hood
      • Configuring HTML tags
      • Controlling highlighted fragments
      • Global and local settings
      • Require matching
    • Autocomplete
      • The prefix query
      • Edge ngrams
      • Faceting
    • Handling files
      • Additional information about a file
    • Geo
      • Mapping preparation for spatial search
      • Example data
      • Sample queries
      • Bounding box filtering
      • Limiting the distance
    • Summary
  • Chapter 4: Make Your Search Better
    • Why this document was found
      • Understanding how a field is analyzed
      • Explaining the query
    • Influencing scores with query boosts
      • What is boost?
      • Adding boost to queries
      • Modifying the score
        • Constant score query
        • Custom boost factor query
        • Boosting query
        • Custom score query
        • Custom filters score query
    • When does index-time boosting make sense
      • Defining field boosting in input data
      • Defining document boosting in input data
      • Defining boosting in mapping
    • The words having the same meaning
      • Synonym filter
        • Synonyms in mappings
        • Synonyms in files
      • Defining synonym rules
        • Using Apache Solr synonyms
        • Using WordNet synonyms
      • Query- or index-time synonym expansion
    • Searching content in different languages
      • Why we need to handle languages differently
      • How to handle multiple languages
      • Detecting a document's language
      • Sample document
      • Mappings
      • Querying
        • Queries with a known language
        • Queries with an unknown language
        • Combining queries
    • Using span queries
      • What is a span?
      • Span term query
      • Span first query
      • Span near query
      • Span or query
      • Span not query
      • Performance considerations
    • Summary
  • Chapter 5: Combining Indexing, Analysis, and Search
    • Indexing tree-like structures
    • Modifying your index structure with the update API
      • The mapping
      • Adding a new field
      • Modifying fields
    • Using nested objects
    • Using parent-child relationships
      • Mappings and indexing
        • Creating parent mappings
        • Creating child mappings
        • Parent document
        • Child documents
      • Querying
        • Querying for data in the child documents
        • The top children query
        • Querying for data in the parent documents
      • Parent-child relationship and filtering
      • Performance considerations
    • Fetching data from other systems: river
      • What we need and what a river is
      • Installing and configuring a river
    • Batch indexing to speed up your indexing process
      • How to prepare data
      • Indexing the data
      • Is it possible to do it quicker?
    • Summary
  • Chapter 6: Beyond Searching
    • Faceting
      • Document structure
      • Returned results
      • Query
      • Filter
      • Terms
      • Range
        • Choosing different fields for aggregated data calculation
      • Numerical and date histogram
        • Date histogram
      • Statistical
      • Terms statistics
      • Spatial
      • Filtering faceting results
      • Scope of your faceting calculation
        • Facet calculation on all nested documents
        • Facet calculation on nested documents that match a query
      • Faceting memory considerations
    • More like this
      • Example data
      • Finding similar documents
    • Percolator
      • Preparing the percolator
      • Getting deeper
    • Summary
  • Chapter 7: Administrating Your Cluster
    • Monitoring your cluster state and health
      • The cluster health API
      • The indices stats API
        • Docs
        • Store
        • Indexing, get, and search
      • The status API
      • The nodes info API
      • The nodes stats API
      • The cluster state API
      • The indices segments API
    • Controlling shard and replica allocation
      • Explicitly controlling allocation
        • Specifying nodes' parameters
        • Configuration
        • Index creation
        • Excluding nodes from allocation
        • Using IP addresses for shard allocation
      • Cluster-wide allocation
      • Number of shards and replicas per node
      • Manually moving shards and replicas
        • Moving shards
        • Canceling allocation
        • Allocating shards
        • Multiple commands per HTTP request
    • Tools for instance and cluster state diagnosis
      • Bigdesk
      • elasticsearch-head
      • elasticsearch-paramedic
      • SPM for ElasticSearch
    • Your ElasticSearch time machine
      • The gateway module
        • Local gateway
        • Shared filesystem gateway
        • Hadoop distributed filesystem gateway
        • Amazon s3 gateway
      • Recovery control
    • Node discovery
      • Discovery types
      • Master node
        • Configuring master and data nodes
        • Master election configuration
      • Setting the cluster name
      • Configuring multicast
      • Configuring unicast
      • Nodes ping settings
    • ElasticSearch plugins
      • Installing plugins
      • Removing plugins
      • Plugin types
    • Summary
  • Chapter 8: Dealing with Problems
    • Why is the result on later pages slow
      • What is the problem?
      • Scrolling to the rescue
    • Controlling cluster rebalancing
      • What is rebalancing?
      • When is the cluster ready?
      • The cluster rebalancing settings
        • Controlling when rebalancing will start
        • Controlling the number of shards being moved between nodes concurrently
        • Controlling the number of shards initialized concurrently on a single node
        • Controlling the number of primary shards initialized concurrently on a single node
        • Disabling the allocation of shards and replicas
        • Disabling the allocation of replicas
    • Validating your queries
      • How to use the Validate API
    • Warming up
      • Defining a new warming query
      • Retrieving defined warming queries
      • Deleting a warming query
      • Disabling the warming up functionality
      • Which queries to choose
    • Summary

                  Rafał Kuć

                  Rafał Kuć is a born team leader and software developer. He currently works as a consultant and a software engineer at Sematext Group, Inc., where he concentrates on open source technologies such as Apache Lucene and Solr, Elasticsearch, and the Hadoop stack. He has more than 12 years of experience in various branches of software, from banking software to e-commerce products. He focuses mainly on Java but is open to every tool and programming language that will make the achievement of his goal easier and faster. Rafał is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people with the problems they face with Solr and Lucene. He has also been a speaker at various conferences around the world, such as Lucene Eurocon, Berlin Buzzwords, ApacheCon, and Lucene Revolution.

                  Rafał began his journey with Lucene in 2002, and it wasn't love at first sight. When he came back to Lucene in late 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then, Solr came along and this was it. He started working with Elasticsearch in the middle of 2010. Currently, Lucene, Solr, Elasticsearch, and information retrieval are his main points of interest.

                  Rafał is also the author of Apache Solr 3.1 Cookbook, and the update to it, Apache Solr 4 Cookbook. Also, he is the author of the previous edition of this book and Mastering ElasticSearch. All these books have been published by Packt Publishing.


                  Marek Rogoziński

                  Marek Rogoziński is a software architect and consultant with more than 10 years of experience. He specializes in solutions based on open source projects such as Solr and ElasticSearch. He is also the co-founder of the solr.pl site, which publishes information and tutorials about Solr and the Lucene library. He currently holds the position of Chief Technology Officer at Smartupz, the vendor of the Discourse™ social collaboration software.

                  Errata

                  - 3 submitted: last submission 25 Apr 2014

                  Errata Type: Typo Errata Page: 29

                  Description : snowball: Ths is an analyzer similar...
                  Correction : snowball: This is an analyzer similar...

                  Errata Type: Technical Errata Page: 24

                  Description : So let's concentrate on a single field now, for example, the name field, whose definition is as follows:
                  Correction : So let's concentrate on a single field now, for example, the contents field, whose definition is as follows:

                  Errata Type: Technical Errata Page: 20

                  Description : In the preceding example, if the document we are updating doesn't have a value in the counter field, the value of 0 will be used.
                  Correction : If the document does not already exist, the content of the upsert element will be used to index the fresh document (see http://www.elasticsearch.org/guide/reference/api/update/).
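
The corrected behaviour in the last erratum can be illustrated with a small sketch. It is not taken from the book: the helper names `build_update_body` and `apply_update`, the `counter` field, and the script text are assumptions chosen for illustration, and `apply_update` is only a toy in-memory model of the upsert rule, not a call to ElasticSearch. The request body follows the script/params/upsert layout described in the update API reference linked above.

```python
import json


def build_update_body(script, params, upsert_doc):
    """Return the JSON body for an update request that carries an upsert element."""
    return json.dumps(
        {"script": script, "params": params, "upsert": upsert_doc},
        sort_keys=True,
    )


def apply_update(store, doc_id, count, upsert_doc):
    """Toy in-memory model of the corrected behaviour; this is not ElasticSearch."""
    if doc_id not in store:
        # Document does not exist yet: index the upsert content unchanged.
        store[doc_id] = dict(upsert_doc)
    else:
        # Document exists: equivalent of the script "ctx._source.counter += count".
        # Assumes the stored document already has a counter field.
        store[doc_id]["counter"] += count
    # Return a copy so callers can compare snapshots between calls.
    return dict(store[doc_id])


store = {}
body = build_update_body("ctx._source.counter += count", {"count": 1}, {"counter": 1})
first = apply_update(store, "1", 1, {"counter": 1})   # document missing: upsert is indexed
second = apply_update(store, "1", 1, {"counter": 1})  # document present: counter is incremented
print(body)
print(first, second)
```

In this model the upsert content is used only when the document is absent; once the document exists, the script path runs instead, which is the distinction the correction draws.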

                  What you will learn from this book

                  • Configure and create an ElasticSearch index
                  • Use the ElasticSearch query DSL to build all kinds of queries
                  • Use filters efficiently and precisely without loss of performance
                  • Implement autocomplete functionality
                  • Use highlighting and geographical search for better results
                  • Understand how ElasticSearch returns results and how to validate those results
                  • Use faceting and “more like this” functionality to get more from your search and improve your client’s search experience
                  • Monitor your cluster state and health using the ElasticSearch API as well as third-party monitoring solutions

                  In Detail

                  ElasticSearch is an open source search server built on Apache Lucene. It was built to provide a scalable search solution with built-in support for near real-time search and multi-tenancy.

                  This book shows you how to create a fast, scalable, and flexible search solution, starting with setting up your own custom ElasticSearch cluster. By learning the ins and outs of data indexing and analysis, you will begin your journey to mastering the powerful capabilities of ElasticSearch. With practical chapters covering how to search data, extend your search, and go deep into cluster administration and search analysis, this book suits both newcomers to search servers and those with experience.

                  In "ElasticSearch Server" you will learn how to revolutionize your website or application with faster, more accurate, and more flexible search functionality. Starting with chapters on setting up your own ElasticSearch cluster, searching, and extending your search parameters, you will quickly be able to create a fast, scalable, and completely custom search solution.

                  Building further on your knowledge, you will learn about ElasticSearch’s query API and become confident using its powerful filtering and faceting capabilities. You will develop practical knowledge of how to make use of ElasticSearch’s near real-time capabilities and support for multi-tenancy.

                  Your journey then concludes with chapters that help you monitor and tune your ElasticSearch cluster and that cover advanced topics such as shard allocation, gateway configuration, and the discovery module.

                  Approach

                  This book is written in a friendly, practical style with numerous hands-on examples and tutorials throughout.

                  Who this book is for

                  This book is written for developers who wish to leverage ElasticSearch to create a fast and flexible search solution. If you are looking to learn ElasticSearch or become more proficient with it, then this book is for you. You do not need to know anything about ElasticSearch, Java, or Apache Lucene in order to use this book, though basic knowledge of databases and queries is required.
