
You're reading from Mastering Elastic Stack

Product type: Book

Published in: Feb 2017

Publisher: Packt

ISBN-13: 9781786460011

Edition: 1st Edition

Tools: Elasticsearch

Concepts: Enterprise Search

Authors (2):

Ravi Kumar Gupta

Yuvraj Gupta


Chapter 2. Stepping into Elasticsearch

In the previous chapter, we learned the basics of Elasticsearch, Logstash, Kibana, and Beats, and how to install and configure them to set up the pipeline. We saw the role Elasticsearch plays and how it works with the other components of the stack. This was just the tip of the iceberg. To get a better idea of how Elasticsearch works, we need to learn about the APIs, modules, and plugins it offers. These topics are divided across two chapters.

We're going to take a deep dive into Elasticsearch in this chapter. These are the topics that we are going to cover:

The beginning of Elasticsearch
Understanding the architecture
Elasticsearch APIs
Aggregations
A note for painless scripting

By the end of this chapter, you should have a good idea of how to use aggregations and of the power of the APIs. More of Elasticsearch is covered in Chapter 8, Elasticsearch APIs.

The beginning of Elasticsearch

It all started with Lucene, a brilliant project supported by the Apache Software Foundation. There is a long list of Lucene-based projects: Apache Solr, Elasticsearch, Apache Nutch, Lucene.Net, DocFetcher, and many more. If you ever look for a search engine solution, you will surely come across Lucene. It is available not only for Java, but also for Delphi, Perl, C#, C++, Python, Ruby, and PHP. A complete list of Lucene implementations is available at http://wiki.apache.org/lucene-java/LuceneImplementations.

Lucene is a full-text search engine that creates indices on documents. In a paragraph or blob of text, every string is called a term. A sequence of terms is called a field, and a sequence of fields is called a document. An index contains a sequence of documents; in other words, Lucene indexes data as documents.

In books, we usually see an index that lists all the keywords and helps us find the actual content. This type of index is...
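The book-index analogy can be sketched in a few lines of Python. This is only an illustration of the idea of mapping terms back to the documents that contain them; the sample documents, the `build_index` helper, and the naive lowercase-and-split analysis are all assumptions made for this sketch, and Lucene's real analysis and storage are far more involved.

```python
from collections import defaultdict

# Hypothetical documents: each one is a set of named fields holding text.
documents = {
    1: {"title": "Mastering Elastic Stack", "author": "Ravi Kumar Gupta"},
    2: {"title": "Kibana Essentials", "author": "Yuvraj Gupta"},
}

def build_index(docs):
    """Map (field, term) -> sorted list of ids of documents containing the term."""
    postings = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            # Naive analysis (an assumption): lowercase and split on whitespace.
            for term in text.lower().split():
                postings[(field, term)].add(doc_id)
    return {key: sorted(ids) for key, ids in postings.items()}

index = build_index(documents)
print(index[("author", "gupta")])   # [1, 2]
print(index[("title", "kibana")])   # [2]
```

Looking up a term is then a single dictionary access, which is why a search does not have to scan every document.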

Understanding the architecture

To understand how Elasticsearch works, we need to learn about its architecture.

To understand how indices, types, documents, and fields work together, let's refer to the following figure:

As seen in the preceding figure, an index contains one or more types. A type can be thought of as a table in a relational database. A type has one or more documents, and a document has one or more fields. Fields are key-value pairs.
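As a rough sketch of how this hierarchy shows up in practice, the index, type, and document ID form the path of a document's REST URL. The `document_url` helper, the document ID, and the field values below are hypothetical; the `library` index and `book` type are the ones used in this chapter's search examples.

```python
def document_url(index, doc_type, doc_id, host="http://localhost:9200"):
    """Build the REST path of a document: /{index}/{type}/{id}."""
    return f"{host}/{index}/{doc_type}/{doc_id}"

# A document is a set of fields, and each field is a key-value pair.
# These field names and values are assumptions for illustration.
book = {"title": "Mastering Elastic Stack", "author": "Ravi Kumar Gupta"}

url = document_url("library", "book", 1)
print(url)  # http://localhost:9200/library/book/1
```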

A cluster has one or more nodes. Clusters are identified by their names; by default, the cluster name is elasticsearch. If you have to set up multiple Elasticsearch clusters in the same network, give them different names, or else all the nodes will join the same cluster. Like a cluster, a node also has a name. We can assign a node its own name and the name of the cluster to join. If we don't provide a cluster name, the node will automatically search for and join the cluster named elasticsearch.

If we...

Elasticsearch APIs

There are many APIs available for managing Elasticsearch. These APIs help us to manage clusters, indices, search, and so on. In this section, we will look at each of these APIs in detail.

We can use these APIs from the command line, from Console in Kibana, or from any tool that can make calls to RESTful APIs.

Note

By default, Elasticsearch runs on port 9200 to listen to HTTP requests. Kibana uses the same port to connect to Elasticsearch. To learn more about Console, refer to Chapter 4, Kibana Interface, Exploring Dev tools section.

Sense is a powerful plugin for Kibana that allows us to make calls to Elasticsearch APIs using a web interface. We will be learning about Sense in Chapter 8, Elasticsearch APIs. For this chapter, we will be using cURL, a command-line utility that allows us to send HTTP requests to the APIs.

A typical cURL request against ES contains a verb, URL, and message body:

$ curl -X{Verb} 'url' -d '{message-body}'

Verbs are GET, PUT, POST, DELETE, and HEAD...
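The same three parts (verb, URL, and message body) appear in any HTTP client. The following is a minimal sketch using Python's standard `urllib.request` module; the `build_request` helper, the document URL, and the body are assumptions made for illustration, and the request is only assembled, not sent.

```python
import json
import urllib.request

def build_request(verb, url, body=None):
    """Assemble (but do not send) an HTTP request with a verb, URL, and optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=verb)

request = build_request(
    "PUT",
    "http://localhost:9200/library/book/1",   # hypothetical document URL
    {"title": "Mastering Elastic Stack"},     # hypothetical message body
)
print(request.get_method(), request.full_url)
# To send it against a running node: urllib.request.urlopen(request)
```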

Query DSL

With this approach, we provide a request body along with the URI, just as we have been doing for the Document APIs. We can rewrite our author search query as follows:

$ curl -XGET 'http://localhost:9200/library/book/_search?pretty' -d '{
  "query" : {
    "term" : {"author" : "gupta"}
  }
}'

This query will return the same result. Whatever conditions we defined using the q= parameter, we now define inside the term clause. To learn more about Query DSL, refer to https://www.elastic.co/guide/en/elasticsearch/reference/5.1/query-dsl.html.
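To make the correspondence concrete, here is a small sketch that builds both forms of the author search side by side. The helper names are hypothetical, and the exact `q=author:gupta` form of the earlier URI search is an assumption, since that part of the chapter is not shown here.

```python
import json
from urllib.parse import urlencode

SEARCH_URL = "http://localhost:9200/library/book/_search"

def uri_search(field, value):
    """URI search: the condition travels in the q= query-string parameter."""
    return SEARCH_URL + "?" + urlencode({"q": f"{field}:{value}"})

def term_query(field, value):
    """Query DSL: the same condition expressed as a term clause in the request body."""
    return {"query": {"term": {field: value}}}

print(uri_search("author", "gupta"))
print(json.dumps(term_query("author", "gupta")))
```

The second form grows more gracefully: further clauses become additional keys in the JSON body instead of an ever longer query string.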

Aggregations

The aggregations framework is a very important part of Elasticsearch. As the name suggests, it helps us aggregate data and generate analytic information from the results of a search query. Aggregations give us better insight into the data. For example, with our library index we can answer questions such as: how many books were published in a specific year, which technologies the books cover, the average number of books per year, and many more.

Aggregations show their power when it comes to gaining insight into system data on a dashboard. System dashboards most often present aggregated data in the form of charts. We will also be using aggregations in later chapters, where they will help Kibana generate useful visualizations.

There are two types of core aggregations: metrics and buckets. We will learn about these in this section.
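The difference between the two can be sketched in plain Python, without Elasticsearch: a bucket aggregation groups documents, while a metric aggregation computes a number over a set of documents. The sample books and their `year` and `pages` fields are assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical documents from the library index; field names are assumptions.
books = [
    {"title": "A", "year": 2016, "pages": 300},
    {"title": "B", "year": 2017, "pages": 400},
    {"title": "C", "year": 2017, "pages": 500},
]

def bucket_by(docs, field):
    """Bucket aggregation: group documents that share a value of `field`."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[field]].append(doc)
    return dict(buckets)

def average(docs, field):
    """Metric aggregation: compute a single number over a set of documents."""
    return sum(doc[field] for doc in docs) / len(docs)

per_year = bucket_by(books, "year")
doc_counts = {year: len(docs) for year, docs in per_year.items()}
avg_pages = {year: average(docs, "pages") for year, docs in per_year.items()}
print(doc_counts)  # {2016: 1, 2017: 2}
print(avg_pages)   # {2016: 300.0, 2017: 450.0}
```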

Bucket

These aggregations create buckets of documents based on a criterion. These types of aggregations can also hold sub-aggregations. We will learn about sub-aggregations in this...
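As an illustration of a bucket holding a sub-aggregation, the sketch below builds a search request body with a `terms` bucket and an `avg` metric nested inside it. The `terms_with_avg` helper, the aggregation names, and the `year` and `pages` fields are hypothetical; only the general `aggs` / `terms` / `avg` layout comes from Elasticsearch itself.

```python
import json

def terms_with_avg(bucket_field, metric_field):
    """Build a search body: a terms bucket with an avg metric as sub-aggregation."""
    return {
        "size": 0,  # return only aggregation results, no hits
        "aggs": {
            "by_" + bucket_field: {
                "terms": {"field": bucket_field},
                "aggs": {
                    "avg_" + metric_field: {"avg": {"field": metric_field}}
                },
            }
        },
    }

body = terms_with_avg("year", "pages")  # hypothetical field names
print(json.dumps(body, indent=2))
```

The printed JSON could be passed as the message body of a `_search` request, in the same way as the Query DSL example earlier in the chapter.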

A note for painless scripting

There are times when we need scripts, for example to update data or to define scripted fields, among many other use cases. Prior to version 5.x, Groovy was the default language for scripts. Back then, we did not even have to specify which scripting language we wanted to use. Since these scripts were executed remotely, security was always a concern that the Elastic team had to address. This became the reason for designing Painless.

Painless is both secure and efficient when it comes to performance. Its syntax is similar to Groovy's, so it is also easy to learn and use. In most cases, you don't need to make changes to your previously written scripts. All you need to add is a parameter called lang with the value painless.
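A minimal sketch of where the lang parameter goes, assuming an Update API request body in the 5.x style: the script is an object with `lang`, the script text, and optional `params`. The `painless_update` helper, the script text, and the `copies` field are hypothetical. The `inline` key is the one used by early 5.x releases; later releases use `source` instead, so check the reference for your version.

```python
import json

def painless_update(source, params=None):
    """Build an Update API body whose script is explicitly marked as Painless."""
    # "inline" is the 5.x key for the script text (assumption: later versions use "source").
    script = {"lang": "painless", "inline": source}
    if params:
        script["params"] = params
    return {"script": script}

# Hypothetical script: increment a made-up `copies` field on a book document.
body = painless_update("ctx._source.copies += params.count", {"count": 1})
print(json.dumps(body))
```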

To define a variable in Painless, simply use the following:

def myVar = 'my-value';

We don't need to specify any type. At runtime, the appropriate type for the variable is detected automatically. Painless supports all variable types defined by Java.

To define an array, use...

Summary

In this chapter, we learned about the Elasticsearch architecture and the way Elasticsearch was born. We then got familiar with the Elasticsearch APIs: Search, Indices, and Document. With the help of these APIs, we learned how to add documents to Elasticsearch, how to query those documents, and how to manage indices. Toward the end of the chapter, we saw how aggregations help us get insight from search results. We will be practicing these concepts in the next chapters with more examples.

In the next chapter, we will learn about Logstash, how to configure it for complex data types, and its plugins.


Authors (2)

Ravi Kumar Gupta

Ravi Kumar Gupta is an author, reviewer, and open source software evangelist. He pursued an MS degree in software systems at BITS Pilani and a B.Tech at LNMIIT, Jaipur. His technological forte is portal management and development. He is currently working with Azilen Technologies, where he acts as a Technical Architect and Project Manager. His previous assignment was as a lead consultant with CIGNEX Datamatics. He was a core member of the open source group at TCS, where he started working on Liferay and other UI technologies. During his career, he has been involved in building enterprise solutions using the latest technologies with rich user interfaces and open source tools. He loves to spend time writing, learning, and discussing new technologies. His interest in search engines and a small crawler project during his college days made him a technology lover. He is one of the authors of Test-Driven JavaScript Development, Packt Publishing. He is an active member of the Liferay forum. He also writes technical articles for his blog at TechD of Computer World (http://techdc.blogspot.in). He has been a Liferay trainer at TCS and CIGNEX, where he has provided training on Liferay 5.x and 6.x versions. He was also a reviewer for Learning Bootstrap, Packt Publishing. He can be reached on Skype at kravigupta, on Twitter at @kravigupta, and on LinkedIn at https://in.linkedin.com/in/kravigupta.

Yuvraj Gupta

Yuvraj Gupta is an author and a keen technologist with an interest in Big Data, Data Analytics, Data Visualization, and Cloud Computing. He has been working as a Big Data Consultant, primarily in the domain of Big Data Testing. He loves to spend time writing on various social platforms. He is an avid gadget lover, a foodie, and a sports enthusiast, and he loves to watch TV series and movies. He always keeps himself updated with the latest happenings in technology. He has authored a book titled Kibana Essentials with Packt Publishing. He can be reached at gupta.yuvraj@gmail.com or on LinkedIn at www.linkedin.com/in/guptayuvraj.
