You're reading from Getting Started with Elastic Stack 8.0

Product type: Book

Published in: Mar 2022

Publisher: Packt

ISBN-13: 9781800569492

Edition: 1st Edition

Tools

Elasticsearch

Concepts

Enterprise Search

Author (1)

Asjad Athick

Chapter 4: Leveraging Insights and Managing Data on Elasticsearch

In the previous chapter, we looked at getting data into an Elasticsearch cluster and running searches to return relevant results for our application. This chapter will focus on how this data can be leveraged to gain analytical insights. We will also look at some important features that help with manipulating, transforming, and managing data sources when building your use cases.

Specifically, we will focus on the following topics:

Aggregating data for analytical insights
Managing the life cycle of time series data
Manipulating data using ingest pipelines
Responding to changes in data with Watcher

Technical requirements

This chapter builds on the work we did in Chapter 3, Indexing and Searching for Data. The code and relevant artifacts for this chapter can be found in this chapter's folder in the GitHub repository for this book:

https://github.com/PacktPublishing/Getting-Started-with-Elastic-Stack-8.0/tree/main/Chapter4

Getting insights from data using aggregations

When looking for insights in your data, retrieving the documents that fit the question you want to answer is just the first part of the problem. For example, if an analyst wants to understand how much traffic their web servers served on a given day, running a query to retrieve logs for that period may still return millions of events.

Aggregations allow you to summarize large volumes of data into something easier to consume. Elasticsearch can perform two primary types of aggregations:

Metric aggregations can calculate metrics such as count, sum, min, max, and average on numeric data.
Bucket aggregations can be used to organize large datasets into groups, depending on the value of a field. Buckets can be created based on a range, date, the frequency of a term in the search results (or corpus), and so on.
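
The two aggregation types can be combined, as in the following minimal sketch. It builds a search request body that groups events with a terms bucket aggregation and sums a numeric field inside each bucket, then computes the same result in plain Python on a few sample events. The field names (status_code, bytes), the aggregation names, and the sample events are assumptions made for illustration; they do not come from the book's dataset.

```python
import json
from collections import defaultdict


def build_traffic_aggregation(status_field, bytes_field):
    """Build a search body with a terms bucket and a nested sum metric."""
    return {
        "size": 0,  # skip individual hits; only aggregation results are needed
        "aggs": {
            "by_status": {  # bucket aggregation: one bucket per distinct value
                "terms": {"field": status_field},
                "aggs": {
                    # metric aggregation computed inside each bucket
                    "total_bytes": {"sum": {"field": bytes_field}}
                },
            }
        },
    }


def aggregate_locally(events, status_field, bytes_field):
    """Compute the same grouping and sum in plain Python for comparison."""
    totals = defaultdict(int)
    for event in events:
        totals[event[status_field]] += event[bytes_field]
    return dict(totals)


# Hypothetical web server log events (not taken from the book's dataset).
sample_events = [
    {"status_code": 200, "bytes": 1024},
    {"status_code": 200, "bytes": 512},
    {"status_code": 404, "bytes": 256},
]

request_body = build_traffic_aggregation("status_code", "bytes")
print(json.dumps(request_body, indent=2))
print(aggregate_locally(sample_events, "status_code", "bytes"))
```

The printed JSON is the kind of body that would be sent to the _search endpoint of an index; the local function only illustrates what the buckets and the metric represent.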

An exhaustive list of all supported aggregations can be found in the Elasticsearch guide...

Managing the life cycle of time series data

Most machine data sources can be characterized as time series data. Logs and metrics usually include a timestamp recording when the event occurred or was observed. This type of data is rarely updated after it is ingested; changes in information are instead recorded as new events.

The following documents illustrate the append-only nature of time series data:

[
      {
          "sensor_name" : "living_room",
          "lights_on" : 1,
          "timestamp" : "2021-02-14T00:00:00.000Z"
      },
      {
          "sensor_name" : "living_room",
     ...
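
The append-only pattern can be sketched in a few lines of Python. In this illustration, a change in a sensor's state is appended as a new event rather than applied as an update, and the current state is derived by taking the most recent event per sensor. The sensor names and values below are hypothetical and are not the documents elided above.

```python
def latest_state(events):
    """Return the most recent event per sensor from an append-only list."""
    latest = {}
    for event in events:
        current = latest.get(event["sensor_name"])
        # ISO 8601 UTC timestamps in the same format sort lexicographically.
        if current is None or event["timestamp"] > current["timestamp"]:
            latest[event["sensor_name"]] = event
    return latest


# Hypothetical sensor events, used only for illustration.
events = [
    {"sensor_name": "kitchen", "lights_on": 1, "timestamp": "2021-02-14T00:00:00.000Z"},
    {"sensor_name": "hallway", "lights_on": 0, "timestamp": "2021-02-14T00:00:00.000Z"},
]

# The kitchen lights are switched off: record a new event instead of
# modifying the earlier one.
events.append(
    {"sensor_name": "kitchen", "lights_on": 0, "timestamp": "2021-02-14T06:00:00.000Z"}
)

state = latest_state(events)
print(state["kitchen"]["lights_on"])
```

Because earlier events are never modified, the full history of each sensor remains available for later analysis.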

Manipulating incoming data with ingest pipelines

Elasticsearch is a "schema on write" data store. Once a document has been indexed into Elasticsearch, the field names and values that have been indexed cannot be changed unless the document is reindexed. Therefore, documents must be parsed, transformed, and cleansed before ingestion.

Runtime fields can be used to compute or evaluate the value of a field at query time. They can manipulate and transform field values when searching for data, but they can be costly and slow when evaluated across a large volume of search requests. Runtime fields are intended for temporary or one-off changes to data, rather than for transformations applied on every search request.
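
As a rough sketch of a query-time transformation, the following Python snippet builds a search request body that defines a runtime field with a Painless script. The field names (bytes, kilobytes) are assumptions chosen for illustration, and the body is only printed here rather than sent to a cluster.

```python
import json


def build_runtime_field_search(runtime_field, source_field):
    """Build a search body that derives a double field at query time."""
    return {
        "runtime_mappings": {
            runtime_field: {
                "type": "double",
                "script": {
                    # Painless script evaluated per document when the search runs
                    "source": f"emit(doc['{source_field}'].value / 1024.0)"
                },
            }
        },
        # Ask for the computed value to be returned with each hit.
        "fields": [runtime_field],
        "query": {"match_all": {}},
    }


# Hypothetical field names: "bytes" is indexed, "kilobytes" is computed.
search_body = build_runtime_field_search("kilobytes", "bytes")
print(json.dumps(search_body, indent=2))
```

The script runs for every matching document on every search that uses the field, which is why this approach suits occasional exploration better than frequently repeated queries.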

Ingest pipelines on Elasticsearch offer lightweight and convenient data transformation and manipulation functionality for when an ETL tool such as Logstash is not used. As ingest pipelines run on Elasticsearch nodes, they can scale easily as part of the...
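
To make this more concrete, here is a minimal sketch of what an ingest pipeline definition might look like, written as a Python dictionary. It chains the rename, lowercase, and convert processors, and wraps the pipeline in a request body of the kind accepted by the pipeline simulate API. The field names and the sample document are assumptions for illustration only.

```python
import json

# Hypothetical pipeline: normalise a host name and make a size field numeric.
pipeline = {
    "description": "Illustrative clean-up of incoming log documents",
    "processors": [
        # Move the value to a different field name.
        {"rename": {"field": "hostname", "target_field": "host.name", "ignore_missing": True}},
        # Normalise the case of the renamed field.
        {"lowercase": {"field": "host.name", "ignore_missing": True}},
        # Turn a string such as "2048" into a number.
        {"convert": {"field": "bytes", "type": "long", "ignore_missing": True}},
    ],
}

# Request body for testing the pipeline against sample documents
# without indexing anything.
simulate_body = {
    "pipeline": pipeline,
    "docs": [{"_source": {"hostname": "WEB-01", "bytes": "2048"}}],
}

print(json.dumps(simulate_body, indent=2))
```

Processors run in the order in which they are listed, so the lowercase step refers to the field name produced by the rename step.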

Responding to changing data with Watcher

From the previous sections, we know how to search for data, aggregate it for analytics, and transform documents so that they comply with the desired schema. These capabilities power user-driven data exploration and visualization (using frontend tools such as Kibana). The same capabilities can also be used to provide automated alerting and response actions for your incoming data.

Watcher is a flexible tool that can be used to solve various alerting use cases. The following list describes some of the common alerting use cases:

Alert on a singular event with a particular value:
  a. Alert when event.severity: critical
  b. Alert when disk_free < 1GB

Alert if event count matching a filter exceeds a threshold:
  a. Alert if 10 or more events with event.severity: critical have occurred in the last 5 mins.
  b. Alert if 5 or more login_failed events per username have occurred in the last 5 mins.

Alert...
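
The threshold use case above (10 or more critical events in the last 5 minutes) can be sketched as a watch definition with a schedule trigger, a search input, a compare condition, and a logging action. The following Python snippet only builds and prints the definition. The index pattern, the @timestamp field, and the action name are assumptions for illustration, not values taken from the book.

```python
import json


def build_threshold_watch(index_pattern, threshold, window):
    """Build a watch that fires when enough critical events occur in a window."""
    return {
        # Run the watch once per window, for example every 5 minutes.
        "trigger": {"schedule": {"interval": window}},
        "input": {
            "search": {
                "request": {
                    "indices": [index_pattern],
                    "body": {
                        "size": 0,
                        "query": {
                            "bool": {
                                "filter": [
                                    {"term": {"event.severity": "critical"}},
                                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                                ]
                            }
                        },
                    },
                }
            }
        },
        # Fire only if the number of matching events reaches the threshold.
        "condition": {"compare": {"ctx.payload.hits.total": {"gte": threshold}}},
        "actions": {
            "log_alert": {  # hypothetical action name
                "logging": {
                    "text": "{{ctx.payload.hits.total}} critical events in the last " + window
                }
            }
        },
    }


# Hypothetical index pattern; threshold and window follow the example above.
watch = build_threshold_watch("logs-*", 10, "5m")
print(json.dumps(watch, indent=2))
```

The logging action simply writes a message to the Elasticsearch log; other action types could be substituted where a notification or a response action is needed.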

Summary

In this chapter, we learned how data in Elasticsearch can be aggregated for statistical insights. We explored how metric and bucket aggregations help slice and dice a large dataset for analysis.

We also looked at how ingest pipelines can be used to manipulate and transform incoming data to prepare it for use cases on Elasticsearch. We explored a range of common use cases for ingest pipelines in this section.

Lastly, we looked at how Watcher can be used to implement alerting and response actions to changes in data. Again, we explored a range of common alerting use cases in this section.

In the next chapter, we will get started with machine learning jobs to find anomalies in our data, run inference on new documents using the inference ingest processor, and run transformation jobs to pivot incoming datasets for machine learning.



Asjad Athick is a security specialist at Elastic with demonstrable experience in architecting enterprise-scale solutions on the cloud. He believes in empowering people with the right tools to help them achieve their goals. At Elastic, he works with a broad range of customers across Australia and New Zealand to help them understand their environment; this allows them to build robust threat detection, prevention, and response capabilities. He previously worked in the telecommunications space to build a security capability to help analysts identify and contextualize unknown cyber threats. With a background in application development and technology consulting, he has worked with various small businesses and start-up organizations across Australia.