Machine Learning with the Elastic Stack - Second Edition

By Rich Collier, Camilla Montonen, Bahaaldine Azarmi

About this book

The Elastic Stack, previously known as the ELK stack, is a log analysis solution that helps users ingest, process, and analyze search data effectively. With the addition of machine learning, a key commercial feature, the Elastic Stack makes this process even more efficient. This updated second edition of Machine Learning with the Elastic Stack provides a comprehensive overview of the Elastic Stack's machine learning features for both time series data analysis and classification, regression, and outlier detection.

The book starts by explaining machine learning concepts in an intuitive way. You'll then perform time series analysis on different types of data, such as log files, network flows, application metrics, and financial data. As you progress through the chapters, you'll deploy machine learning within Elastic Stack for logging, security, and metrics. Finally, you'll discover how data frame analysis opens up a whole new set of use cases that machine learning can help you with.

By the end of this Elastic Stack book, you'll have hands-on machine learning and Elastic Stack experience, along with the knowledge you need to incorporate machine learning in your distributed search and data analysis platform.

Publication date: May 2021
Publisher: Packt
Pages: 450
ISBN: 9781801070034

 

Chapter 2: Enabling and Operationalization

We have just learned the basics of how Elastic ML accomplishes both unsupervised automated anomaly detection and supervised data frame analysis. Now it is time to look in detail at how Elastic ML works inside the Elastic Stack (Elasticsearch and Kibana).

This chapter will focus on both the installation (really, the enablement) of Elastic ML features and a detailed discussion of the logistics of the operation, especially with respect to anomaly detection. Specifically, we will cover the following topics:

  • Enabling Elastic ML features
  • Understanding operationalization
 

Technical requirements

The information in this chapter is based on the Elastic Stack as it exists in v7.10 and on the workflow of the Elasticsearch Service of Elastic Cloud as of November 2020.

 

Enabling Elastic ML features

The process for enabling Elastic ML features inside the Elastic Stack differs slightly depending on whether you are working within a self-managed cluster or using the Elasticsearch Service (ESS) of Elastic Cloud. In short, on a self-managed cluster, the features of ML are enabled via a license key (either a commercial key or a trial key). In ESS, a dedicated ML node needs to be provisioned within the cluster in order to utilize Elastic ML. In the following sections, we will explain the details of how this is accomplished in both scenarios.
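In either scenario, it can be useful to confirm which nodes in a cluster are able to run ML work. The following is a minimal sketch, not taken from the book, that assumes the JSON body returned by Elasticsearch's GET /_nodes API has already been fetched and parsed into a Python dictionary. The helper name ml_node_names and the sample node IDs and names are hypothetical and used only for illustration.

```python
def ml_node_names(nodes_info):
    """Return the sorted names of nodes whose roles include "ml".

    `nodes_info` is assumed to be the parsed JSON body of GET /_nodes,
    which holds a "nodes" object keyed by node ID, where each node
    carries a "name" and a list of "roles".
    """
    return sorted(
        node.get("name", node_id)
        for node_id, node in nodes_info.get("nodes", {}).items()
        if "ml" in node.get("roles", [])
    )


# Illustrative, hand-written sample shaped like a GET /_nodes response
# (hypothetical node IDs and names, not real cluster output).
sample_nodes_info = {
    "nodes": {
        "node-id-1": {"name": "data-node-0", "roles": ["data", "ingest", "master"]},
        "node-id-2": {"name": "ml-node-0", "roles": ["ml"]},
    }
}

names = ml_node_names(sample_nodes_info)
if names:
    print("Nodes with the ml role:", ", ".join(names))
else:
    print("No node in this cluster has the ml role.")
```

An empty result suggests that no node is currently eligible to run ML jobs, which on ESS would correspond to an ML node not yet having been provisioned.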

Enabling ML on a self-managed cluster

If you have a self-managed cluster that was created from Elastic's default distributions of Elasticsearch and Kibana (available at elastic.co/downloads/), enabling Elastic ML features via a license key is very simple. Be sure not to use the Apache 2.0 licensed open source distributions, which do not contain the X-Pack code base.
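As a rough illustration of the license-based route, the sketch below builds (but does not send) a request to Elasticsearch's start-trial license endpoint, and inspects a parsed GET /_xpack response to see whether the ML feature is reported as available and enabled. This is not the book's own procedure. The base URL, the helper names build_start_trial_request and ml_is_usable, and the sample response are assumptions made for illustration, and a real cluster would typically also require authentication.

```python
import urllib.request

# Assumption: a local, self-managed cluster on the default HTTP port.
ES_URL = "http://localhost:9200"


def build_start_trial_request(base_url=ES_URL):
    """Build a POST request for the start-trial license endpoint.

    The request is only constructed here; sending it (for example with
    urllib.request.urlopen) is left to the caller.
    """
    url = f"{base_url}/_license/start_trial?acknowledge=true"
    return urllib.request.Request(url, method="POST")


def ml_is_usable(xpack_info):
    """Check a parsed GET /_xpack response for an available, enabled ML feature."""
    ml = xpack_info.get("features", {}).get("ml", {})
    return bool(ml.get("available")) and bool(ml.get("enabled"))


# Illustrative, hand-written fragment shaped like a GET /_xpack response.
sample_xpack_info = {"features": {"ml": {"available": True, "enabled": True}}}

request = build_start_trial_request()
print(request.get_method(), request.full_url)
print("ML usable:", ml_is_usable(sample_xpack_info))
```

Keeping request construction separate from sending makes the sketch safe to run without a cluster. In practice, the same trial can also be started from the license management page in Kibana.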

Elastic ML, unlike the bulk...

 

Understanding operationalization

At some point in your journey with Elastic ML, it will be helpful to understand a number of key concepts regarding how Elastic ML is operationalized within the Elastic Stack. This includes information about how the analytics run on the cluster nodes and how the data to be analyzed by ML is retrieved and processed.

Note

Some concepts in this section may not be intuitive until you actually start using Elastic ML on some real examples. Feel free to skim (or even skip) this section now and return to it later, once you have some genuine experience of using Elastic ML.

ML nodes

First and foremost, since Elasticsearch is, by nature, a distributed multi-node solution, it is only natural that the ML feature of the Elastic Stack works as a native plugin that obeys many of the same operational concepts. As described in the documentation (elastic.co/guide/en/elasticsearch/reference/current/ml-settings.html),...

 

Summary

To summarize, in this chapter, we covered the procedures for enabling Elastic ML's features both in a self-managed on-premises Elastic Stack and within the Elasticsearch Service of Elastic Cloud. Additionally, we looked under the hood to see the deep integration points with the rest of the Elastic Stack and how Elastic ML works from an operational perspective.

As we look ahead to future chapters, the focus will shift from conceptual and background information to practical usage. Starting with the next chapter, we will jump right into the comprehensive capabilities of Elastic ML's anomaly detection and learn how to configure jobs to solve some practical use cases in log analytics, metric analysis, and user behavior analytics.

About the Authors

  • Rich Collier

    Rich Collier is a solutions architect at Elastic. Joining the Elastic team from the Prelert acquisition, Rich has over 20 years' experience as a solutions architect and pre-sales systems engineer for software, hardware, and service-based solutions. Rich's technical specialties include big data analytics, machine learning, anomaly detection, threat detection, security operations, application performance management, web applications, and contact center technologies. Rich is based in Boston, Massachusetts.

  • Camilla Montonen

    Camilla Montonen is a Senior Machine Learning Engineer at Elastic.

  • Bahaaldine Azarmi

    Bahaaldine Azarmi, or Baha for short, is the head of solutions architecture in the EMEA South region at Elastic. Prior to this position, Baha co-founded ReachFive, a marketing data platform focused on user behavior and social analytics. He has also worked for a number of different software vendors, including Talend and Oracle, where he held positions as a solutions architect and architect. Prior to Machine Learning with the Elastic Stack, Baha authored books including Learning Kibana 5.0, Scalable Big Data Architecture, and Talend for Big Data. He is based in Paris and holds an MSc in computer science from Polytech'Paris.
