Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Machine Learning with the Elastic Stack - Second Edition

You're reading from  Machine Learning with the Elastic Stack - Second Edition

Product type Book
Published in May 2021
Publisher Packt
ISBN-13 9781801070034
Pages 450 pages
Edition 2nd Edition
Languages
Authors (3):
Rich Collier Rich Collier
Profile icon Rich Collier
Camilla Montonen Camilla Montonen
Profile icon Camilla Montonen
Bahaaldine Azarmi Bahaaldine Azarmi
Profile icon Bahaaldine Azarmi
View More author details

Table of Contents (19) Chapters

Preface 1. Section 1 – Getting Started with Machine Learning with Elastic Stack
2. Chapter 1: Machine Learning for IT 3. Chapter 2: Enabling and Operationalization 4. Section 2 – Time Series Analysis – Anomaly Detection and Forecasting
5. Chapter 3: Anomaly Detection 6. Chapter 4: Forecasting 7. Chapter 5: Interpreting Results 8. Chapter 6: Alerting on ML Analysis 9. Chapter 7: AIOps and Root Cause Analysis 10. Chapter 8: Anomaly Detection in Other Elastic Stack Apps 11. Section 3 – Data Frame Analysis
12. Chapter 9: Introducing Data Frame Analytics 13. Chapter 10: Outlier Detection 14. Chapter 11: Classification Analysis 15. Chapter 12: Regression 16. Chapter 13: Inference 17. Other Books You May Enjoy Appendix: Anomaly Detection Tips

The advent of automated anomaly detection

ML, while a very broad topic that encompasses everything from self-driving cars to game-winning computer programs, was a natural place to look for a solution. If you realize that most of the requirements of effective application monitoring or security threat hunting are merely variations on the theme of find me something that is different from normal, then the discipline of anomaly detection emerges as the natural place to begin using ML techniques to solve these problems for IT professionals.

The science of anomaly detection is certainly nothing new, however. Many very smart people have researched and employed a variety of algorithms and techniques for many years. However, the practical application of anomaly detection for IT data poses some interesting constraints that make the otherwise academically worthy algorithms inappropriate for the job. These include the following:

  • Timeliness: Notification of an outage, breach, or other significant anomalous situation should be known as quickly as possible to mitigate it. The cost of downtime or the risk of a continued security compromise is minimized if remedied or contained quickly. Algorithms that cannot keep up with the real-time nature of today's IT data have limited value.
  • Scalability: As mentioned earlier, the volume, velocity, and variation of IT data continue to explode in modern IT environments. Algorithms that inspect this vast data must be able to scale linearly with the data to be usable in a practical sense.
  • Efficiency: IT budgets are often highly scrutinized for wasteful spending, and many organizations are constantly being asked to do more with less. Tacking on an additional fleet of super-computers to run algorithms is not practical. Rather, modest commodity hardware with typical specifications must be able to be employed as part of the solution.
  • Generalizability: While highly specialized data science is often the best way to solve a specific information problem, the diversity of data in IT environments drives a need for something that can be broadly applicable across most use cases. Reusability of the same techniques is much more cost-effective in the long run.
  • Adaptability: Ever-changing IT environments will quickly render a brittle algorithm useless in no time. Training and retraining the ML model would only introduce yet another time-wasting venture that cannot be afforded.
  • Accuracy: We already know that alert fatigue from legacy threshold and rule-based systems is a real problem. Swapping one false alarm generator for another will not impress anyone.
  • Ease of use: Even if all of the previously mentioned constraints could be satisfied, any solution that requires an army of data scientists to implement it would be too costly and would be disqualified immediately.

So, now we are getting to the real meat of the challenge—creating a fast, scalable, accurate, low-cost anomaly detection solution that everyone will use and love because it works flawlessly. No problem!

As daunting as that sounds, Prelert founder and CTO Steve Dodson took on that challenge back in 2010. While Dodson certainly brought his academic chops to the table, the technology that would eventually become Elastic ML had its genesis in the throes of trying to solve real IT application problems—the first being a pesky intermittent outage in a trading platform at a major London finance company. Dodson, and a handful of engineers who joined the venture, helped the bank's team use the anomaly detection technology to automatically surface only the needles in the haystacks that allowed the analysts to focus on the small set of relevant metrics and log messages that were going awry. The identification of the root cause (a failing service whose recovery caused a cascade of subsequent network problems that wreaked havoc) ultimately brought stability to the application and prevented the need for the bank to spend lots of money on the previous solution, which was an unplanned, costly network upgrade.

As time passed, however, it became clear that even that initial success was only the beginning. A few years and a few thousand real-world use cases later, the marriage of Prelert and Elastic was a natural one—a combination of a platform making big data easily accessible and technology that helped overcome the limitations of human analysis.

Fast forward to 2021, a full 5 years after the joining of forces, and Elastic ML has come a long way in the maturation and expansion of capabilities of the ML platform. This second edition of the book encapsulates the updates made to Elastic ML over the years, including the introduction of integrations into several of the Elastic solutions around observability and security. This second edition also includes the introduction of "data frame analytics," which is discussed extensively in the third part of the book. In order to get a grounded, innate understanding of how Elastic ML works, we first need to get to grips with some terminology and concepts to understand things further.

You have been reading a chapter from
Machine Learning with the Elastic Stack - Second Edition
Published in: May 2021 Publisher: Packt ISBN-13: 9781801070034
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}