Reader small image

You're reading from  Splunk Essentials - Second Edition

Product typeBook
Published inSep 2016
Publisher
ISBN-139781785889462
Edition2nd Edition
Tools
Right arrow
Authors (3):
Betsy Page Sigman
Betsy Page Sigman
author image
Betsy Page Sigman

Betsy Page Sigman is a distinguished professor at the McDonough School of Business at Georgetown University in Washington, D.C. She has taught courses in statistics, project management, databases, and electronic commerce for the last 16 years, and has been recognized with awards for teaching and service. She has also worked at George Mason University in the past. Her recent publications include a Harvard Business case study and a Harvard Business review article. Additionally, she is a frequent media commentator on technological issues and big data.
Read more about Betsy Page Sigman

Somesh Soni
Somesh Soni
author image
Somesh Soni

Somesh Soni is a Splunk Consultant with over 11 years of IT experience. He has bachelor degree in Computer Science (Hons.) and has been a interested in exploring and learning new technologies throughout his whole life. He has extensive experience in Consulting, Architecture, Administration and Development in Splunk. He's proficient in various programming languages and tools including C#.NET/VB.NET, SSIS, and SQL Server etc. Somesh is currently working as a Splunk Master with Randstad Technologies. His activities are focused on Consulting, Implementation, Admin, Architecture and support related activities for Splunk. He started his career with the one of the Top 3 Indian IT giant He has executed projects for major fortune 500 companies like Coca-Cola, Wells Fargo, Microsoft, Capital Group etc. He has performed in various capacities of Technical Architect, Technical Lead, Onsite Coordinator, Technology Analyst etc. Somesh has been a great contributor in the Splunk Community work and has consistently been on the top of the list. He is a member of Splunk Trust 2015-16 and overall one of the topmost contributor to Splunk Answers community. Acknowledgement: I would like to thank my family and colleagues who have always encouraged and supported me to follow my dreams, my friends who put up with all my crazy antics while I went on a Splunk exploratory Journey and listened with patience on all the tips and tricks of Splunk which I shared with them. Last but not the least I would like to express my gratitude to the entire team of Packt Publishing Ltd for giving me this opportunity.
Read more about Somesh Soni

Erickson Delgado
Erickson Delgado
author image
Erickson Delgado

Erickson Delgado is an enterprise architect who loves to mine and analyze data. He began using Splunk in version 4.0 and has pioneered the use of the application in his current work. In the earlier parts of his career, he worked with start-up companies in the Philippines to help build their open source infrastructure. He then worked in the cruise industry as a shipboard IT manager, and he loved it. From there, he was recruited to work at the company's headquarters as a software engineer.
Read more about Erickson Delgado

View More author details
Right arrow

Chapter 5.  Data Optimization, Reports, Alerts, and Accelerating Searches

Finding the data that you need in Splunk is relatively easy, as you have seen in the previous chapters. Doing the same thing repeatedly, however, requires that you employ techniques that make data retrieval faster. In Chapter 2, Bringing in Data, you have been shown how to use data fields and to make field extractions. In Chapter 4, Data Models and Pivot, you learned how to create data models. You will continue that journey in this chapter by learning how to classify your data using event types, enrich your data using lookups and workflow actions, and normalize your data using tags.

Once you have all these essentials in place, you will be able to easily create reports, alerts, and dashboards. This is where Splunk really shines and your hard work so far will pay off.

In this chapter, we will cover a wide range of topics that showcase ways to manage, analyze, and get results from data. These topics will help you learn...

Data classification with event types


When you begin working with Splunk every day, you will quickly notice that many things are repeatable. In fact, while going through this book, you may have seen that search queries can easily get longer and more complex. One way to make things easier and shorten search queries is to create event types. Event types are not the same as events; an event is just a single instance of data. An event type is a grouping or classification of events that meet the same criteria.

If you took a break between chapters, you will probably want to open up Splunk again. Then you will execute a search command:

  1. Open up Splunk.

  2. Click on your Destinations app.

  3. Type in this query:

      SPL> index=main http_uri=/booking/confirmation http_status_code=200

This data will return successful booking confirmations. Now say you want to search for this the next day. Without any data classification, you'll have to type the same search string as previously. Instead of tedious repetition...

Data normalization with tags


Tags in Splunk are useful for grouping events with related field values. Unlike event types, which are based on specified search commands, tags are created and mapped to specific fields. You can also have multiple tags assigned to the same field, and each tag can be assigned to that field for a specific reason.

The simplest use-case scenario when using tags is for classifying IP addresses. In our Eventgen logs, three IP addresses are automatically generated. We will create tags against these IP addresses that would allow us to classify them based on different conditions:

IP address

Tags

10.2.1.33

main, patched, east

10.2.1.34

main, patched, west

10.2.1.35

backup, east

In our server farm of three servers, we are going to group them by purpose, patch status, and geolocation. We will achieve this using tags, as shown in the following steps:

  1. Begin by using the following search command:

          SPL> index=main server_ip=10.2.1.33
    
  2. Expand the first event by clicking...

Data enrichment with lookups


Occasionally you will come across pieces of data that you wish were rendered in a more readable manner. A common example is HTTP status codes. Computer engineers are often familiar with status codes as three-digit numbers. Business analysts, however, would not necessarily know the meaning of these codes. In Splunk, you solve this predicament by using lookup tables, which can pair numbers or acronyms with more understandable text classifiers.

A lookup table is a mapping of keys and values that Splunk can query so it can translate fields into more meaningful information at search time. This is best understood through an example. You can go through the following steps:

  1. From the Destinations app, click on Settings and then Lookups:

  2. In the Lookups page, click on the Add new option next to Lookup table files, as shown in the following screenshot:

  3. In the Add new page, make sure that the Destinations app is selected.

  4. Then, using the following screenshot as your guide, in...

Creating reports


So far in this chapter, you have learned how to do three very important things: classify data using event types, normalize data using tags, and enrich data using lookup tables. All these, in addition to Chapter 4, Data Models and Pivot, constitute the essential foundation you need to use Splunk in an efficient manner. Now it is time to put them all to good use.

Splunk reports are reusable searches that can be shared to others or saved as a dashboard. Reports can also be scheduled periodically to perform an action, for example to be sent out as an e-mail. Reports can also be configured to display search results in a statistical table, as well as visualization charts. You can create a report through the search command line or through a Pivot. Here we will create a report using the search command line:

  1. In the Destinations app's search page, type in this command:

          SPL> eventtype=bad_logins | top client_ip
    

    The search is trying to find all client IP addresses that attempted...

Creating alerts


Alerts are crucial in IT operations. They provide real-time awareness of the state of the systems. Alerts also enable you to act fast when an issue has been detected prior to waiting for a user to report it. Sure enough, you can have a couple of data center operators monitor your dashboards, but nothing jolts their vigil more than an informative alert.

Now, alerts are only good if they are controlled and if they provide enough actionable information. Splunk allows you to do just that. In this section, we will walk you through how to create an actionable alert and how to throttle the alerting to avoid flooding your mailbox.

The exercises in this section will show you how to create an alert, but in order to generate the actual e-mail alert, you will need a mail server. This book will not cover mail servers but the process of creating the alert will be shown in full detail.

We want to know when there are instances of a failed booking scenario. This event type was constructed with...

Search and report acceleration


In Chapter 4, Data Models and Pivot, you learned how to accelerate a data model to speed up retrieval of data. The same principle applies to saved searches or reports:

  1. Click on the Reports link in the navigation menu of the Destinations app.

  2. Click on the Edit | Edit Acceleration option in the Bookings Last 24 Hrs report.

  3. Enable 1 Day acceleration as seen in the following screenshot:

  4. To check the progress of your report's acceleration, click on Settings | Report Acceleration Summaries:

Scheduling best practices


No matter how advanced and well-scaled your Splunk infrastructure is, if all scheduled searches and reports are running at the same time, the system will start experiencing issues. Typically you will receive a Splunk message saying that you have reached the limit of concurrent or historical searches. Suffice to say that there are only a certain number of searches that can be run on CPU core for each Splunk instance. The very first issue a beginner Splunk admin faces is how to limit the number of concurrent searches running at the same time. One way to fix this is to throw more servers into the Splunk cluster, but that is not the efficient way.

The trick to establishing a robust system is to properly stagger and budget scheduled searches and reports. This means ensuring that they are not running at the same time. There are two ways to achieve this:

  • Time windows: The first way to ensure that searches are not running concurrently is to always set a time window. You have...

Summary indexing


In a matter of days, Splunk will accumulate data and start to move events into the cold bucket. If you recall, the cold bucket is where data is stored to disk. You will still be able to access this data but you are bound by the speed of the disk. Compound that with the millions of events that are typical with an enterprise Splunk implementation, and you can understand how your historical searches can slow down at an exponential rate.

There are two ways to circumvent this problem, one of which you have already performed: search acceleration and summary indexing.

With summary indexing, you run a scheduled search and output the results into an index called summary. The result will only show the computed statistics of the search. This results in a very small subset of data that will seemingly be faster to retrieve than going through the entirety of the events in the cold bucket.

Say, for example, you wish to keep track of all counts of an error in payment and you wish to keep the...

Summary


In this chapter, you have learned how to optimize data in three ways: classifying your data using event types, normalizing your data using tags, and enriching your data using lookup tables. You have also learned how to create advanced reports and alerts. You have accelerated your searches just like you did with data models. You have been introduced to the powerful Cron expression, which allows you to create granularity on your scheduled searches, and you have also been shown how to stagger your searches using time windows. Finally, you have created a summary index that allows you to search historical data faster. In the next chapter, Chapter 6, Panes of Glass, you will go on to learn more about how to do visualizations.

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Splunk Essentials - Second Edition
Published in: Sep 2016Publisher: ISBN-13: 9781785889462
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Authors (3)

author image
Betsy Page Sigman

Betsy Page Sigman is a distinguished professor at the McDonough School of Business at Georgetown University in Washington, D.C. She has taught courses in statistics, project management, databases, and electronic commerce for the last 16 years, and has been recognized with awards for teaching and service. She has also worked at George Mason University in the past. Her recent publications include a Harvard Business case study and a Harvard Business review article. Additionally, she is a frequent media commentator on technological issues and big data.
Read more about Betsy Page Sigman

author image
Somesh Soni

Somesh Soni is a Splunk Consultant with over 11 years of IT experience. He has bachelor degree in Computer Science (Hons.) and has been a interested in exploring and learning new technologies throughout his whole life. He has extensive experience in Consulting, Architecture, Administration and Development in Splunk. He's proficient in various programming languages and tools including C#.NET/VB.NET, SSIS, and SQL Server etc. Somesh is currently working as a Splunk Master with Randstad Technologies. His activities are focused on Consulting, Implementation, Admin, Architecture and support related activities for Splunk. He started his career with the one of the Top 3 Indian IT giant He has executed projects for major fortune 500 companies like Coca-Cola, Wells Fargo, Microsoft, Capital Group etc. He has performed in various capacities of Technical Architect, Technical Lead, Onsite Coordinator, Technology Analyst etc. Somesh has been a great contributor in the Splunk Community work and has consistently been on the top of the list. He is a member of Splunk Trust 2015-16 and overall one of the topmost contributor to Splunk Answers community. Acknowledgement: I would like to thank my family and colleagues who have always encouraged and supported me to follow my dreams, my friends who put up with all my crazy antics while I went on a Splunk exploratory Journey and listened with patience on all the tips and tricks of Splunk which I shared with them. Last but not the least I would like to express my gratitude to the entire team of Packt Publishing Ltd for giving me this opportunity.
Read more about Somesh Soni

author image
Erickson Delgado

Erickson Delgado is an enterprise architect who loves to mine and analyze data. He began using Splunk in version 4.0 and has pioneered the use of the application in his current work. In the earlier parts of his career, he worked with start-up companies in the Philippines to help build their open source infrastructure. He then worked in the cruise industry as a shipboard IT manager, and he loved it. From there, he was recruited to work at the company's headquarters as a software engineer.
Read more about Erickson Delgado