You're reading from Splunk 9.x Enterprise Certified Admin Guide

Product type: Book
Published in: Aug 2023
Publisher: Packt
ISBN-13: 9781803230238
Edition: 1st Edition
Author (1)
Srikanth Yarlagadda

Srikanth is a highly accomplished IT professional with a diverse range of expertise in the technology industry. Having completed his Masters in Computer Applications in 2009, he has since honed his skills in Java, Oracle SOA, and API development, gaining valuable experience along the way. With over 13 years of experience in the field, Srikanth is now a Splunk Certified Architect and was recently selected to join the esteemed cohort of SplunkTrust in 2022. He has extensive knowledge of various Splunk products, including Splunk Enterprise Security and SOAR, and he is currently dedicated to Threat Detection and Security Automation using Splunk ES & SOAR. Srikanth's impressive work history includes significant roles at major telecom companies across Norway and Pan Europe. Beyond technology, Srikanth's greatest joy is his family. Along with his wife and two children, he calls Australia home and enjoys spending time together while staying active.

Splunk Index Management

Indexes are repositories of data. Splunk Enterprise stores data as events in indexes. An event is a single data record or log entry: a line from a log file, a message from a network source, or any other piece of information that Splunk processes and indexes. So far in this book, we have seen how forwarders are used to monitor and forward data to indexers. You may be wondering how that data is processed and where it is stored in the indexer component. This chapter answers those questions. System administrators need to understand indexes because creating and managing them, controlling access to them, and estimating their storage are all part of day-to-day work.

We will begin by learning about Splunk indexes, including default indexes, and how data is organized into buckets with retention policies. After you are familiar with the core concepts, we will move on to bucket types and their rollover behavior, followed by...

Understanding Splunk indexes

An index is a specific type of data storage inside Splunk Enterprise; put simply, an index is a repository of data. As an analogy, consider looking up your address on an online form: you provide your house number, your street, and your postcode, and together these details identify one specific location. Similarly, when you want to search for specific data in Splunk, you narrow the search down by naming the index that holds it.

There are two types of Splunk indexes. They are called event indexes and metrics indexes. Event indexes store any type of text data, and this is the default index type. Metrics indexes only store metrics data, which must comply with a defined structure. There are special commands in Splunk, usually prefixed with m such as mstats, mpreview, and mcatalog, for working with metrics data. Metrics indexes are a completely different topic and beyond the scope of this book. If you would like to read more about them, please visit https://tinyurl...

Understanding buckets

Buckets are an integral part of indexes; they contain raw data and index files. They are organized in the form of folders on a filesystem with a specific naming pattern. These folders are explicitly used by the Splunk indexer for data storage and search processing.
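To make the naming pattern more concrete, here is a minimal Python sketch that parses a bucket folder name. It assumes the `db_<newest_time>_<oldest_time>_<local_id>` layout used for warm and cold buckets on non-clustered indexers, with both times given as epoch seconds. The names `BucketName` and `parse_bucket_name` are hypothetical helpers for illustration, not part of Splunk.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class BucketName(NamedTuple):
    """Fields encoded in a non-clustered warm/cold bucket folder name."""
    newest: datetime  # timestamp of the newest event in the bucket
    oldest: datetime  # timestamp of the oldest event in the bucket
    local_id: int     # bucket ID, unique within the index on this indexer


# Assumed pattern: db_<newest_epoch>_<oldest_epoch>_<local_id>
_BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_(\d+)$")


def parse_bucket_name(name: str) -> BucketName:
    """Split a bucket folder name into its time range and local ID."""
    match = _BUCKET_RE.match(name)
    if match is None:
        raise ValueError(f"not a non-clustered warm/cold bucket name: {name!r}")
    newest, oldest, local_id = (int(group) for group in match.groups())
    return BucketName(
        newest=datetime.fromtimestamp(newest, tz=timezone.utc),
        oldest=datetime.fromtimestamp(oldest, tz=timezone.utc),
        local_id=local_id,
    )
```

Because the time range is part of the folder name, a search limited to a time window can skip buckets whose range falls outside it without opening them.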

In order to learn a bit more about them, let’s look at the default _introspection index folder structure:

Figure 5.1: Splunk non-clustered index folder structure

Figure 5.1 shows the _introspection index inside the $SPLUNK_DB path. The naming convention is only applicable to non-clustered indexers. Let’s take a look at the indexes.conf file located in the $SPLUNK_HOME/etc/system/default/ directory. It contains the _introspection index settings that correlate with the folder structure in Figure 5.1:

# indexes.conf - _introspection internal index settings
[_introspection]
homePath = $SPLUNK_DB\_introspection\db
coldPath = $SPLUNK_DB\_introspection\colddb...

Creating Splunk indexes

Creating a custom index is a Splunk administrator’s responsibility. As you have seen, data is retained in an index until it reaches a certain age or the index reaches a certain size, so estimating an index’s size before creating it is a crucial step. An accurate estimate ensures that data is retained for the required period, and it helps determine the required storage capacity up front.

In order to estimate the size of the index, we must have the answers to the following questions:

  • Question: How long does the data need to be retained in days?

Answer: For example, 100 days

  • Question: How much data volume per day is expected in GB?

Answer: For example, 1 GB

The formula for index size estimation is as follows:

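Retention days × daily volume × 1/2

Here, 1/2 is a rule-of-thumb factor: the compressed raw data and the index files together occupy roughly half of the original data volume.

If we substitute in the example values, the index size becomes 100 × 1 × 1/2 = 50 GB.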
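The estimation can be sketched in a few lines of Python. This is an illustrative sketch, not an official sizing tool: the function names and the `storage_ratio` parameter are assumptions made for the example, and the 1/2 factor is only the rule of thumb used above. The second function converts the estimate into values for two `indexes.conf` settings, `frozenTimePeriodInSecs` (age limit) and `maxTotalDataSizeMB` (size limit), which correspond to the age and size criteria described earlier.

```python
SECONDS_PER_DAY = 86_400


def estimate_index_size_gb(retention_days: int,
                           daily_volume_gb: float,
                           storage_ratio: float = 0.5) -> float:
    """Estimate on-disk index size: days x daily volume x storage ratio."""
    if retention_days <= 0 or daily_volume_gb <= 0:
        raise ValueError("retention_days and daily_volume_gb must be positive")
    return retention_days * daily_volume_gb * storage_ratio


def retention_settings(retention_days: int, daily_volume_gb: float) -> dict:
    """Translate the estimate into age and size limits for an index stanza."""
    size_gb = estimate_index_size_gb(retention_days, daily_volume_gb)
    return {
        # Age limit: events older than this become eligible for freezing.
        "frozenTimePeriodInSecs": retention_days * SECONDS_PER_DAY,
        # Size limit: total index size in MB (1 GB taken as 1024 MB).
        "maxTotalDataSizeMB": int(size_gb * 1024),
    }


if __name__ == "__main__":
    # Example values from the text: 100 days of retention at 1 GB per day.
    print(estimate_index_size_gb(100, 1))
    print(retention_settings(100, 1))
```

In practice, the ratio depends on the kind of data being indexed, so it is worth comparing the estimate against observed index sizes before committing to a storage plan.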

Now that you know...

Backing up indexes

So far, we have created indexes and stored information in buckets, and modified their configurations in the .conf file. What if the underlying hardware fails and you have critical data and files that cannot be lost? To restore your Splunk instance to its original state, a backup procedure must be set up. For the certification exam, it is important to know which folders of the Splunk installation to back up. The following two essential folders need to be backed up:

  • The $SPLUNK_DB directory: $SPLUNK_HOME/var/lib/splunk/

Hot buckets cannot be backed up directly while Splunk is running because they are still being written to. Instead, take an incremental snapshot of the storage and back up the snapshot.

  • The $SPLUNK_HOME/etc directory: This contains apps, user configurations, system configuration files, and licenses

In Splunk, the $SPLUNK_HOME/etc directory is a critical directory that contains configuration files and settings that control the behavior of the entire Splunk deployment...
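As a rough illustration of backing up the configuration directory, the following Python sketch archives `$SPLUNK_HOME/etc` into a timestamped `.tar.gz` file. The function name `backup_etc`, the archive naming scheme, and the destination layout are assumptions made for this example; it is not an official Splunk backup procedure, and it does not cover the `$SPLUNK_DB` directory or hot buckets.

```python
import tarfile
import time
from pathlib import Path


def backup_etc(splunk_home: str, dest_dir: str) -> Path:
    """Archive <splunk_home>/etc into dest_dir and return the archive path."""
    etc_dir = Path(splunk_home) / "etc"
    if not etc_dir.is_dir():
        raise FileNotFoundError(f"no etc directory under {splunk_home}")

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Timestamped name so that successive backups do not overwrite each other.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"splunk-etc-{stamp}.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to "etc" so the archive restores cleanly.
        tar.add(etc_dir, arcname="etc")
    return archive
```

A call such as `backup_etc("/opt/splunk", "/backups/splunk")` would produce one archive per run; both paths here are placeholders to be replaced with the real installation and backup locations.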

Monitoring Splunk indexes

As an administrator, it can be overwhelming to monitor several indexes as they grow in number. To address this, Splunk monitors indexes through a default monitoring console app. It has many monitoring features, and monitoring indexes is one of them. Let’s take a look at its menu options.

Log in as an administrator and navigate to Settings in the top right. Click on Monitoring Console. The console opens in the following web view. Click on Indexing and select Index Detail: Instance.

Figure 5.4: Indexes dashboard in the monitoring console

The instance shown in Figure 5.4 is a standalone Windows-based Splunk instance, and its monitoring console offers three dashboards for monitoring indexes and their volumes. The Index Detail: Instance dashboard has the following settings:

Figure 5.5: Index Detail: Instance dashboard

The dashboard has drop-down options for Instance (also called the Splunk indexer), and the index that has...

Summary

An index is a named data repository in Splunk. An index can be configured with a few basic settings, and with more advanced settings as it grows larger. We have learned that there are two index types: event indexes and metrics indexes. Event indexes can store any text data, whereas metrics indexes store data that follows a specific metric structure. You learned about the destructive commands delete and clean. These commands should always be used with extreme caution and only when you are certain of their implications. To delete events from an index, a user with the special can_delete role can run the delete command, which applies delete markers so that the events no longer appear in searches, without removing the data from storage. Through the CLI, data can be deleted permanently using Splunk’s clean command.

We also explored the role of Splunk indexers as crucial components of the Splunk architecture. Indexers are responsible for processing and indexing data, storing it efficiently, and responding to search requests...

Self-assessment

You will be given 10 multiple-choice questions. The question patterns are the same as discussed in the Introducing the exam’s test pattern section of Chapter 1, Getting Started with the Splunk Enterprise Certified Admin Exam. When you have finished, refer to the sections with which you are having difficulty. Alternatively, you can refer to the Splunk documentation. All the best! Let’s get started.

  1. An index is a repository of data, and it has a name. Is this statement true or false?
    1. True
    2. False
  2. An indexer is a Splunk instance. What is its role in a Splunk deployment? (Choose all that apply)
    1. The indexer stores data.
    2. The indexer stores data in the $SPLUNK_HOME location.
    3. The indexer responds to search requests issued by the search head.
    4. Data stored in the indexer cannot be deleted.
    5. The indexer by default comes with internal indexes. You can create custom indexes too.
  3. Do metrics indexes store any type of text data?
    1. Yes
    2. No
  4. Choose the Splunk indexes that...