Splunk 9.x Enterprise Certified Admin Guide

By Srikanth Yarlagadda
About this book
The IT sector's appetite for Splunk and skilled Splunk developers continues to surge, offering more opportunities for developers with each passing decade. If you want to enhance your career as a Splunk Enterprise administrator, then Splunk 9.x Enterprise Certified Admin Guide will not only aid you in excelling on your exam but also pave the way for a successful career. You’ll begin with an overview of Splunk Enterprise, including installation, license management, user management, and forwarder management. Additionally, you’ll delve into indexes management, including the creation and management of indexes used to store data in Splunk. You’ll also uncover config files, which are used to configure various settings and components in Splunk. As you advance, you’ll explore data administration, including data inputs, which are used to collect data from various sources, such as log files, network protocols (TCP/UDP), APIs, and agentless inputs (HEC). You’ll also discover search-time and index-time field extraction, used to create reports and visualizations, and help make the data in Splunk more searchable and accessible. The self-assessment questions and answers at the end of each chapter will help you gauge your understanding. By the end of this book, you’ll be well versed in all the topics required to pass the Splunk Enterprise Admin exam and use Splunk features effectively.
Publication date:
August 2023
Publisher
Packt
Pages
256
ISBN
9781803230238

 

Getting Started with the Splunk Enterprise Certified Admin Exam

Let’s get started with Splunk Enterprise. By the end of this chapter, you should understand what Splunk Enterprise is and its rich set of features and be able to list the Splunk components that work together to get business insights out of data. You will also learn about the installation of standalone Splunk Enterprise in a Windows environment, along with advanced Splunk Validated Architectures (SVAs) covering all the Splunk components. Throughout the book, you’ll often find us using the terms Splunk Enterprise and Splunk interchangeably. They both refer to the product itself. You will rarely find references to Splunk Inc., which refers to the company that developed and offers the Splunk Enterprise product.

This chapter covers the following topics to get you started:

  • Introducing the certification exam
  • The weightage of topics in the exam
  • Introducing the exam’s test pattern
  • What is Splunk Enterprise?
  • Introducing Splunk 9.x Enterprise features
  • Understanding Splunk components
  • SVAs
  • Splunk installation—standalone
  • Self-assessment
 

Introducing the certification exam

Passing the Splunk Enterprise Admin exam is the prerequisite for attaining the Splunk Enterprise Certified Admin certification. The exam contains 56 questions to be answered in 57 minutes, and you get an extra 3 minutes to review the exam agreement, bringing the total seat time to 60 minutes. Successful candidates are issued a digital certificate along with Splunk digital badges. To be eligible to sit the Splunk Enterprise Admin certification exam, you should have already passed the Splunk Core Certified Power User exam and obtained that certification.

The exam tests your knowledge of Splunk Enterprise system administration and Splunk data administration concepts. Splunk Education and/or Splunk Authorized Learning Partners (ALPs) offer administration courses through instructor-led training along with material, labs, and sample questions. Splunk recommends going through these training sessions. They are paid courses. However, do note that taking part in this training is optional for the admin exam. This book covers both system and data administration concepts along with self-assessment questions on each topic, for you to get ready for the exam.

A Splunk Enterprise system administrator is someone who looks after the Splunk Enterprise platform on a day-to-day basis. This exam tests your knowledge of user management, installation, the configuration of Splunk Enterprise, forwarder management, license management, search head (SH) management, index creation, indexer management, and monitoring the whole Splunk platform using the Monitoring Console (MC).

A Splunk Enterprise data administrator is responsible for getting data into Splunk from various sources, such as data inputs leveraging the universal forwarder (UF), network inputs, scripted inputs, and Technology Add-ons (TAs). The data admin ensures the data is correctly broken into individual events, timestamped, and assigned a sourcetype and other metadata fields. In addition, they can create the knowledge objects required to support other Splunk features for data insights and data retrieval using the Splunk Search Processing Language (SPL).
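
The event-breaking responsibility described above can be pictured with a small Python sketch. It is only an illustration of the idea, not how Splunk's parsing pipeline is implemented: the sample log text, the timestamp format, and the metadata values (web, app_log, web01) are all assumptions made for the example.

```python
import re
from datetime import datetime

# Hypothetical raw input: two events, the second spanning two lines.
RAW = (
    "2023-08-01 10:15:02 INFO user=alice action=login\n"
    "2023-08-01 10:15:07 ERROR payment failed\n"
    "  Traceback: timeout contacting gateway\n"
)

# Assumption: a new event starts wherever a line begins with a timestamp.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", re.MULTILINE)


def break_events(raw, index, sourcetype, host):
    """Split raw text into events, then attach a timestamp and metadata."""
    starts = [m.start() for m in EVENT_START.finditer(raw)]
    events = []
    for begin, end in zip(starts, starts[1:] + [len(raw)]):
        text = raw[begin:end].rstrip("\n")
        stamp = datetime.strptime(text[:19], "%Y-%m-%d %H:%M:%S")
        events.append({
            "_time": stamp,
            "_raw": text,
            "index": index,
            "sourcetype": sourcetype,
            "host": host,
        })
    return events


events = break_events(RAW, index="web", sourcetype="app_log", host="web01")
for event in events:
    print(event["_time"], event["sourcetype"], repr(event["_raw"]))
```

The continuation line of the stack trace stays attached to the second event because it does not start with a timestamp, which is the kind of outcome a data admin checks for when onboarding a new source.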

The following section explains how the exam questions are weighted across topics.

 

The weightage of topics in the exam

A list of topics in scope and their weightage has been provided by Splunk in its test blueprint for the admin exam. The topics might be slightly updated by Splunk in the future. At the time of writing this book, these are current and valid for the Splunk Enterprise 9.x Certified Admin exam.

Refer to the latest blueprint prior to booking your exam and find out whether any new concepts have been included. You could try accessing this blueprint using this link: https://tinyurl.com/36x7apnr. Otherwise, if the web link changes, look for the blueprint PDF deep link in the Splunk Certification Exams Study Guide (https://www.splunk.com/pdfs/training/splunk-certification-exams-study-guide.pdf) on the Splunk Enterprise Certified Admin page.

Don’t be alarmed by the length of the topic list; the topics are covered in thorough detail in the rest of this book, to get you prepared with confidence.

Now that you have an idea of the topics and their weightage, let’s understand the exam’s test pattern.

 

Introducing the exam’s test pattern

The exam contains 56 questions to be answered in 57 minutes. Each question has at most five options. Some of the questions will have more than one answer, under the Select all that apply category. Others are either true or false or single-answer.

The following are sample questions of the different categories with answers.

True or false category

Q. Splunk Enterprise is only able to store and retrieve text-based data.

  A. True
  B. False

Here, the answer is option A.

Single-answer category

Q. A UF is sending data to index=linux_os, which does not exist on the indexer layer. What happens to the data in this scenario?

  A. Since no such index has been configured, the data will be ignored by the indexer
  B. The indexer throws an error message to the UF
  C. A linux_os index is automatically created since it did not exist before
  D. The data gets stored in the lostandfound index

Here, the answer is option A.

Multiple-choice category

Q. A Splunk admin user has, by default, which capabilities? (Select all that apply)

  A. Admin can install the UF remotely
  B. Admin can create another admin user
  C. Admin can create a custom role for a group of non-admin users
  D. Admin can restart a Splunk SH instance through the GUI

Here, the answers are options B, C, and D.

Let’s get started with learning about Splunk Enterprise in the following section.

 

What is Splunk Enterprise?

Splunk Enterprise is software that collects data from heterogeneous sources and provides interfaces to analyze machine data. Knowing its capabilities helps you choose the right feature for the requirements that come up in real-world projects. As an administrator, you are expected to be well aware of these capabilities. The key features of the product are as follows:

  • Collecting text data: Splunk Enterprise can only collect and search text data. Non-textual data should not be stored in Splunk Enterprise.
  • Schemaless: Splunk accepts structured, semi-structured, and unstructured data, and no strict checking of schema compliance is needed.
  • Web, command-line interface (CLI), and REST application programming interface (API): Splunk offers three standard interfaces: the web interface for searching, reporting, alerting, and configuration management; the REST API, which exposes the web functions for programmatic access; and the Splunk CLI for executing system commands, configuring Splunk, and running searches. Splunk administrators are the main users of the CLI.
  • Searching, reporting, and alerting: Splunk Enterprise is queried with its proprietary SPL, which every interface uses to retrieve data. Searching enables data retrieval, which could be ad hoc or scheduled to run at a particular time of the day. Reporting involves a reusable search query that is stored and can be scheduled or run on demand. Finally, an alert is a scheduled search that triggers a defined set of actions when a given condition is met; an alert action could involve tasks such as sending an email or executing a script.
  • Anonymizing data: Data can contain sensitive information, such as Personally Identifiable Information (PII) and Payment Card Industry (PCI) data. For example, credit card numbers and user phone numbers are highly sensitive and are typically restricted so that only a particular group of employees can see or access them, which is a data privacy requirement. To comply with the data standards of an organization, Splunk offers the capability to mask or hide this data during indexing. This prevents users who query Splunk Enterprise from discovering this sensitive information. We will study this further in Chapter 10, Data Parsing and Transformation, specifically in the Data Anonymization section.
  • Scaling from single to distributed deployment: Splunk Enterprise is designed to accommodate various deployment sizes spanning from individual server configurations to extensive distributed setups. It excels in handling substantial data processing tasks and user support efficiently, even when dealing with data volumes in the realm of petabytes.
  • High availability (HA) and disaster recovery (DR): Clustering refers to a group of Splunk instances that work together to enable HA. Multi-site clustering refers to geographically redundant clusters working together for DR. All clustering instances share common configurations through replication.
  • Data collection mechanisms: Getting data into Splunk is a crucial stage that is a continuous process in large enterprises that comprise various data sources. Splunk provides a UF agent for file monitoring, network inputs, scripted inputs, and HTTP Event Collector (HEC) for agentless scenarios. Similarly, it provides TAs for collecting data from Linux, Windows, the cloud, CRM, and network devices, and so on. Add-ons are available on the Splunk website (https://splunkbase.splunk.com).
  • Monitoring: The MC application functions as a tool for effectively supervising the Splunk platform. It offers insights into the performance of both standalone and distributed Splunk deployments. The application includes preconfigured alerts and dashboards that can be enabled to ensure proactive monitoring of the platform's overall health and performance.
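
To make the anonymization feature in the list above more concrete, here is a minimal Python sketch of masking a card number before an event is stored. It only illustrates the concept; it is not Splunk configuration, and the regular expression, the sample event, and the mask_cards helper are assumptions made for this example.

```python
import re

# Assumption: card numbers appear as four groups of four digits,
# optionally separated by spaces or hyphens. Real card formats vary.
CARD = re.compile(r"\b(?:\d{4}[ -]?){3}(\d{4})\b")


def mask_cards(event):
    """Replace all but the last four digits of each card number."""
    return CARD.sub(lambda m: "XXXX-XXXX-XXXX-" + m.group(1), event)


masked = mask_cards("order=981 card=4111-1111-1111-1111 status=ok")
print(masked)  # order=981 card=XXXX-XXXX-XXXX-1111 status=ok
```

Because the masking is applied before storage, a later search can never return the original digits, which is the property that index-time masking is meant to provide.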

Let’s look at the newly introduced features in version 9.x of Splunk in the following section.

 

Introducing Splunk Enterprise 9.x features

Splunk Enterprise has evolved over the years and currently stands at version 9.0.3 at the time of writing this book. As it gets more advanced, some of its features become deprecated and new features are added or enhanced. Older versions often reach end of life (EOL), which means Splunk won’t offer support or fix bugs; instead, it advises upgrading to the latest version.

This section covers the important features of Splunk version 8.x that have been carried forward to the latest 9.0 product version, along with new features introduced in the 9.x version. These features are good to be aware of but are not tested in the exam. Feel free to skip this section if you want to:

  • Dashboard Studio: This provides the necessary tools to create visualizations, such as graphs, charts, and statistical tables, with colors and images. It complements the classic simple XML dashboard that existed in previous versions of Splunk but does not replace it as of version 8.2.6.
  • Federated search: This is used to search remote Splunk deployments that are outside of the local Splunk deployment. The local SH initiates search requests to a remote SH, which acts as a federated provider. The remote deployment could consist of a single SH or an SH cluster.
  • Health report: Splunk Web has a handy Health status of Splunk report that displays the health of Splunk processes in green, red, and yellow states. Selecting each process further drills down into the detailed information. The health report helps admins to get a quick understanding of the platform status, such as I/O wait, ingestion latency, data durability, search lag, disk space, and skipped searches.
  • Durable search: A scheduled report can return partial or incomplete results for a number of reasons. For example, a search peer might be busy servicing other requests and have exhausted its resources (CPU, memory, and so on), or SH-to-indexer network connectivity might be unstable. When a scheduled report needs complete results for every run, it can be enabled as a durable search: the scheduler reruns the report at a later point in time, for the same time window it was originally supposed to cover, once the necessary resources are available, and returns complete results. The features so far belong to the 8.2.x product family; the remaining items in this list are version 9.0 features.
  • SmartStore Azure Blob support: SmartStore is a Splunk concept referring to an indexer feature for storing data in remote object storage. In previous versions such as 8.2.X, SmartStore had support for Amazon Web Services (AWS) Simple Storage Service (S3) object storage and Google Cloud Platform’s (GCP’s) Google Cloud Storage (GCS). Starting from 9.0, it also has support for Azure Blob storage.
  • Ingest actions: Splunk 9.0 introduced Ingest Actions for data administrators with a new UI. It can do data masking, data filtering, and routing through rulesets. It is a cool feature, changing the way data admins traditionally write transform configurations for masking, filtering, and routing. Data could be routed to external S3 object storage and/or to an index. The new data preview mode allows uploading sample data of up to 5 GB for live testing.
  • Splunk Assist: Splunk Assist is an app built for the Splunk cloud offering. It is a fully managed service by Splunk Inc. Starting from version 9.0, the app is available for Splunk Enterprise (on-premises) customers. It provides deep insights to admins regarding Splunk deployment configuration recommendations, evaluating the security posture, making updates to Splunkbase apps, and much more.
  • Cluster Manager (CM) redundancy: In previous versions such as 8.x.x, an indexer cluster could have only a single CM. Starting with version 9.0, we can configure a second CM and run it in standby mode. The two managers run in an active/standby configuration; when the active manager goes down, the standby manager takes over so that the cluster keeps running.
  • Config tracker: A new internal index, _configtracker, has been introduced to track config files and their stanzas, including key-value pairs. This is a cool new feature that helps to troubleshoot config issues and find who, when, and what changed from an audit perspective.
  • To go through the complete list of features for previous versions of the 8.x.x family, follow this link and choose the version:

    https://docs.splunk.com/Documentation/Splunk/8.2.10/ReleaseNotes/MeetSplunk

    Similarly, a full list of 9.0.X features is available here:

    https://docs.splunk.com/Documentation/Splunk/9.0.3/ReleaseNotes/MeetSplunk

In the next section, we will learn about Splunk Enterprise components.

 

Understanding Splunk components

Splunk Enterprise has multiple integral components that work together and are primarily divided based on their functions. The list is very comprehensive. A standalone Splunk deployment doesn’t require all the components; however, a distributed and highly available deployment requires almost all of them.

A detailed understanding of standalone versus distributed deployment is covered in the following section of this chapter, Splunk Validated Architectures (SVAs). By the end of this section, you will be familiar with two types of Splunk components—namely, processing components and management components.

Processing components

The following are processing components:

  • Forwarder
  • SH
  • Indexer

Let’s understand the roles of these components in detail and their association with management components.

Forwarder

As the name suggests, this primarily forwards data from the source to the target indexer. There are two types of forwarders:

  • Universal Forwarder (UF)
  • Heavy Forwarder (HF)

UF is a software agent typically installed on the source system where data is being generated. It consists of an input configuration (that is, an inputs.conf file) with a list of absolute file paths along with metadata fields such as index and sourcetype. UF is the preferred approach to monitoring and forwarding file contents to designated indexers. By default, UF makes use of the fishbucket process to forward data for indexing exactly once and avoids data duplication through cyclic redundancy checks (CRCs) and seek pointers. You will find further information about the additional supported data inputs and detailed explanations about the fishbucket concept in Chapter 9, Configuring Splunk Data Inputs.
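
The exactly-once behavior described above can be sketched in a few lines of Python. This is a simplified model of the CRC-plus-seek-pointer idea, not the UF's actual fishbucket implementation: the CRC_LENGTH value, the in-memory checkpoints dictionary, and the read_new_data helper are assumptions for illustration, and files shorter than CRC_LENGTH are not handled.

```python
import os
import tempfile
import zlib

CRC_LENGTH = 16  # bytes of the file head used as its fingerprint (assumed)


def read_new_data(path, checkpoints):
    """Return only bytes not seen before and update the checkpoint table."""
    with open(path, "rb") as handle:
        crc = zlib.crc32(handle.read(CRC_LENGTH))
        offset = checkpoints.get(crc, 0)   # unknown CRC: start from byte 0
        handle.seek(offset)
        data = handle.read()
        checkpoints[crc] = handle.tell()   # seek pointer for the next pass
    return data


checkpoints = {}
with tempfile.TemporaryDirectory() as folder:
    log = os.path.join(folder, "access.log")
    with open(log, "wb") as handle:
        handle.write(b"GET /index.html 200\n")
    first = read_new_data(log, checkpoints)    # whole file, seen first time
    with open(log, "ab") as handle:
        handle.write(b"GET /cart 500\n")
    second = read_new_data(log, checkpoints)   # only the appended line
    third = read_new_data(log, checkpoints)    # nothing new

print(first, second, third)
```

The CRC identifies the file even if it is reopened, and the stored offset tells the reader where to resume, so already forwarded lines are not sent again.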

The following diagram illustrates UF installed on a web server configured to monitor the web server logs and forward them continuously to the indexer as and when the logs get updated:

Figure 1.1: UF forwarding web server logs to indexer

An HF is a full Splunk Enterprise instance and doesn't require a separate binary for installation. It provides an extended feature set compared to a UF: it not only collects and forwards data but also includes a Splunk user interface for configuration and management. To operate an HF, a forwarder license is required. Typically, an HF is configured in forwarding mode by disabling local data storage. Splunk add-ons available on Splunkbase can be installed on an HF to facilitate data collection from various sources. This combination of features makes HFs a versatile choice for preprocessing and forwarding data while benefiting from a user-friendly interface.

Let us now look at the SH, which is a critical user-facing processing component in a distributed deployment.

SH

The SH component is a Splunk Enterprise instance that is dedicated to search management and provides a number of interfaces for users to interact with. The popular interfaces it offers to users are web, CLI, and RESTful API.

Multiple SHs can be grouped together and form a cluster called a SH cluster (SHC). Members of an SHC share the same baseline configuration, and jobs are allocated to available members by the SH captain.

In a standalone deployment, a single Splunk Enterprise instance works as both the SH and the indexer. In a distributed deployment model, the SH or SHC can submit searches to multiple indexers and consolidate the results returned. The results are stored locally in a dispatch directory located in $SPLUNK_HOME/var/run/splunk/dispatch for later retrieval and are deleted after the job expires. $SPLUNK_HOME refers to the directory where the Splunk software is installed. For example, ad hoc search results (that is, the search job outcome) are retained in the dispatch directory for 10 minutes; once the job expires, they are removed by a process called the dispatch reaper, which runs every 30 seconds.
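
The interplay between the 10-minute retention and the 30-second reaper interval can be simulated with a short Python sketch. It models the dispatch directory as a dictionary and uses a simulated clock; the Job class and the reap function are hypothetical names for this illustration, not Splunk internals.

```python
from dataclasses import dataclass

AD_HOC_TTL = 600      # seconds an ad hoc job's results are kept (10 minutes)
REAP_INTERVAL = 30    # seconds between reaper passes


@dataclass
class Job:
    sid: str
    finished_at: int          # seconds on a simulated clock
    ttl: int = AD_HOC_TTL


def reap(dispatch, now):
    """Remove expired jobs from the dispatch table and return their ids."""
    expired = [sid for sid, job in dispatch.items()
               if now >= job.finished_at + job.ttl]
    for sid in expired:
        del dispatch[sid]
    return expired


dispatch = {
    "job_a": Job("job_a", finished_at=0),
    "job_b": Job("job_b", finished_at=45),
}
removed_at = {}
for now in range(0, 901, REAP_INTERVAL):   # one reaper pass every 30 seconds
    for sid in reap(dispatch, now):
        removed_at[sid] = now

print(removed_at)  # {'job_a': 600, 'job_b': 660}
```

In this model, job_b expires at second 645 but is only removed at the next reaper pass, second 660, so results can outlive their lifetime by up to one reaper interval.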

SH stores search-time knowledge objects that work directly on raw data and/or fields being returned from the indexer—for example, knowledge objects such as field extractions, alerts, reports, dashboards, and macros are categorized as search-time knowledge objects in Splunk.

The following diagram illustrates a distributed deployment configuration featuring a single dedicated SH that communicates with three separate indexers when executing a search query:

Figure 1.2: SH and indexers interaction
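
The interaction in Figure 1.2 follows a scatter-gather pattern, which can be sketched in Python as follows. The three in-memory indexers, their events, and the plain substring match are assumptions made for illustration; real searches use SPL and are far more involved.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical events held by three indexers (search peers).
INDEXERS = {
    "idx1": ["status=200 /home", "status=500 /cart"],
    "idx2": ["status=200 /home", "status=404 /old"],
    "idx3": ["status=500 /pay"],
}


def search_peer(name, term):
    """Each indexer filters only the events it holds."""
    return [(name, event) for event in INDEXERS[name] if term in event]


def search_head(term):
    """Send the search to every peer in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda name: search_peer(name, term),
                            sorted(INDEXERS))
        return [hit for partial in partials for hit in partial]


results = search_head("status=500")
print(results)  # [('idx1', 'status=500 /cart'), ('idx3', 'status=500 /pay')]
```

Each peer does the filtering close to its own data, and the SH only merges the partial results, which is why adding indexers lets a deployment search more data in parallel.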

Let us look at another critical processing component—the indexer, which is also called a search peer, as it responds to queries issued by the SH.

Indexer

The indexer accepts incoming data, indexes it, and stores it so that it can be retrieved later when requested by the SH. Data can arrive from forwarder agents or from inputs that do not require a dedicated agent. Indexers can be set up either as standalone instances or as a cluster for HA. Data that has been indexed is immutable and is stored in the form of buckets. More details about buckets are provided in Chapter 5, Splunk Index Management. The following diagram illustrates indexers receiving data from forwarders:

Figure 1.3: Indexers receiving data from forwarders and storing it in indexes

So far, we have gone through the processing components and their roles in a Splunk Enterprise deployment. Let us go through the management components in the following section.

Management components

These are management components that support the processing components:

  • Deployment Server (DS)
  • SHC Deployer (SHC-D)
  • Indexer CM
  • License Manager (LM)
  • MC

We’ll discuss them in the following subsections.

DS

The DS is a standalone Splunk Enterprise instance used to manage forwarders. The forwarders, which are located at the data source (typically UFs), often need new configurations to monitor new files, or changes to an existing configuration, optionally followed by a restart. Changing them manually is a very time-consuming task in larger infrastructures. That's where the DS comes to the rescue, by maintaining a central repository of configurations in the form of apps. In addition to UFs, HFs can also be centrally managed using a DS.

Chapter 4, Splunk Forwarder Management, goes through more details on this topic.

SHC-D

The SHC-D manages app configurations and deployments for an SHC in Splunk Enterprise deployment. It distributes app bundles to the SHs, applies configurations, and coordinates rolling restarts if needed.

The SHC-D usually stores all the apps at the following location: $SPLUNK_HOME/etc/shcluster/apps.

Indexer CM

An indexer cluster includes a dedicated Splunk Enterprise instance that functions as the cluster manager (CM). The CM does not take part in typical search operations; instead, it oversees the indexer cluster in the following ways:

  • Ensures the Replication Factor (RF) is met
  • Ensures the Search Factor (SF) is met
  • Deploys configurations to the cluster
  • Responds to SH requests

The Search head indexer clustering overview section of Chapter 7 will explain the RF and SF in detail.
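
As a brief preview, the RF is the number of copies of the data that the cluster maintains, and the SF is the number of those copies that are searchable. The following is a minimal Python sketch of two sanity checks that follow from those definitions. The function names are hypothetical and are not part of any Splunk API; Chapter 7 remains the authoritative treatment.

```python
def tolerated_peer_failures(replication_factor: int) -> int:
    """Peers that can fail while at least one copy of the data survives."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


def factors_are_consistent(replication_factor: int,
                           search_factor: int,
                           peer_count: int) -> bool:
    """The SF cannot exceed the RF, and the RF cannot exceed the peer count."""
    return 1 <= search_factor <= replication_factor <= peer_count
```

For example, with an RF of 3 the sketch reports that two peers can fail before a copy of the data is lost, and an SF of 3 with an RF of 2 is flagged as inconsistent.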

License manager

All components in Splunk Enterprise require a license for commercial use, except for the UF, which Splunk offers for use without a separate license. An admin loads the LM with the license file received from Splunk sales. Multiple license files might exist depending on the agreement with Splunk. The rest of the instances in the deployment, called license peers, connect to the LM. The LM acts as a central license repository for configuring stacks, pools, and license volumes. It records usage in the license_usage.log file, which tracks the usage and violations of all Splunk instances connected to the LM. Out-of-the-box license reports depend on this log. We will discuss this in detail in Chapter 2, Splunk License Management.
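
To make the idea of usage tracking concrete, here is a rough Python sketch that totals indexed bytes per license pool from a few log-style lines. The sample lines are invented for illustration, and the key=value layout (type, pool, and b for bytes) is an assumption about the log format, so verify it against a real license_usage.log before relying on it.

```python
import re
from collections import defaultdict

# Hypothetical sample lines; the field names are assumptions, not taken from the book.
SAMPLE_LINES = [
    'INFO LicenseUsage - type=Usage s="/var/log/app.log" st="app_logs" h="web01" idx="main" pool="default_pool" b=1500',
    'INFO LicenseUsage - type=Usage s="/var/log/auth.log" st="linux_secure" h="web02" idx="security" pool="security_pool" b=700',
    'INFO LicenseUsage - type=Usage s="/var/log/app.log" st="app_logs" h="web03" idx="main" pool="default_pool" b=500',
    'INFO LicenseUsage - type=RolloverSummary pool="default_pool" b=2000',
]

# Matches key=value pairs where the value is either quoted or a bare token.
PAIR = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def parse_fields(line):
    """Return the key=value pairs of one log line as a dictionary."""
    fields = {}
    for key, raw, quoted in PAIR.findall(line):
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def usage_by_pool(lines):
    """Sum the bytes of type=Usage events for each pool."""
    totals = defaultdict(int)
    for line in lines:
        fields = parse_fields(line)
        if fields.get("type") == "Usage":
            totals[fields.get("pool", "unknown")] += int(fields.get("b", 0))
    return dict(totals)
```

Calling usage_by_pool(SAMPLE_LINES) adds up only the usage events and skips the summary line, which is the kind of aggregation the out-of-the-box license reports perform over this log.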

Monitoring Console

The MC is a built-in app in Splunk that provides a centralized location for monitoring and managing Splunk deployments. It offers a GUI that allows administrators to monitor and configure various aspects of Splunk, including alerts and dashboards for monitoring indexing, license usage, search, resource usage, forwarders, health checks, and more. We will go through some of these dashboards in detail and set up alerts in later chapters.

Note

Do note that although these components have dedicated roles and activities to perform, some of them can be installed together on the same Splunk instance. A matrix of which components can be combined is provided in the docs: https://tinyurl.com/26f9n5zf.

We have come to the end of the components section. We learned that a UF is preferred for file monitoring and forwarding data to indexers. The number of components required differs depending on whether the deployment is standalone or distributed. Standalone Splunk doesn’t require many components as it functions as both an SH and indexer. A distributed deployment includes a number of additional management components for deployment, cluster management, and license management. The Splunk Enterprise binary used for all components remains the same; what differs is the configuration of each instance, which determines its role, such as SH, indexer, SHC-D, DS, or LM.

As we dive into the chapters associated with both processing and management components, we will look into these topics in more detail, and you will find them discussed a lot throughout the book. So, understanding these components and their role in Splunk Enterprise deployment is quite important to understand the rest of the sections and chapters.

 

Splunk Validated Architectures (SVAs)

This section is completely optional as this topic isn’t included in the Splunk admin exam blueprint; however, I recommend going through it to get an insight and familiarize yourself with what Splunk’s architecture looks like, as well as where the processing and management components are positioned and interconnected.

So far, we have learned about Splunk Enterprise’s features and components and their roles in a standalone or distributed deployment. It is time to see some of the deployment architectures, called SVAs, curated by the best minds at Splunk Inc.

Just as there is more than one solution to a problem, a single architecture might not fit every organization. Splunk Enterprise architects and admins have to evaluate many variables to come up with a suitable design, and SVAs offer them guidance in the form of best practices and readily available, off-the-shelf designs. A Splunk Enterprise architect’s roles and responsibilities differ from those of a typical admin. Splunk Education offers courses to prepare you to become a Splunk Enterprise-certified architect, and the Splunk Enterprise Admin certification is a prerequisite.

Let’s go through some of the prominent validated architectures of Splunk Enterprise on-premises. A full list of SVAs is available here: https://www.splunk.com/pdfs/technical-briefs/splunk-validated-architectures.pdf.

Single-server deployment

A single-server (also known as standalone) deployment consists of a Splunk Enterprise instance that combines both SH and indexer functionality.

The following diagram shows the deployment architecture:

Figure 1.4: Standalone deployment architecture


The diagram shows a standalone/single Splunk instance, a collection tier forwarding events to a single instance, and an optional DS to manage the collection tier/forwarders.

The main advantage of this deployment type is that it is cost-effective and easy to manage.

Let’s look at the limitations of this deployment type, as follows:

  • Works for a limited number of users, between four and eight
  • Data indexing size is limited to below 300 GB per day
  • Does not work effectively for critical searches
  • No high availability and disaster recovery
  • Scalability is limited by the hardware capacity of the single server, although migrating to a distributed deployment is straightforward with additional hardware

Let’s take a look at distributed non-cluster deployment, which is a more advanced setup than a single-server deployment.

Distributed non-clustered deployment

Distributed non-clustered deployment works better for additional workload and indexing capacity than a single-server deployment. The separation of SH and indexing duties increases the total cost of ownership (TCO).

The following diagram shows the non-clustered deployment architecture with separate SH and indexing tiers:

Figure 1.5: Distributed non-clustered architecture


In the depicted architecture, the search tier comprises an SHC, and the indexing tier comprises multiple standalone indexers. The SHC-D is a mandatory management component that pushes configuration updates to the SHC in the form of apps. A DS manages the forwarders, while an LM serves as the central repository for license details, with all other instances connecting to it for license information. Let’s look at the advantages of this deployment over single-server deployment:

  • It supports more users than a single-server deployment, thanks to the additional indexers
  • Independent indexers increase the daily indexing capacity to over 300 GB

Now, let’s look at the limitations of this deployment:

  • No HA and DR
  • The SH needs to be reconfigured every time a new search peer/indexer is added
  • Search results might be incomplete when one of the indexers is down, and data ingestion might be impacted as well
  • In the case of a standalone single-SH deployment scenario, there is a single point of failure (SPOF)

Let’s take a look at distributed cluster deployment, which is a more advanced setup than a distributed non-cluster deployment.

Distributed cluster deployment and SHC – single-site

A distributed clustered SH and indexer deployment at a single site is a highly available, resilient architecture. A site is a classic data center in a particular region/geography.

The following diagram shows the clustered deployment architecture with a separate SHC and clustered indexing tier running on a single site:

Figure 1.6: Distributed clustered deployment and SHC – single-site


Figure 1.6 shares similarities with Figure 1.5, as it depicts a similar architecture. However, in Figure 1.6, an additional management component, known as the CM, is introduced. The CM is responsible for overseeing and managing the indexer cluster, providing coordination and control of the cluster’s operation. It acts as a central point for configuring and monitoring the indexers within the cluster, ensuring their effective functioning and synchronization. Let’s understand the advantages of this over the two architectures we previously looked at:

  • Using an SHC avoids an outage in the case of node failure by replicating the configs and artifacts. Job scheduling is managed by the captain, a member elected from within the cluster itself.
  • The indexer cluster enables data HA by maintaining redundant copies across the cluster. The CM is a separate management component that ensures data availability for searches.
  • The SHC scales by adding more nodes compared to a single instance, allowing for increased capacity in handling concurrent users and executing searches. Typically, each CPU is considered as one unit for search capacity calculation, meaning one CPU counts as one search. This allows for better scalability and improved performance in handling larger workloads and user demands.
  • Additional management components are the SHC-D and CM to aid in the deployment of apps to cluster members.
  • The SHC learns about new search peers/indexers from the CM, so it doesn’t require reconfiguration for every new indexer node. Similarly, the Indexer Discovery feature lets forwarders discover new indexers through the CM.
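
The search capacity rule of thumb in the list above (one CPU counts as one search) can be turned into a back-of-the-envelope estimate. This is only a sketch of that simplification; the function name is hypothetical, and the real concurrency limits of a deployment depend on its Splunk configuration.

```python
def estimated_concurrent_searches(members: int, cores_per_member: int) -> int:
    """Rough SHC search capacity, assuming one CPU core handles one search."""
    if members < 1 or cores_per_member < 1:
        raise ValueError("members and cores_per_member must be positive")
    return members * cores_per_member
```

Under this assumption, a three-member SHC with 16 cores per member handles roughly 48 concurrent searches, and adding a fourth member raises the estimate to 64.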

Now, let’s look at its limitations compared to the previous architectures:

  • No DR with a single site. The failure of the site results in the failure of the entire deployment.
  • The SHC has a limitation of 100 nodes.
  • It increases the TCO and the management overhead of the SHC and indexer cluster.

Let’s take a look at the multi-site distributed clustered deployment, which is a more advanced setup than the single-site distributed clustered deployment.

Distributed clustered deployment and SHC – multi-site

This is by far the most complex architecture, suited to organizations that have strict HA and DR requirements. It has the same advantages as the single-site architecture (as seen in the previous section), with the added benefit that the failure of one site doesn’t impact the entire deployment.

The following diagram shows a clustered deployment architecture with an SHC and clustered indexing tiers deployed in more than one site:

Figure 1.7: Distributed clustered deployment and SHC – multi-site


As in Figure 1.6, the components remain the same in each site. However, the collection tier is common across both sites. Each site has a dedicated SHC.

Let’s understand the limitations:

  • Indexer clusters replicate data between sites, which is called cross-site replication and requires low network latency. 100 milliseconds or less is preferred.
  • SHCs work independently and do not share artifacts and common configurations.
  • SHCs have a 100-node limitation per site.
  • A dedicated SHC-D is required for each site.
  • A single CM node suffices for an entire cluster of indexers across sites.
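
The trade-offs across the four architectures covered in this section can be condensed into a rough decision helper. The sketch below is a deliberate simplification of the advantages and limitations listed in this section, not Splunk's official selection process; the function name is hypothetical, and the 300 GB per day threshold is the single-server figure quoted earlier.

```python
SINGLE_SERVER_DAILY_LIMIT_GB = 300  # single-server indexing limit quoted in this section


def suggest_architecture(daily_ingest_gb: float,
                         needs_ha: bool = False,
                         needs_dr: bool = False) -> str:
    """Pick the simplest architecture from this section that meets the needs."""
    if needs_dr:
        # Only the multi-site design survives the failure of a whole site.
        return "distributed clustered deployment and SHC - multi-site"
    if needs_ha:
        # Clustering provides HA, but a single site offers no DR.
        return "distributed clustered deployment and SHC - single-site"
    if daily_ingest_gb >= SINGLE_SERVER_DAILY_LIMIT_GB:
        # Independent indexers raise the daily indexing capacity.
        return "distributed non-clustered deployment"
    return "single-server deployment"
```

For instance, 50 GB per day with no HA or DR needs maps to a single-server deployment, while any DR requirement leads to the multi-site design regardless of volume. A real design exercise weighs many more variables, such as user counts and cost.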

We’ve looked at a very basic single-server architecture (preferably used for testing or development) and an advanced multi-site cluster deployment architecture. Each has its advantages, limitations, and cost implications. At this stage, you should be familiar with Splunk components and architectures. In the next section, we are going to install a standalone/single-server deployment, which we talked about at the very beginning of this section.

 

Splunk installation – standalone

As discussed in the preceding section, a single-server deployment consists of a single Splunk instance combining both SH and indexer functionality. The installation actually isn’t part of the admin exam blueprint; however, it is very helpful to get your hands dirty by experiencing Splunk yourself through the Splunk Web, configuration file (.conf), and CLI options that we are going to discuss in upcoming chapters. This section provides instructions for installing Splunk Enterprise 9.0.3 on the Windows operating system. Let's get into it.

Installation system requirements

Let’s look at the system requirements of the computing environment. Splunk Enterprise supports multiple operating system environments. A full list of the supported options is available here: https://tinyurl.com/2tuudjwr. Splunk has the following hardware requirements:

  • A 64-bit Linux or Windows distribution
  • 12 physical CPU cores or 24 vCPU @ 2 GHz or more clock speed per core
  • 12 GB random-access memory (RAM)
  • An x86 64-bit chip architecture
  • 1 Gb Ethernet network interface card (NIC)
  • Free disk space of at least 3 GB for installation and more as per indexing needs

My system specifications for where Splunk version 9.0.3 is going to be installed are as follows:

  • 64-bit Windows 11 Pro operating system
  • 6 physical CPU cores (or 12 vCPUs) @ 2.1 GHz clock speed and 16 GB RAM
  • An x86 64-bit AMD chip
  • Plenty of disk space
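
The two lists above can be compared programmatically. The following Python sketch encodes the recommended figures from this section and reports where a given machine falls short. The class and function names are hypothetical, and the 100 GB free-disk figure for the author's PC is an assumed stand-in for "plenty of disk space".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    physical_cores: int
    ram_gb: int
    free_disk_gb: int


# Recommended figures taken from the hardware requirements list in this section.
RECOMMENDED = HostSpec(physical_cores=12, ram_gb=12, free_disk_gb=3)

# The author's PC; the disk value is an assumption for illustration.
AUTHOR_PC = HostSpec(physical_cores=6, ram_gb=16, free_disk_gb=100)


def shortfalls(host: HostSpec, recommended: HostSpec = RECOMMENDED) -> list:
    """Return a description of each resource that is below the recommendation."""
    problems = []
    if host.physical_cores < recommended.physical_cores:
        problems.append(f"CPU cores: {host.physical_cores} < {recommended.physical_cores}")
    if host.ram_gb < recommended.ram_gb:
        problems.append(f"RAM (GB): {host.ram_gb} < {recommended.ram_gb}")
    if host.free_disk_gb < recommended.free_disk_gb:
        problems.append(f"Free disk (GB): {host.free_disk_gb} < {recommended.free_disk_gb}")
    return problems
```

Running shortfalls(AUTHOR_PC) reports only the CPU core count, which matches the observation that follows.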

You might have noticed the physical CPU cores in my PC are fewer than recommended, which is absolutely fine as we are not going to run production workloads on the Splunk instance. Let’s get into the installation steps, as follows.

Installation steps

As a prerequisite, you need a high-speed internet connection to download the Splunk Enterprise free software package from here: https://www.splunk.com/en_us/download.html. If you do not have a Splunk account, then sign up and log in to continue. Choose the installation package by operating system and download the latest version, which is 9.0.3 at the time of writing.

Let’s begin the installation:

  1. Download the .msi file, which appears as splunk-9.0.3-dd0128b1f8cd-x64-release.msi. Double-click on it to start the installation. You will be prompted to accept the license with the default installation options. Refer to Figure 1.8 and click the Next button:
Figure 1.8: Installation – license agreement

  2. You will be prompted to enter administrator account credentials. Enter the details. Make sure you remember them as you will need them to log in to the Splunk instance for the first time. Click the Next button (refer to Figure 1.9):
Figure 1.9: Installation – creating administrator account credentials

  3. On the next screen, just click the Install button (refer to Figure 1.10):
Figure 1.10: Installation – click Install to begin

  4. The setup wizard takes a few minutes to install Splunk Enterprise. If all goes well, a final “successfully installed” screen appears, as shown in Figure 1.11. Clicking on the Finish button will launch the browser window:
Figure 1.11: Installation successful

  5. The browser window opens the first-time login URL: http://127.0.0.1:8000. Here, 8000 is the default Splunk Web port and 127.0.0.1 is the loopback address. Enter the admin credentials created in step 2; you will then be taken to the Splunk Enterprise home page at http://127.0.0.1:8000/en-GB/app/launcher/home:
Figure 1.12: Splunk Enterprise – first-time sign-in page
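
If the browser window does not open, one quick way to confirm that Splunk Web is listening is to attempt a TCP connection to the loopback address and default port mentioned in the last step. This is a minimal Python sketch using only the standard library; the function name is hypothetical, and a successful connection only shows that something is listening on the port, not that the login page is healthy.

```python
import socket


def splunk_web_reachable(host: str = "127.0.0.1",
                         port: int = 8000,
                         timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling splunk_web_reachable() with no arguments checks 127.0.0.1 on port 8000. Pass a different port if you changed the Splunk Web port during or after installation.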

The installation is successfully completed. Now, let’s summarize what we learned in this chapter in the next section.

 

Summary

We have come to the end of the first chapter. There has definitely been a lot to digest. Let’s briefly summarize what we have learned so far.

In this chapter, we began by looking at the Splunk Certified Admin certification prerequisites, the exam topics, and their weightage. In line with the exam topics, this book is organized into two parts: Splunk Enterprise system administration and data administration. We also discussed the exam pattern, which includes single- and multiple-choice as well as true/false questions.

We looked at the fundamentals of what Splunk Enterprise does and its key highlights as a data analysis product. We then progressed to look at the Splunk Enterprise 9.x product family features, followed by components and their role in deployment.

We also looked at prominent SVAs. We covered single-server, distributed non-clustered, distributed clustered single-site, and distributed clustered multi-site architectures. We discussed their advantages and limitations, showcasing processing and management components. Finally, we successfully installed a Splunk Enterprise single instance on a Windows system.

This chapter is the foundation for the rest of the book. The Splunk components that we looked at will be detailed in further chapters. You need to know in what context they are used and how they fit into the overall Splunk deployment architecture. Though SVAs are not part of the exam guide, they are included in the book to give you a better understanding of the upcoming chapters.

In the next chapter, we are going to deep-dive into license management. License management includes types of licenses, how they work, and license configuration.

In the next section, you are going to practice exam-style questions covering the topics that we have learned so far.

 

Self-assessment

This self-assessment section is to help you better understand which sections you are good at and which need improvements out of the topics covered in the chapter. I would suggest carefully reading the questions and answers and taking your time to go back through the sections that you think need more understanding. Alternatively, you could refer to the Splunk documentation. Good luck!

You will be given 10 questions, each with a set of answer options to choose from. The question patterns are the same as discussed in the Introducing the exam’s test pattern section. At the end of this section, answers to the questions are provided. Let’s get started:

  1. Which of the following are Splunk Enterprise features? (Select all that apply)
    A. Alerting
    B. Search and reporting
    C. File monitoring
    D. Update files that are monitored as completed at the end of the file
  2. Does Splunk support the monitoring of binary files?
    A. Yes
    B. No
  3. The UF and Splunk Enterprise utilize which concept to prevent redundant data indexing?
    A. Bucket ID
    B. Fishbucket
    C. Finbucket
    D. HashBucket
  4. What role does an SH play in a Splunk deployment? (Select all that apply)
    A. Contains dashboards
    B. Consolidates results from indexers
    C. Issues search queries to indexers
    D. Stores knowledge objects such as event types and macros
  5. Indexers store data and respond to search queries. Is this statement true or false?
    A. True
    B. False
  6. Once data is stored in Splunk indexers, it cannot be directly modified by administrators, even if they wish to make changes. Is this statement true or false?
    A. True
    B. False
  7. Which of the following are management components? (Select all that apply)
    A. UF
    B. Search Head Cluster Deployer (SHC-D)
    C. Cluster manager
    D. Deployment server
  8. In a search head cluster environment, which among the following components is essential in a deployment?
    A. Monitoring console
    B. Cluster manager
    C. SHC-D
    D. Heavy forwarder
  9. What does the License Manager (LM) contain? (Select all that apply)
    A. License file
    B. License stacks and pools
    C. License binary software
    D. License reports
  10. Which configuration (.conf) file contains file monitoring details on a forwarder?
    A. outputs.conf
    B. inputs.conf
    C. server.conf
    D. source.conf

I hope you were able to recollect the topics that we went through with these questions. Let’s review the answers.

Reviewing answers

  1. Options A, B, and C are correct answers. Option D is not a Splunk feature.
  2. Option B, No. Splunk cannot monitor binary files. It is only able to monitor text data.
  3. Option B, Fishbucket, is the right answer. Splunk Enterprise and the UF use seek pointers and CRCs to track the progress of file monitoring and avoid duplicate data indexing.
  4. Options A, B, C, and D are all functions of an SH.
  5. Option A is correct (True). Indexers store raw events, index files, and other metadata files for search processing.
  6. Option A is correct (True). Once the data is written to an index, it is immutable.
  7. Options B, C, and D are the correct answers. The UF is a data collection component.
  8. Option C is the right answer. The SHC-D deploys apps to all cluster members through the captain.
  9. Options A, B, and D are the correct answers. The LM stores the license file upon adding a license. Optionally, licenses are stackable, and pools can be created on the LM. License reports are available in the MC app, available in the LM instance.
  10. Option B is the correct answer. outputs.conf and server.conf are Splunk configuration files used for different purposes. There is no source.conf file available in Splunk.
About the Author
  • Srikanth Yarlagadda

    Srikanth is a highly accomplished IT professional with a diverse range of expertise in the technology industry. Having completed his Masters in Computer Applications in 2009, he has since honed his skills in Java, Oracle SOA, and API development, gaining valuable experience along the way. With over 13 years of experience in the field, Srikanth is now a Splunk Certified Architect and was recently selected to join the esteemed cohort of SplunkTrust in 2022. He has extensive knowledge of various Splunk products, including Splunk Enterprise Security and SOAR, and he is currently dedicated to Threat Detection and Security Automation using Splunk ES & SOAR. Srikanth's impressive work history includes significant roles at major telecom companies across Norway and Pan Europe. Beyond technology, Srikanth's greatest joy is his family. Along with his wife and two children, he calls Australia home and enjoys spending time together while staying active.
