
You're reading from Advanced Splunk

Product type: Book
Published in: Jun 2016
Publisher: Packt
ISBN-13: 9781785884351
Edition: 1st
Author: Ashish Kumar Tulsiram Yadav

Ashish Kumar Tulsiram Yadav holds a BE in computers and has around four and a half years of experience in software development, data analytics, and information security, and around four years of experience in Splunk application development and administration. He has experience in creating Splunk applications and add-ons, managing Splunk deployments, machine learning using R and Python, and analytics and visualization using tools such as Tableau and QlikView. He currently works with the information security operations team, handling Splunk Enterprise security and the cyber security of the organization. He has worked as a senior software engineer at Larsen & Toubro Technology Services in the telecom, consumer electronics, and semicon unit, providing data analytics across a wide variety of domains, such as mobile devices, telecom infrastructure, embedded devices, Internet of Things (IoT), Machine to Machine (M2M), entertainment devices, and network and storage devices. He has also worked in information, network, and cyber security in his previous organization. He has experience with OMA LWM2M for device management and remote monitoring of IoT and M2M devices, and is well versed in big data and the Hadoop ecosystem. He is a passionate ethical hacker, security enthusiast, and Linux expert, with knowledge of Python, R, .NET, HTML5, CSS, and C. An avid blogger, he writes about ethical hacking and cyber security in his free time, writes reviews of the various gadgets he owns, and has participated in and won hackathons, technical paper presentations, white papers, and so on.

Chapter 10. Tweaking Splunk

We have already learned some important features of Splunk, such as creating analytics and visualizations, along with various dashboard customization techniques. Now we will learn about the various ways in which we can tweak Splunk to get the most out of it, and to do so efficiently.

In this chapter, we will cover the following topics in detail, along with examples and uses:

  • Index replication

  • Indexer auto-discovery

  • Sourcetype manager

  • Field extractor

  • Search history

  • Event pattern detection

  • Data acceleration

  • Splunk buckets

  • Search optimizations

  • Splunk health

Index replication


Splunk supports a distributed environment. But what does this actually mean, and what is the use of deploying Splunk in a distributed environment?

Splunk can be deployed in either a standalone environment or a distributed environment. Let us understand what a standalone environment, a distributed environment, and index replication are.

Standalone environment

In a standalone environment, the various components of Splunk, such as the indexer and search head, run on a single machine, which handles everything: onboarding data into Splunk, indexing the data, analytics and visualization, reporting, and so on. A standalone environment is generally used for development and testing purposes; it is not at all recommended for production deployments.

Distributed environment

In a distributed environment, various components of Splunk (the indexer, search head, and others) are deployed in clusters. Deploying in a clustered environment...

Indexer auto-discovery


Splunk 6.3 introduced a very useful and important feature for distributed environments: indexer auto-discovery. It simplifies forwarder management by automatically detecting new peer nodes in a cluster, so that load balancing is handled automatically.

Example

Let us understand the use of indexer auto-discovery with the following cluster example. The image shows forwarders sending data to peer nodes, with the peer node list and other relevant messages communicated from the cluster master to the forwarders:

The following are the uses/advantages of indexer auto-discovery:

  • There is no need to configure forwarders with the list of peer nodes in the given cluster. The master automatically informs each forwarder of the updated list of peer nodes, so when a peer node fails or new peer nodes are added to the cluster, no configuration change is required on the forwarders.

  • There is no need to know the number of peer nodes when adding or removing a forwarder...
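As a rough sketch, the forwarder side of indexer auto-discovery is configured in outputs.conf along the following lines (the master URI, port, key, and group name here are placeholders, not values from this book):

```ini
# outputs.conf on the forwarder (placeholder values)
[indexer_discovery:master1]
master_uri = https://cluster-master.example.com:8089
pass4SymmKey = changeme_shared_key

# The output group asks the master for its peer list
# instead of hard-coding server = host1:9997, host2:9997, ...
[tcpout:group1]
indexerDiscovery = master1

[tcpout]
defaultGroup = group1
```

With this in place, the forwarder polls the cluster master for the current set of peer nodes and load-balances across them automatically.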

Sourcetype manager


Sourcetype manager is another very useful feature added in Splunk 6.3, which can be used to manage the sourcetypes used for onboarding data into Splunk. It can be used to manage (create, modify, and delete) sourcetype configurations independently of getting data in, and to search within the sourcetype picker. We have already learned, in Chapter 2, Developing Application on Splunk, how to assign and configure a sourcetype while uploading data to Splunk.

Sourcetype manager lists all the sourcetypes configured in the Splunk instance, along with the inbuilt default sourcetypes. It can be accessed by navigating in the Splunk Web console to Settings | Data | Sourcetype.

Now let us learn what can be done from the sourcetype manager:

  • Create a sourcetype: In previous versions of Splunk, when sourcetype manager was not available, creating a sourcetype required first adding data to Splunk, or else manually configuring the inputs.conf file.

    Using the...
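For reference, a sourcetype created through sourcetype manager ultimately corresponds to a props.conf stanza along these lines (a minimal sketch; the sourcetype name and settings are illustrative, not from this book):

```ini
# props.conf (illustrative custom sourcetype)
[my_custom_log]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
```

Timestamp and line-breaking settings like these are exactly what the sourcetype manager UI lets you adjust interactively before saving.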

Field extractor


In Splunk, fields play a very important role in any kind of analytics and visualization. For known and properly configured data sources, Splunk automatically tries to extract fields and make them available for use. Since data comes from a wide variety of sources, many fields may not be extracted automatically. Splunk provides the rex search command to extract fields, but using it efficiently requires a good understanding of regular expressions. So Splunk also provides an easy-to-use, interactive field extractor tool via the Splunk Web interface.
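To see what regex-based field extraction does under the hood, here is a small Python sketch. It is illustrative only: the log line is made-up sample data, and in Splunk the rex command performs the equivalent work natively using the same named-capture-group idea:

```python
import re

# A hypothetical web access log line (made-up sample data)
log_line = '10.0.0.5 - - [12/Jun/2016:10:15:32] "GET /cart.do HTTP/1.1" 404'

# Named capture groups play the same role as (?<field>...) in Splunk's rex command
pattern = r'(?P<clientip>\d+\.\d+\.\d+\.\d+).*"(?P<method>\w+) (?P<uri>\S+).*" (?P<status>\d+)'

match = re.search(pattern, log_line)
fields = match.groupdict() if match else {}
print(fields)
# {'clientip': '10.0.0.5', 'method': 'GET', 'uri': '/cart.do', 'status': '404'}
```

The interactive field extractor spares you from writing such expressions by hand: you highlight sample values in the event and Splunk generates the regular expression for you.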

Accessing field extractor

Let us learn to access the field extractor to extract fields from the data, which in turn can be used to create analytics and visualizations in Splunk.

The field extractor can be accessed via the following options:

  • Splunk Web Console | Settings | Fields | Field Extractions | Open Field...

Search history


Search history is another useful feature introduced in Splunk 6.3, which can be used to view and interact with the history of searches. This feature provides the complete list of search queries executed on Splunk over time.

The search history feature can be accessed via the Splunk Web console by clicking on the Search & Reporting app | Search. This takes the user to the search summary dashboard, with the option to run search queries.

The following image shows the search summary dashboard from where the search history can be accessed:

The Search History option enables the following information on the screen:

  • The exhaustive list of search queries run on the Splunk instance along with the time of the last run

  • The Action option to directly copy the respective search query in the Search bar so as to run the search query right away

  • The Filter option to choose the list of queries shown on the basis of time defined in the time range picker or some specific word/string which...
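Alongside the Search History panel, a comparable list can be pulled from Splunk's internal audit index with a search like the following (a hedged sketch; the field names shown are those commonly found in audit events):

```spl
index=_audit action=search info=completed
| table _time, user, search
| sort - _time
```

This variant is handy for administrators, since it covers searches run by all users rather than only your own history.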

Event pattern detection


Event pattern detection is a feature in Splunk which helps speed up analytics by automatically grouping similar events to discover meaningful insights in the given machine data. It helps users quickly discover relationships, patterns, and anomalies in the data, and build meaningful analytics on top of it.

In simpler terms, event pattern detection not only helps to find out the common patterns in the data but also highlights those events which are rare and could be anomalies. The event pattern detection feature of Splunk can be helpful in the following ways:

  • Automatically discover meaningful patterns in a given dataset

  • Explore data without needing to know in advance what to search for

  • Detect anomalies, rare events, and so on
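The pattern detection UI is interactive, but a related grouping of similar events can be sketched in SPL with the cluster command (illustrative; the index name is a placeholder):

```spl
index=main
| cluster showcount=true
| table cluster_count, _raw
| sort - cluster_count
```

Events with a high cluster_count represent common patterns, while those with a count of one or two are the rare events that may deserve attention as potential anomalies.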

The following image shows a sample of data events queried in Splunk. The sample data consists mostly of numbers, and if little domain information is available about the data, it would be difficult to gain insight from it:

Now we will see...

Data acceleration


Splunk is a big data tool, so the reports and dashboards created on Splunk typically run over large datasets/events. Data acceleration is therefore very necessary to get real-time analytics and visualizations.

Need for data acceleration

Let's understand the need for data acceleration in reports and dashboards with the help of the following image, an example screenshot of a dashboard with many panels and thus many searches. When many searches run concurrently in a report/dashboard, it takes time to show the analytics or visualizations. Thus, for real-time analytics, data acceleration is required:

Splunk is a very powerful big data tool, so why does it take time to populate results on a dashboard/report? Why some searches complete quickly while others take much longer can be explained with the help of the following facts:

  • Splunk is very fast at finding a keyword or set of keywords...
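As one example of what acceleration buys you, the tstats command reads pre-summarized index data rather than scanning raw events, which is why queries like the following return almost instantly (a sketch; the index name is a placeholder):

```spl
| tstats count where index=main by sourcetype
```

A panel built on a search like this behaves very differently from one that must scan every raw event in the time range.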

Splunk buckets


Splunk Enterprise stores its index data in buckets, organized by age. A bucket is basically a directory containing events from a specific period. Several buckets can exist at the same time, in the various stages of the bucket life cycle.

A bucket moves from one stage to another depending upon its age, size, and so on, as per the defined conditions. The Splunk bucket stages are Hot, Warm, Cold, Frozen, and Thawed. Splunk buckets play a very important role in the performance of search results and hence they should be properly configured as per the requirements.

The following image shows the life cycle of Splunk buckets:

Let us understand the Splunk bucket life cycle, taking the above image as a reference. The indexes.conf file can be modified to configure the aging and the conditions for moving from one stage to another:

  • Hot bucket: Whenever any new data gets indexed on Splunk Enterprise, it is stored in a hot bucket. There can be more than one hot bucket for each index. The data...
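The bucket life cycle is tuned per index in indexes.conf; the following is a minimal sketch with illustrative values (the index name, paths, and numbers are placeholders, not recommendations from this book):

```ini
# indexes.conf (illustrative values for one index)
[my_index]
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
maxHotBuckets = 3
maxDataSize = auto
# After ~180 days, buckets roll from cold to frozen
# (frozen data is deleted unless an archive path is configured)
frozenTimePeriodInSecs = 15552000
```

Settings like these control when a bucket rolls from hot to warm, warm to cold, and cold to frozen, which directly affects both disk usage and search performance.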

Search optimizations


We have already learned data acceleration and the bucket life cycle in the preceding section. Let us now see how we can make the best use of search queries for better and more efficient results. Splunk search queries can be optimized depending upon the requirements and conditions. Generally, the search queries which need to be optimized are those which are used most frequently. Let us learn a few tricks to optimize the search for faster results.

Time range

We have already learned about Splunk buckets, which organize events based on time. The shorter the time span of a search, the fewer buckets need to be accessed to produce the search result. Yet it is a common practice to use All time in the time range picker for every search, irrespective of whether the result is required for the entire duration or only a limited one.

So one of the best search optimization methods is to use the time range picker to specify the time domain on which the search should run to get...
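Putting this into practice, an optimized search names the index, restricts the time range, and filters as early as possible (a sketch; the index, sourcetype, and field values are placeholders):

```spl
index=web sourcetype=access_combined earliest=-24h@h latest=now status=404
| stats count by uri
```

Because the time range is limited to the last 24 hours, only the buckets covering that window are touched, and the status filter discards unneeded events before any statistics are computed.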

Splunk health


It is very important to keep track of Splunk's health status. Splunk Enterprise continuously logs various pieces of important information which can be helpful at various stages of Splunk usage. These logs, together with Splunk Enterprise itself, can be used to keep track of Splunk's health and various other important measures. The Splunk logs are useful in troubleshooting, system maintenance and tuning, and so on.

The following activities can be tracked by using Splunk's inbuilt logging mechanism:

  • Resource utilization and Splunk license usage

  • Data indexing, searching, analytics-related information, warnings, and errors

  • User activities and application usage information

  • Splunk component performance-related information

Splunk logs come from a wide variety of sources, such as the audit log, kvstore log, conf log, crash log, license log, splunkd log, and many more.

splunkd log

Of all the sources for Splunk logs, one of the most important and useful logs is splunkd.log. This log...
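Since Splunk indexes its own logs into the _internal index, splunkd.log can itself be searched, for example to surface recent errors grouped by component (a hedged sketch using commonly available splunkd log fields):

```spl
index=_internal sourcetype=splunkd log_level=ERROR earliest=-24h
| stats count by component
| sort - count
```

A search like this is a quick first stop when diagnosing indexing, forwarding, or licensing problems on an instance.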

Summary


In this chapter, we have read about various features of Splunk which can be used for better, more efficient, and faster analytics. We have learned about various tools such as sourcetype manager, field extractor, and event pattern detection. We also had a look at data acceleration, efficient search queries, and various other important tweaks of Splunk Enterprise. In the next chapter, we will learn about enterprise integration of Splunk with various other analytics and visualization tools.

