
You're reading from  Splunk 9.x Enterprise Certified Admin Guide

Product type: Book
Published in: Aug 2023
Publisher: Packt
ISBN-13: 9781803230238
Edition: 1st Edition
Author (1)
Srikanth Yarlagadda

Srikanth is a highly accomplished IT professional with a diverse range of expertise in the technology industry. Having completed his Masters in Computer Applications in 2009, he has since honed his skills in Java, Oracle SOA, and API development, gaining valuable experience along the way. With over 13 years of experience in the field, Srikanth is now a Splunk Certified Architect and was recently selected to join the esteemed cohort of SplunkTrust in 2022. He has extensive knowledge of various Splunk products, including Splunk Enterprise Security and SOAR, and he is currently dedicated to Threat Detection and Security Automation using Splunk ES & SOAR. Srikanth's impressive work history includes significant roles at major telecom companies across Norway and Pan Europe. Beyond technology, Srikanth's greatest joy is his family. Along with his wife and two children, he calls Australia home and enjoys spending time together while staying active.

Field Extractions and Lookups

Great! You have reached the last chapter in Part 2, Data Administration. So far in Part 2, we have learned about getting data into Splunk, adding inputs, and the parsing-phase settings, and we have followed the phases that data passes through before it is written to disk. What we haven’t seen so far is the search phase, which is what all the earlier work builds toward: first in system admin (setting up Splunk) and then in the data admin part (tidying up the data and storing it in indexers).

After all, once everything is set up correctly and the data is indexed properly, the real business outcome comes from the users who search that data. For example, if we have indexed sales, API, and system logs into Splunk, the respective users could use the data to generate a monthly sales report and send it via email, or to alert when an API isn’t available for service and/or security teams have found a vulnerability of a certain...

Understanding fields and lookups

In this section, we will look at fields and lookups in detail. Let’s start with fields.

Fields

Fields in Splunk tell a story about data that can be used to search and derive the required outcomes, such as reports, alerts, and dashboards. Raw data in Splunk is indexed as individual events according to its source type definition. Fields are names that data administrators give to specific portions of the data, extracted from the raw data at search time or index time. By default, Splunk also assigns predefined fields to the data, such as host, source, sourcetype, splunk_server, _time, and so on.

For example, take call record logs, which contain phone numbers from exchanged calls at a particular point in time. Let’s name the fields in the log data time_of_call, caller, callee, and duration.

You could build a report such as the number of calls per day and longest duration, and alert on numbers that are calling out to...
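To make the reporting idea concrete, here is a minimal Python sketch of what "calls per day" and "longest duration" mean once the four fields are available. It is only an analogy for what a search would compute, not Splunk code: the field names come from the text, while the sample values and the helper names `calls_per_day` and `longest_call` are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical, already-extracted events. The field names come from the text;
# the values are invented for illustration.
events = [
    {"time_of_call": "2023-08-01T09:15:00", "caller": "+4790000001",
     "callee": "+61400000001", "duration": 120},
    {"time_of_call": "2023-08-01T17:40:00", "caller": "+4790000002",
     "callee": "+4790000001", "duration": 45},
    {"time_of_call": "2023-08-02T08:05:00", "caller": "+61400000001",
     "callee": "+4790000002", "duration": 300},
]


def calls_per_day(events):
    """Count events per calendar day, using the date part of time_of_call."""
    return Counter(event["time_of_call"][:10] for event in events)


def longest_call(events):
    """Return the event with the largest duration value."""
    return max(events, key=lambda event: event["duration"])


print(dict(calls_per_day(events)))
print(longest_call(events)["duration"])
```

Both results depend only on the named fields, which is why agreeing on field names up front matters more than the exact layout of the raw log line.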

Creating search-time field extractions

The SH plays a crucial role in performing search-time extractions. When a user issues a search request, the SH executes the search and carries out search-time field extractions as part of the process.

The SH applies the defined extraction rules or patterns to the raw data during the search execution. It dynamically extracts the relevant fields and associated values from the events based on these rules. The extracted fields are then used for further analysis, filtering, visualization, or other operations.
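The idea of applying an extraction rule while the search runs can be sketched in a few lines of Python. This is an analogy under stated assumptions, not how the SH is implemented: the raw event layout, the pattern `CALL_PATTERN`, and the helpers `extract_fields` and `search` are all hypothetical. The point it illustrates is that fields are derived on the fly and the raw event is left unchanged.

```python
import re

# Hypothetical raw call-record events; this layout is an assumption.
RAW_EVENTS = [
    "2023-08-01T09:15:00 call from +4790000001 to +61400000001 lasted 120s",
    "2023-08-01T17:40:00 call from +4790000002 to +4790000001 lasted 45s",
]

# Named groups play the role of field names in an extraction rule.
CALL_PATTERN = re.compile(
    r"^(?P<time_of_call>\S+) call from (?P<caller>\+\d+) to (?P<callee>\+\d+) "
    r"lasted (?P<duration>\d+)s$"
)


def extract_fields(raw):
    """Return the fields found in one raw event, or an empty dict if no match."""
    match = CALL_PATTERN.match(raw)
    return match.groupdict() if match else {}


def search(raw_events):
    """Pair every raw event with its extracted fields; the raw text is untouched."""
    return [{"_raw": raw, **extract_fields(raw)} for raw in raw_events]


for result in search(RAW_EVENTS):
    print(result["caller"], result["duration"])
```

Because the rule is applied at read time, changing the pattern changes the fields for every past event as well, with no need to re-ingest the data.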

By performing search-time extractions, the SH allows users to retrieve structured and meaningful information from the raw data without modifying the underlying data source. In addition, the following are other ways in which Splunk automatically extracts fields/values during search time without user involvement:

  • Name=value pairs (also called key-value pairs) in raw data are auto-extracted
  • JSON structured data is auto-extracted
...
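As a rough illustration of the two automatic cases listed above, the following Python sketch pulls fields out of a key-value event and a JSON event without any user-defined rule. It is a simplified analogy, not Splunk's actual extraction logic; the function `auto_extract`, the pattern `KV_PATTERN`, and the sample events are assumptions made up for this example.

```python
import json
import re

# Matches key=value where the value is double-quoted or a run of non-space characters.
KV_PATTERN = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def auto_extract(raw):
    """Try to read the event as a JSON object, then fall back to key=value pairs."""
    try:
        parsed = json.loads(raw)
        if isinstance(parsed, dict):
            return parsed
    except json.JSONDecodeError:
        pass

    fields = {}
    for key, value, quoted in KV_PATTERN.findall(raw):
        # Use the text inside the quotes when the value was quoted.
        fields[key] = quoted if value.startswith('"') else value
    return fields


print(auto_extract('caller=+4790000001 status="call dropped" duration=3'))
print(auto_extract('{"caller": "+4790000001", "duration": 120}'))
```

Note that in this sketch the key-value path returns every value as a string, while the JSON path keeps the types found in the document.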

Creating index-time field extractions

Indexed extractions, also referred to as index-time field extractions, extract specific fields from raw data during the parsing phase of the data ingestion journey. Data administrators define which fields are to be extracted. The extracted fields are then stored persistently in the designated index, so they remain available for subsequent analysis and retrieval.

If you recall from Chapter 8, in the Data indexing phases section, we learned about input, parsing, and indexing.

There is a special case for structured data: when INDEXED_EXTRACTIONS is set in props.conf and deployed to a Universal Forwarder (UF), the fields are extracted at input time and stored in the index. In this case, the data doesn’t go through the parsing phase on the indexer; it skips it and goes directly to the indexing phase. Let’s look at the important facts...

Creating lookups

We explored what a lookup is and some of its types in the Understanding fields and lookups section. Lookups in Splunk are crucial for enriching and correlating data, enabling efficient analysis and advanced search capabilities. In this section, we are going to create CSV and KV Store lookups using Splunk Web.

As an example, we will use a lookup that maps country codes to country names. If you recall the callrecords sample from the previous section, the data contains phone numbers with country codes, but the callrecords events alone don’t tell us which country a phone number originates from. To obtain the country name from the country code, knowledge managers or data administrators create a lookup. The lookup can then be used in Splunk queries to match country codes and retrieve the corresponding country names.
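Before building the lookup in Splunk Web, it may help to see the enrichment idea in miniature. The Python sketch below is an analogy only, not Splunk's lookup mechanism: the column names `country_code` and `country_name`, the two sample rows, the output field `caller_country`, and the helper functions are all hypothetical and are not the contents of the book's lookup file.

```python
import csv
import io

# Hypothetical lookup contents; column names and rows are assumptions.
LOOKUP_CSV = """country_code,country_name
47,Norway
61,Australia
"""


def load_lookup(text):
    """Read CSV text with a header row into a code-to-name dictionary."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["country_code"]: row["country_name"] for row in reader}


def country_code_of(number, lookup):
    """Find the longest leading digit prefix (up to 3 digits) present in the lookup."""
    digits = number.lstrip("+")
    for length in (3, 2, 1):
        if digits[:length] in lookup:
            return digits[:length]
    return None


def enrich(event, lookup):
    """Return a copy of the event with caller_country added when a match exists."""
    enriched = dict(event)
    code = country_code_of(event["caller"], lookup)
    if code is not None:
        enriched["caller_country"] = lookup[code]
    return enriched


lookup = load_lookup(LOOKUP_CSV)
print(enrich({"caller": "+4790000001"}, lookup))
```

When no code matches, this sketch simply leaves the new field off the event rather than inventing a value.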

Save the following contents in a file as phone_no_country_code_to_name.csv on your local system...

Summary

This chapter combined two related topics: fields and lookups. We began with fields, a type of knowledge object, and understood that they are the building blocks of search (SPL) and that they can be extracted from raw data at search time, while results are being prepared for a user’s search request. The SH extracts fields and values on the fly, based on extraction settings that data admins have created on the SH in advance. By default, Splunk also extracts fields from data formats such as KV pairs and JSON content at search time. Then, we introduced lookup types and their purpose in data enrichment use cases.

Another approach is to define fields through indexed extraction settings at indexing time; these are called index-time fields or indexed fields. Fields created at indexing time are stored permanently in the designated index, so they consume additional storage space, and using them is less preferred than using search-time fields. The reason...

Self-assessment

As in other chapters, this section is about testing your knowledge of the topics that we have covered so far. There are 10 questions, and the answers are given after the questions. As always, feel free to review the respective sections again if you have difficulty answering a question. Let’s begin:

  1. Are field names case-sensitive?
    1. Yes
    2. No
  2. Does Splunk auto-extract fields/values from the key-value format in search time?
    1. Yes
    2. No
  3. Select the types of lookup files available in Splunk (select all that apply):
    1. JSON lookups
    2. CSV lookups
    3. KV Store lookups
    4. Geospatial lookups
  4. Which statements are true about CSV lookups? Select all that apply:
    1. CSV lookups store information in MongoDB
    2. CSV lookup files have the .csv extension
    3. CSV lookup files are used to enrich information by correlating with existing events in Splunk
    4. CSV lookups contain information in comma-separated values with a header
  5. What are the two types of Splunk Web methods available...