Splunk 9.x Enterprise Certified Admin Guide

Product type: Book
Published: Aug 2023
Publisher: Packt
ISBN-13: 9781803230238
Pages: 256
Edition: 1st Edition
Author: Srikanth Yarlagadda

Table of Contents

Preface
Part 1: Splunk System Administration
  Chapter 1: Getting Started with the Splunk Enterprise Certified Admin Exam
  Chapter 2: Splunk License Management
  Chapter 3: Users, Roles, and Authentication in Splunk
  Chapter 4: Splunk Forwarder Management
  Chapter 5: Splunk Index Management
  Chapter 6: Splunk Configuration Files
  Chapter 7: Exploring Distributed Search
Part 2: Splunk Data Administration
  Chapter 8: Getting Data In
  Chapter 9: Configuring Splunk Data Inputs
  Chapter 10: Data Parsing and Transformation
  Chapter 11: Field Extractions and Lookups
  Chapter 12: Self-Assessment Mock Exam
Index
Other Books You May Enjoy

Data Parsing and Transformation

The first phase of the data journey is the input phase, which we discussed in detail in Chapter 9, Configuring Splunk Data Inputs. Parsing is the second phase, after which the data is indexed on disk. This chapter deals with the parsing phase, which takes over from the input phase and ends by handing the data to the indexing phase for storage and search.

You might ask why the parsing phase is needed at all, given that the data has already been collected, the metadata fields were set during the input phase, and the data has been forwarded to the indexers. The parsing phase does several things that the input phase does not: it breaks the data stream into individual events, extracts and applies timestamps, sets metadata fields on individual events, manipulates metadata before indexing, and transforms the data if needed. During the input phase, metadata fields such as the index, host, sourcetype, source...

Parsing phase settings

Data must go through the parsing phase before it is indexed. The phases run in this order: input -> parsing -> indexing. For a refresher, refer to the Data indexing phases section of Chapter 8, which introduced these phases. Parsing preprocesses the data stream by formatting it and extracting relevant data from unstructured or semi-structured input, which makes the data searchable and actionable. The following are some of the important sub-phases that data goes through:

  • Breaking the whole data stream into individual events
  • Identifying the timestamp of an event if needed and applying it
  • Applying metadata fields such as the host, sourcetype, source, and index
  • Optionally transforming the data by re-routing, overriding metadata, masking portions of events, filtering and dropping unnecessary events, and so on

These...
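The first and third sub-phases above can be illustrated with a minimal Python sketch. It is not Splunk's implementation. It is loosely modeled on how a LINE_BREAKER-style regex behaves, where the text matched by the first capture group is discarded and the event boundary falls there. The sample stream, the `Event` class, the `break_into_events` and `parse` functions, and the metadata values are all hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical sample: the first event spans two lines (a stack-trace line follows it).
RAW_STREAM = (
    "2023-08-01 10:15:02 ERROR payment failed\n"
    "  at checkout.process(line 42)\n"
    "2023-08-01 10:15:07 INFO retry succeeded\n"
)

# Assumed boundary rule: newlines count as an event boundary only when the
# next line starts with a date. The captured newlines are discarded.
EVENT_BOUNDARY = re.compile(r"([\r\n]+)(?=\d{4}-\d{2}-\d{2} )")


@dataclass
class Event:
    raw: str
    host: str
    source: str
    sourcetype: str
    index: str


def break_into_events(stream: str) -> List[str]:
    """Split the stream at each boundary match and drop the captured delimiter."""
    events = []
    start = 0
    for match in EVENT_BOUNDARY.finditer(stream):
        events.append(stream[start:match.start(1)])
        start = match.end(1)
    tail = stream[start:].rstrip("\r\n")
    if tail:
        events.append(tail)
    return events


def parse(stream: str, **metadata: str) -> List[Event]:
    """Attach the same stream-level metadata to every individual event."""
    return [Event(raw=raw, **metadata) for raw in break_into_events(stream)]


events = parse(
    RAW_STREAM,
    host="web01",
    source="/var/log/app.log",
    sourcetype="app_log",
    index="main",
)
for event in events:
    print(repr(event.raw), event.sourcetype, event.index)
```

With this sample, the stack-trace line stays attached to the first event because the newline before it is not followed by a date, so the stream yields two events rather than three.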

Splunk Web data preview

The Splunk Web data preview, Add Data, and Upload features are useful for testing source type settings defined according to the props.conf specification. The preview applies to settings in the [<source type>] stanza.

Why is this so important? For example, say you create a new source type with line-breaking and timestamp extraction settings, either in a text editor or through the New Source Type option available on Splunk Web at Settings | Source types, and deploy it. Afterward, you realize that the indexed data is not correctly formatted. You have to troubleshoot, find the invalid setting, and then repeat the deployment until everything is right. It is tedious, isn’t it? The data preview feature comes in handy for testing the source type settings before deployment. The transforms.conf settings cannot be tested directly in the UI; however, we can pre-create them and refer to them when testing the source type settings. Remember that transforms.conf stanzas...
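The same test-before-deploy idea can be tried offline on a few sample events. The following Python sketch is only an analogy to the data preview, not a substitute for it: Python's `strptime` is not Splunk's timestamp processor, and the sample events, `PREFIX`, `TIME_FORMAT`, `LOOKAHEAD`, and `extract_timestamp` are hypothetical names chosen to resemble the prefix, format, and lookahead settings of a source type definition.

```python
import re
from datetime import datetime

# Hypothetical sample events; the last one deliberately has no timestamp.
SAMPLE_EVENTS = [
    "ts=2023-08-01 10:15:02 level=ERROR msg=failed",
    "ts=2023-08-01 10:15:07 level=INFO msg=ok",
    "level=WARN msg=missing",
]

PREFIX = re.compile(r"ts=")        # text that precedes the timestamp
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # strptime format to try
LOOKAHEAD = 19                     # characters to read after the prefix


def extract_timestamp(event: str):
    """Return the parsed datetime, or None when no usable timestamp is found."""
    match = PREFIX.search(event)
    if match is None:
        return None
    candidate = event[match.end():match.end() + LOOKAHEAD]
    try:
        return datetime.strptime(candidate, TIME_FORMAT)
    except ValueError:
        return None


results = [extract_timestamp(event) for event in SAMPLE_EVENTS]
for event, timestamp in zip(SAMPLE_EVENTS, results):
    status = "OK  " if timestamp else "FAIL"
    print(status, timestamp, "|", event)
```

Events that print FAIL point to a prefix, format, or lookahead value that needs another look before the source type is deployed.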

Summary

This chapter was purely technical and revolved around the props.conf and transforms.conf settings throughout. We began by understanding the parsing phase, which comes right after the input phase, and its significance. Of the three components, the universal forwarder (UF), the heavy forwarder (HF), and the indexer, the full parsing pipeline exists on the HF and the indexer but not on the UF; however, the UF is able to parse structured files through the INDEXED_EXTRACTIONS setting. We learned that it is mandatory to deploy parsing settings on the HF if an HF sits in front of the indexers.

Afterward, we looked at the props.conf stanzas related to sourcetype, source, and host and went through the source type definition settings for line breaking, line merging, and timestamp identification. Next, we learned about the transforms.conf stanzas that work together with props.conf. We then applied the SEDCMD setting via props.conf for data masking and the transforms.conf settings for overriding the source...
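As a recap of masking and filtering, here is a minimal Python sketch of the idea: a regex substitution hides part of a sensitive value, and a second regex decides which events are discarded before indexing. It mimics the effect only and does not use Splunk's configuration syntax. The events, both patterns, and the `transform` function are hypothetical.

```python
import re

# Hypothetical events; the card numbers are made-up test values.
EVENTS = [
    "user=alice card=4111111111111111 action=purchase",
    "DEBUG heartbeat ok",
    "user=bob card=5500000000000004 action=refund",
]

MASK_PATTERN = re.compile(r"card=\d{12}(\d{4})")  # capture only the last four digits
DROP_PATTERN = re.compile(r"^DEBUG ")             # events to discard before indexing


def transform(events):
    """Drop unwanted events, then mask sensitive values in the remaining ones."""
    kept = []
    for raw in events:
        if DROP_PATTERN.search(raw):
            continue  # filtered out, so it would never be indexed
        kept.append(MASK_PATTERN.sub(r"card=XXXXXXXXXXXX\1", raw))
    return kept


masked = transform(EVENTS)
for raw in masked:
    print(raw)
```

Filtering runs before masking in this sketch so that no substitution work is spent on events that will be thrown away.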

Self-assessment

In this section, there are 10 questions followed by answers for you to test your learning throughout this chapter. If you run into difficulty in answering any of the questions, do refer to the respective section:

  1. What is the next phase in data indexing right after the input phase?
    1. Indexing
    2. Parsing
    3. Masking
    4. Routing
  2. Identify the stanzas that can be created in props.conf. (Select all that apply.)
    1. [source]
    2. [sourcetype_name]
    3. [source::<source_name_or_pattern>]
    4. [host]
    5. [host::<hostname_or_pattern>]
  3. What are the parsing phase capabilities? (Select all that apply.)
    1. Override a source type
    2. Re-route to a different index
    3. Data masking
    4. Drop events that do not need to be indexed
  4. What is the name of the queue to be set for dropping unwanted events from indexing?
    1. sinkHole
    2. deadQueue
    3. nullQueue
    4. dropQueue
  5. The SEDCMD setting belongs to the transforms.conf file. Is this statement true or false?
    1. True
    2. False
  6. What is the limit of the file upload size in Splunk Web for the Set...