You're reading from  Splunk 9.x Enterprise Certified Admin Guide

Product type: Book
Published in: Aug 2023
Publisher: Packt
ISBN-13: 9781803230238
Edition: 1st Edition
Author (1)
Srikanth Yarlagadda

Srikanth is a highly accomplished IT professional with a diverse range of expertise in the technology industry. Having completed his Masters in Computer Applications in 2009, he has since honed his skills in Java, Oracle SOA, and API development, gaining valuable experience along the way. With over 13 years of experience in the field, Srikanth is now a Splunk Certified Architect and was recently selected to join the esteemed cohort of SplunkTrust in 2022. He has extensive knowledge of various Splunk products, including Splunk Enterprise Security and SOAR, and he is currently dedicated to Threat Detection and Security Automation using Splunk ES & SOAR. Srikanth's impressive work history includes significant roles at major telecom companies across Norway and Pan Europe. Beyond technology, Srikanth's greatest joy is his family. Along with his wife and two children, he calls Australia home and enjoys spending time together while staying active.

Configuring Splunk Data Inputs

Getting data into Splunk Enterprise is the primary responsibility of a data administrator. Splunk offers multiple ways to get data in, including a set of standard data inputs that cover a wide range of data sources. In this chapter, we will learn about these data inputs in more detail, including which inputs suit which data sources, how to create monitoring inputs, and how to adjust the configuration settings.

We’ll cover the following topics in this chapter:

  • File and directory monitoring
  • Network inputs (TCP/UDP)
  • Scripted inputs
  • HTTP Event Collector (HEC), also known as agentless data input
  • Windows inputs

We explored these data inputs briefly in Chapter 8, Getting Data In. Splunk Enterprise is built for data, it works on data, and it returns data for various business use cases. Data administrators involved in getting data into Splunk must adopt the correct approach, set metadata accurately...

File and directory monitoring

As the name implies, this input type monitors data files and the directories they’re stored in. This is the most effective way to bring in data and is recommended by Splunk as one of the best approaches to handling files. The monitoring settings are configured in the inputs.conf file. To enable this input, a Universal Forwarder (UF) agent is required on the source machine, and the same settings also work on Heavy Forwarders (HFs) and Splunk Enterprise. Let’s look at the notable features of this input type:

  • Works for all text-based files, including structured formats (XML, CSV, JSON, etc.) and gzip-compressed (.gz) files.
  • Keeps track of the files being monitored via checkpoints maintained in a fishbucket directory, under $SPLUNK_HOME/var/lib/splunk/.
  • Resumes the file and directory monitoring from the last location in the event of forwarder restarts.
  • Recursively discovers all the files in a directory, including any new files created.
  • Uncompresses...
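
As a rough illustration of what a monitoring input looks like in inputs.conf, the following sketch renders a `[monitor://...]` stanza as text. The path `/var/log/myapp`, the index `main`, and the source type `myapp_logs` are hypothetical values chosen for this example, and the helper `render_monitor_stanza` is not part of Splunk; only the stanza header format and the `disabled`, `index`, and `sourcetype` setting names come from inputs.conf itself.

```python
import configparser
import io


def render_monitor_stanza(path, index, sourcetype):
    """Render a [monitor://<path>] stanza for inputs.conf as text."""
    # Interpolation is disabled so that '%' characters in values are kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    section = f"monitor://{path}"
    parser[section] = {
        "disabled": "0",
        "index": index,
        "sourcetype": sourcetype,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical path, index, and source type, used only for illustration.
stanza = render_monitor_stanza("/var/log/myapp", "main", "myapp_logs")
print(stanza)
```

The rendered text could be placed in the inputs.conf file of an app on the forwarder; the index named in the stanza has to exist on the indexing tier for the events to be searchable.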

Handling network data input

The network data input type is available for sources that can only send data over TCP/UDP. Sources such as IoT devices, network switches, routers, and sensors rely on the layer-4 protocols TCP and UDP, and Splunk Enterprise can index the data they send. Here are some important details about this input type:

  • The UF and HF both support network input.
  • In Splunk Enterprise, the indexer instance is usually preceded or “fronted” by the UF or HF to handle the task of forwarding data for indexing. The connection from UF/HF to Splunk Enterprise must use a valid Secure Sockets Layer (SSL) certificate.
  • Transmission Control Protocol (TCP) is more reliable than User Datagram Protocol (UDP), as the latter doesn’t guarantee the delivery of network packets.
  • UDP messages in Splunk are not indexed as individual events until a timestamp is found in the data stream. This can be fixed during the parsing phase by configuring sourcetype...
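
To make the sending side concrete, here is a minimal sketch of a source that emits one timestamped event per UDP datagram. It is an illustration only: the local listener stands in for a Splunk UDP input (declared on the Splunk side with a `[udp://<port>]` stanza in inputs.conf), and the function names, message text, and ephemeral port are assumptions made for this example. Putting a timestamp at the start of every message is one simple way to give each datagram its own event time.

```python
import socket
from datetime import datetime, timezone


def build_event(message, now=None):
    """Prefix the message with an ISO 8601 timestamp."""
    now = now or datetime.now(timezone.utc)
    return f"{now.isoformat(timespec='seconds')} {message}"


def send_udp(event, host, port):
    """Send one event as a single UDP datagram; delivery is not acknowledged."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(event.encode("utf-8"), (host, port))


# Stand-in listener on an ephemeral local port (hypothetical; a real UDP
# input would listen on the port configured in inputs.conf).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
listen_port = receiver.getsockname()[1]

send_udp(build_event("link=eth0 state=up"), "127.0.0.1", listen_port)
received, _ = receiver.recvfrom(4096)
receiver.close()
print(received.decode("utf-8"))
```

Because UDP gives no delivery guarantee, the sender cannot tell whether the datagram arrived; a TCP sender would at least learn that the connection failed.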

Discussing scripted inputs

Scripted inputs are useful for indexing transient/temporary data that cannot be collected through file/directory monitoring or network inputs. Scripted inputs collect data from transient sources and then either write the collected data to a file or forward it directly to an indexer. Let’s go through some facts about this input type:

  • Scripted inputs require a UF agent or Splunk Enterprise instance (HF) to execute the scripts.
  • Data can be gathered from transient sources, such as operating system commands. The top, vmstat, netstat, and iostat commands all leverage this type of input, which is configured within the Splunk add-on for Unix and Linux available to download from http://splunkbase.splunk.com. The Windows technology add-on (TA) relies on this input type to gather Windows Active Directory (AD) logs, registry logs, WinEventLogs, and so on. Logs from remote APIs can also be pulled using scripts.
  • Popular script types are supported...
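
The following is a minimal sketch of a script that could serve as a scripted input. It assumes the script is registered with a `[script://<path to script>]` stanza and an `interval` setting in inputs.conf, and that whatever the script prints to standard output is what gets indexed. The function name `disk_usage_event` and the key names in the output are hypothetical choices for this example.

```python
import shutil
import time


def disk_usage_event(path="/"):
    """Return one key=value event line describing disk usage for the given path."""
    usage = shutil.disk_usage(path)
    timestamp = time.strftime("%Y-%m-%dT%H:%M:%S%z")
    return (
        f"{timestamp} path={path} total_bytes={usage.total} "
        f"used_bytes={usage.used} free_bytes={usage.free}"
    )


if __name__ == "__main__":
    # Each run prints a single event line to standard output.
    print(disk_usage_event())
```

Emitting key=value pairs with a leading timestamp keeps the output easy to parse at search time, which is a design choice of this sketch rather than a requirement of scripted inputs.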

Understanding HEC input

HEC is an agentless input type that doesn’t require a forwarder on the source machine. This input type is suitable for sources that are capable of sending events over HTTP(S), such as web apps (through JavaScript libraries), mobile apps, and automation scripts. HEC exposes RESTful API endpoints on the Splunk Enterprise instance to accept data for indexing. The instance could be an HF/indexer in a distributed deployment. Let’s look at the key facts about HEC input:

  • HEC is disabled by default on Splunk instances, and the user must enable it manually to start using it. HEC can be scaled by configuring it across multiple Splunk instances and optionally fronting it with a load balancer.
  • Authentication to HEC APIs is done via a token supplied in the HTTP request sent by the source. The token configuration is set up by the administrator and shared with the application/source team for them to send over the events.
  • HEC exposes two important...
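
As a sketch of what a source sends to HEC, the snippet below builds (but does not send) an HTTP request for the `/services/collector/event` endpoint, with the token passed in the `Authorization: Splunk <token>` header. The host name `splunk.example.com`, the all-zero token, the source type `myapp:json`, and the helper name `build_hec_request` are placeholders assumed for this example; port 8088 is the default HEC port and may differ in a given deployment.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype=None):
    """Build a POST request carrying one JSON event for the HEC event endpoint."""
    payload = {"event": event}
    if sourcetype:
        payload["sourcetype"] = sourcetype
    return urllib.request.Request(
        url=f"{base_url}/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; replace both with values issued by the administrator.
request = build_hec_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"action": "login", "user": "alice"},
    sourcetype="myapp:json",
)
print(request.get_method(), request.full_url)
# Sending would be done with: urllib.request.urlopen(request, timeout=10)
```

Keeping request construction separate from sending makes it easy to inspect the headers and payload before pointing the code at a real instance.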

Exploring Windows inputs

The input types that we have gone through so far are neither technology- nor OS-specific. Windows inputs, as the name suggests, work only on Windows hosts. The host requires at least a UF or Splunk Enterprise instance to collect Windows-specific logs. Windows natively stores its logs in binary format, and the Windows inputs interact with OS APIs to get these logs.

When installing a UF on Windows hosts, you are given the option to enable Windows inputs. With a Splunk Enterprise instance, Windows inputs can be configured through Splunk Web via Settings | Data Inputs and then choosing Local event log collection and Remote event log collection. Remote event log collection requires a Windows domain account for log collection from remote hosts.

In a large-scale Windows host environment with UF already configured, use a deployment server to centrally manage inputs. Create and configure inputs.conf within the app. Deploy the app to the forwarders...
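
The app layout for centrally managed inputs can be sketched as follows. This example writes an inputs.conf containing a `[WinEventLog://Security]` stanza into an app folder; a temporary directory stands in for `$SPLUNK_HOME/etc/deployment-apps` on the deployment server. The app name `org_windows_inputs`, the index `wineventlog`, and the helper `create_inputs_app` are hypothetical and used only for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical stanza: collect the Security event log into an assumed index.
INPUTS_CONF = """\
[WinEventLog://Security]
disabled = 0
index = wineventlog
"""


def create_inputs_app(deployment_apps_dir, app_name, inputs_conf):
    """Create <app_name>/local/inputs.conf under the given directory and return its path."""
    conf_path = Path(deployment_apps_dir) / app_name / "local" / "inputs.conf"
    conf_path.parent.mkdir(parents=True, exist_ok=True)
    conf_path.write_text(inputs_conf, encoding="utf-8")
    return conf_path


# A temporary directory stands in for the deployment-apps directory.
staging_dir = tempfile.mkdtemp()
conf_path = create_inputs_app(staging_dir, "org_windows_inputs", INPUTS_CONF)
print(conf_path.relative_to(staging_dir))
```

Keeping the inputs in a dedicated app means the same configuration can be assigned to many forwarders at once instead of being edited host by host.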

Summary

In this chapter, we learned a great deal about the configuration settings for various data inputs, which are very useful when getting data into Splunk. We began with the steps to install a UF and then moved on to the file and directory monitoring input, understanding how it is used to monitor files and directories recursively. We also learned about ... (the three-dot notation/ellipsis used to traverse the directories in the filesystem path recursively) and * (the wildcard notation) in a monitor file path. We saw how the fishbucket keeps track of monitored files using checksums and how it can be reset using the btprobe command.
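
The difference between the two wildcards can be sketched with regular expressions. This is an approximation for illustration, not Splunk's implementation: it assumes the ellipsis behaves like the regex `.*` (crossing directory boundaries) and the asterisk like `[^/]*` (staying within one path segment). The function name `monitor_path_to_regex` and the sample paths are hypothetical.

```python
import re


def monitor_path_to_regex(path):
    """Translate '...' and '*' in a monitor path into a compiled regular expression."""
    parts = re.split(r"(\.\.\.|\*)", path)
    pattern = ""
    for part in parts:
        if part == "...":
            pattern += ".*"      # may span any number of directory levels
        elif part == "*":
            pattern += "[^/]*"   # stays within a single path segment
        else:
            pattern += re.escape(part)
    return re.compile(f"^{pattern}$")


recursive = monitor_path_to_regex("/var/log/.../*.log")
single_level = monitor_path_to_regex("/var/log/*.log")

print(bool(recursive.match("/var/log/a/b/app.log")))
print(bool(single_level.match("/var/log/app.log")))
print(bool(single_level.match("/var/log/a/app.log")))
```

Under these assumptions, the ellipsis pattern reaches files in nested subdirectories, while the asterisk pattern only sees files directly inside `/var/log`.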

We looked into network inputs used to accept data over TCP/UDP. TCP is more reliable than UDP, although their configuration specifications are very similar. Afterward, we covered the scripted input type that executes scheduled scripts and indexes transient data. Scripted inputs are commonly utilized in numerous Splunkbase apps, necessitating adherence...

Self-assessment

  1. Select the popular data input types offered by Splunk by default. (Choose all that apply):
    1. Binary file monitoring
    2. File and directory monitoring
    3. Network data input
    4. Scripted input
    5. Network port monitoring
  2. You are about to configure a file monitoring input and observe that the directory contains five-year-old data that does not need to be indexed. Which setting is used to force the forwarder to ignore the old files?
    1. skipHistoricalFiles
    2. ignoreOldData
    3. ignoreOlderThan
    4. deletePastData
  3. A network device can send traffic over UDP on port 514. You are an admin and need to allow incoming traffic from the network device IP. What setting is appropriate for this situation?
    1. acceptOnly
    2. acceptFrom
    3. connection_host
    4. acceptIPOnly
  4. You have been given the task of configuring the monitoring of files with the names network-sys-messages.log, sys-messages.log, and syslogs.log in a directory path of /opt/var/log/syslog/ on a Linux system. They all fit into a single source type – syslog...