You're reading from Splunk Operational Intelligence Cookbook - Second Edition (Packt Publishing, June 2016, ISBN-13: 9781785284991).
Authors (4):
Jose E. Hernandez

Jose E. Hernandez is currently the Director of Security Solutions at Zenege Inc. and has extensive experience in security analytics. He started his professional career at Prolexic Technologies (now Akamai), fighting DDoS attacks from Anonymous and LulzSec against Fortune 100 companies. While working at Splunk Inc. as a Security Architect, he built and released an auto-mitigation framework that has been used to automatically fight attacks in large organizations. He has also helped build security operations centers and run a public threat intelligence service. Jose is originally from Miami, Florida, where he completed his Master's degree in Information Security at Nova Southeastern University. He also earned two undergraduate bachelor's degrees from Florida International University, in Management of Information Systems and Information Technologies. Although information security has been the focus of his career, Jose has found that his true passion is solving problems and creating solutions. As an example, he built an underwater remote-controlled vehicle called the SensorSub, which was used to test and measure toxicity in Miami's waterways. He can be reached at josehelps@gmail.com, on Twitter as divious_1, and on GitHub as divious1.

Josh Diakun

Josh Diakun is an IT operations and security specialist with a focus on creating data-driven operational processes. He has over 10 years of experience managing and architecting enterprise-grade IT environments. For the past 7 years, he has been architecting, deploying and developing on Splunk as the core platform for organizations to gain security and operational intelligence. Josh is a founding partner at Discovered Intelligence, a company specializing in data intelligence services and solutions. He is also a co-founder of the Splunk Toronto User Group.

Derek Mock

Derek Mock is a software developer and big data architect who specializes in IT operations, information security, and cloud technologies. He has 15 years' experience developing and operating large enterprise-grade deployments and SaaS applications. He is a founding partner at Discovered Intelligence, a company specializing in data intelligence services and solutions. For the past 6 years, he has been leveraging Splunk as the core tool to deliver key operational intelligence. Derek is based in Toronto, Canada, and is a co-founder of the Splunk Toronto User Group.

Paul R. Johnson

Paul R. Johnson has over 10 years of data intelligence experience in the areas of information security, operations, and compliance. He is a partner at Discovered Intelligence, a company specializing in data intelligence services and solutions. Paul previously worked for a Fortune 10 company, leading IT risk intelligence initiatives and managing a global Splunk deployment. Paul co-founded the Splunk Toronto User Group and lives and works in Toronto, Canada.

Indexing files and directories


File- and directory-based inputs are the most commonly used ways of getting data into Splunk, and the primary use for these inputs is to index logfiles. Almost every application or system produces a logfile, and it is generally full of data that you will want to be able to search and report on.

Splunk is able to continuously monitor for new data being written to existing files or new files being added to a directory, and it is able to index this data in real time. Depending on the type of application that creates the logfiles, you would set up Splunk to either monitor an individual file based on its location or scan an entire directory and monitor all the files that exist within it. The latter configuration is more commonly used when the logfiles being produced have unique filenames, such as filenames containing a timestamp.

This recipe will show you how to configure Splunk to continuously monitor and index the contents of a rolling logfile located on the Splunk server. The recipe specifically shows how to monitor and index a Red Hat Linux system's messages logfile (/var/log/messages). However, the same principle can be applied to a logfile on a Windows system, and a sample file is provided. Do not attempt to index the Windows event logs this way, as Splunk has specific Windows event inputs for this.

Getting ready

To step through this recipe, you will need a running Splunk Enterprise server and access to read the /var/log/messages file on Linux. No other prerequisites are required. If you are not using Linux and/or do not have access to the /var/log/messages location on your Splunk server, use the cp01_messages.log file that is provided and upload it to an accessible directory on your Splunk server.

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

How to do it…

Follow the steps in the recipe to monitor and index the contents of a file:

  1. Log in to your Splunk server.

  2. From the top right-hand corner, click on the Settings menu, and then click on the Add Data link.

  3. If you are prompted to take a quick tour, click on Skip.

  4. In the How do you want to add data? section, click on Monitor.

  5. Click on the Files & Directories section.

  6. In the File or Directory section, enter the path to the logfile (/var/log/messages or the location of the cp01_messages.log file), ensure Continuously Monitor is selected, and click on Next.

    Tip

    If you are just looking to do a one-time upload of a file, you can select Index Once instead. This can be useful for indexing a set of data that you would like to put into Splunk, either to backfill missing or incomplete data or simply to take advantage of Splunk's searching and reporting tools.

  7. Assuming that you are using the provided file or the native /var/log/messages file, the data preview will show the correct line breaking of events and timestamp recognition. Click on the Next button.

  8. A Save Source Type box will pop up. Enter linux_messages as the Name and then click on Save.

  9. On the Input Settings page, leave all of the default settings, and click Review.

  10. Review the settings and if everything is correct, click Submit.

  11. If everything was successful, you should see a File input has been created successfully message.

  12. Click on the Start searching button. The Search & Reporting app will open with the search already populated based on the settings supplied earlier in the recipe (a sketch of this search follows the tip below).

    Tip

    In this recipe, we could have simply used the common syslog source type or let Splunk choose a source type name for us; however, starting a new source type is often a better choice. The syslog format can look completely different depending on the data source. As knowledge objects, such as field extractions, are built on top of source types, using a single syslog source type for everything can make it challenging to search for the data you need.
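The populated search filters on the source and source type of the new input. A minimal sketch of what you can expect to see is as follows; the host value here is a hypothetical placeholder and will reflect your own Splunk server:

source="/var/log/messages" host="myserver" sourcetype="linux_messages"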

How it works…

When you add a new file or directory data input, you are basically adding a new configuration stanza into an inputs.conf file behind the scenes. The Splunk server can contain one or more inputs.conf files, and these files are either located in $SPLUNK_HOME/etc/system/local or in the local directory of a Splunk app.

Splunk uses the monitor input type, which is set to point to either a file or a directory. If you point the monitor at a directory, all the files within that directory will be monitored. When Splunk monitors a file, it first indexes all the data it can read, from the beginning of the file. Once complete, Splunk maintains a record of where it last read data, so that when new data is written to the file, it reads only this new data and advances the record. The process is nearly identical to using the tail command in Unix-based operating systems. If you are monitoring a directory, Splunk also provides many additional configuration options, such as blacklisting files you don't want Splunk to index.
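For example, a directory monitor that excludes compressed files might look like the following inputs.conf stanza. This is a minimal sketch rather than part of the recipe: the blacklist value shown is an illustrative assumption, and blacklist values are regular expressions matched against the full file path:

[monitor:///var/log]
sourcetype = linux_messages
blacklist = \.(gz|bz2)$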

Tip

For more information on Splunk's configuration files, visit http://docs.splunk.com/Documentation/Splunk/latest/Admin/Aboutconfigurationfiles.

There's more…

While adding inputs to monitor files and directories can be done through the web interface of Splunk, as outlined in this recipe, there are other approaches to add multiple inputs quickly. These allow for customization of the many configuration options that Splunk provides.

Adding a file or directory data input via the CLI

Instead of going via the GUI, you can add a file or directory input via the Splunk CLI (command-line interface). Navigate to your $SPLUNK_HOME/bin directory and execute the following command (replacing the file or directory to be monitored with your own):

For Unix:

./splunk add monitor /var/log/messages -sourcetype linux_messages

For Windows:

splunk add monitor c:\filelocation\cp01_messages.log -sourcetype linux_messages

There are a number of different parameters that can be passed along with the file location to monitor.
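For example, the -index parameter directs the events to a specific index rather than the default. The following is a minimal sketch, assuming the target is the default index, main:

./splunk add monitor /var/log/messages -sourcetype linux_messages -index main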

Note

See the Splunk documentation for more on data inputs using the CLI (http://docs.splunk.com/Documentation/Splunk/latest/Data/MonitorfilesanddirectoriesusingtheCLI).

Adding a file or directory input via inputs.conf

Another common method of adding the file and directory inputs is to manually add them to the inputs.conf configuration file directly. This approach is often used for large environments or when configuring Splunk forwarders to monitor for files or directories on endpoints.

Edit $SPLUNK_HOME/etc/system/local/inputs.conf and add your input. After your inputs are added, Splunk will need to be restarted to recognize these changes:

For Unix:

[monitor:///var/log/messages]
sourcetype = linux_messages

For Windows:

[monitor://c:\filelocation\cp01_messages.log]
sourcetype = linux_messages

Note

Editing inputs.conf directly is often a much faster way of adding new files or directories to monitor when several inputs are needed. When editing inputs.conf, ensure that the correct syntax is used and remember that Splunk will need a restart for modifications to take effect. Additionally, specifying the source type in inputs.conf is the best-practice way to assign source types.
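To restart Splunk from the command line, run the following from the $SPLUNK_HOME/bin directory (omit the ./ on Windows):

./splunk restart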

One-time indexing of data files via the Splunk CLI

Although you can select Upload and Index a file from the Splunk GUI to upload and index a file, there are a couple of CLI functions that can be used to perform one-time bulk loads of data.

Use the oneshot command to tell Splunk where the file is located and which parameters to use, such as the source type:

./splunk add oneshot XXXXXXX

Another way is to place the file you wish to index into the Splunk spool directory, $SPLUNK_HOME/var/spool/splunk, and then add the file using the spool command:

./splunk spool XXXXXXX
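As a concrete sketch of both approaches, assume the sample file has been copied to /tmp (a hypothetical location used purely for illustration). The oneshot command accepts parameters such as the source type; the spool workflow follows the two-step process described previously:

./splunk add oneshot /tmp/cp01_messages.log -sourcetype linux_messages

cp /tmp/cp01_messages.log $SPLUNK_HOME/var/spool/splunk/
./splunk spool $SPLUNK_HOME/var/spool/splunk/cp01_messages.log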

Tip

If you are using Windows, omit the ./ in front of the Splunk commands mentioned earlier.

Indexing the Windows event logs

Splunk comes with special inputs.conf configurations for some source types, including the Windows event logs. Typically, the Splunk Universal Forwarder (UF) would be installed on a Windows server and configured to forward the Windows events to the Splunk indexer(s). The inputs.conf configurations to monitor the Windows Application, Security, and System event logs in real time are as follows:

[WinEventLog://Application]
disabled = 0 
[WinEventLog://Security]
disabled = 0 
[WinEventLog://System]
disabled = 0 

By default, the event data will go into the main index, unless another index is specified.
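For example, to route the Security event log to a dedicated index instead of main, an index setting can be added to the stanza. This is a sketch only: the index name wineventlog is an assumption, and the index must already exist on the indexer(s):

[WinEventLog://Security]
disabled = 0
index = wineventlog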

See also

Also refer to the following recipes for more information:

  • The Getting data through network ports recipe

  • The Using scripted inputs recipe

  • The Using modular inputs recipe
