Splunk, whose name was inspired by the process of exploring caves, or spelunking, helps analysts, operators, programmers, and many others explore many types of data, including raw machine data from their organizations, by collecting, analyzing, and acting on it. This multinational company, cofounded by Michael Baum, Rob Das, and Erik Swan, has a core product called Splunk Enterprise. This product manages, searches, inserts, deletes, filters, and analyzes big data that is generated by machines, as well as many other types of data.
Throughout the book, we will cover the fundamental, barebones concepts of Splunk so you can learn quickly and efficiently. We leave any deep discussion of concepts to Splunk's online documentation. Where necessary, we provide links that give you the practical skills and examples you need to get started quickly. All images and exercise materials used in this book are available at http://github.com/ericksond/splunk-essentials. Instructions for Mac OS X can also be found in that GitHub repository.
With very little time, you can achieve direct results using Splunk, which you can access through a free enterprise trial license. While this license limits you to 500 MB of data ingested per day, it will allow you to quickly get up to speed with Splunk and learn the essentials of this powerful software.
The exercises in this chapter may look challenging at first, but if you follow what we've written closely, we believe you will quickly learn the fundamentals you need to use Splunk effectively. Together, we will make the most of the Trial License and give you a visible result that you can use to create valuable insights for your company (and, if you like, proudly show to your friends and coworkers).
First you will need to register for a Splunk.com account. This is the account that you will use if you decide to purchase a license later. Go ahead and do this now. From here on, the password you use for your Splunk.com account will be referred to as your Splunk.com password.
To obtain your Splunk.com account, perform the following steps:
Go to the Splunk signup page at http://www.splunk.com.
In the upper-right corner, click on My Account | Sign Up.
Enter the information requested.
Create a username and password.
You will then need to download the Splunk Enterprise software. Go to http://download.splunk.com and select the Splunk Enterprise free download. Choose your operating system, being careful to select 32- or 64-bit (whichever is appropriate in your case; most should select 64-bit, which most computers today use). For Windows, download the *.msi file. For Mac OS X, download the *.dmg file. In this book, we will work with Version 6.4.1 or later.
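The choice of installer described above can be sketched as a small lookup. This is only an illustration, assuming the operating system names that Python's `platform.system()` reports (`"Windows"`, and `"Darwin"` for Mac OS X). The helper names `installer_extension` and `is_64_bit` are hypothetical and not part of any Splunk tooling.

```python
import platform

# Mapping taken from the text: .msi for Windows, .dmg for Mac OS X.
INSTALLER_EXTENSIONS = {"Windows": ".msi", "Darwin": ".dmg"}


def installer_extension(system_name=None):
    """Return the installer file extension for an operating system name.

    system_name uses the values reported by platform.system();
    it defaults to the machine this script runs on.
    """
    name = system_name or platform.system()
    try:
        return INSTALLER_EXTENSIONS[name]
    except KeyError:
        raise ValueError(f"no installer covered in this chapter for {name!r}")


def is_64_bit():
    """Best-effort check: common 64-bit machine names end in '64'."""
    return platform.machine().endswith("64")
```

Calling `installer_extension()` with no argument reports the extension for the current machine, and `is_64_bit()` gives a rough hint about which download to pick.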
The installation is very straightforward. Follow the steps for your particular operating system, whether it be Windows or Mac OS X.
These are the instructions you need to follow to install Splunk on your Windows desktop. Take your time and do not rush the installation. Many chapters in this book will rely on these steps:
Run the installer that you downloaded.
Check the box to accept the License Agreement and then click on Customize Options as shown in the following screenshot:
Change the installation path to C:\Splunk. You will thank us later as it simplifies issuing Splunk CLI (command-line interface) commands. This is also a best practice used by modern Windows administrators. Remember to eliminate white spaces in directory names as well, as they cause complications with scripting. Click on Next to continue as seen in the following screenshot:
Install Splunk Enterprise as the Local System and then click on Next.
Leave the checkbox selected to Create Start Menu Shortcut.
Click on Install.
Wait for the installation to complete.
Click on Finish to complete the installation. It will attempt to launch Splunk for the first time in your default browser.
Launch the application the first time in your default browser. You can also manually access the Splunk web page via the
Splunk requires you to use a modern browser. It supports most versions of Google Chrome, Firefox, and newer versions of Internet Explorer. It may not support older versions of Internet Explorer.
Log in with the default username and password (admin : changeme) as indicated in the following screenshot:
The next step is to change the default administrator password, while keeping the default username. Do not skip this step. Make security an integral part of your day-to-day routine. Choose a password that will be secure:
Assuming all goes well, you will now see the default Splunk Search & Reporting dashboard:
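To see why the earlier installation step recommends a path without white spaces, consider a script that splits a command line on spaces. The sketch below is a simplified illustration. The `C:\Program Files\Splunk` path is an assumption used for contrast, and `naive_split` is a hypothetical helper.

```python
def naive_split(command_line):
    """Split a command line on whitespace, as a simple script might."""
    return command_line.split()


# Assumed path containing a space, versus the path chosen in this chapter.
spaced = naive_split(r"C:\Program Files\Splunk\bin\splunk restart")
plain = naive_split(r"C:\Splunk\bin\splunk restart")

print(spaced)  # the executable path is cut into two pieces
print(plain)   # the executable path and its argument stay intact
```

With the space-free path, the first token is always the full path to the executable, so no quoting is needed.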
You are finally ready to run your very first Splunk search query:
Go ahead and create your first Splunk search query. Click on the Search & Reporting app. You will be introduced to Splunk's very own internal index: this is Splunk's way of splunking itself (or collecting detailed information on all its underlying processes).
In the New Search input, type in the following search query (more about the Search Processing Language (SPL) in Chapter 3, Search Processing Language):
SPL> index=_internal sourcetype=splunkd
The SPL> prefix will be used as a convention in this book to indicate a Search command, as opposed to the C:\> prefix, which indicates a Windows command.
The underscore before the index name _internal means that it is a system index internally used by Splunk. Omitting the underscore will not yield any result, as internal is not a default index.
This search query will output the raw events from the metrics.log file that is stored in the _internal index. A log file keeps track of every event that takes place in the system. The _internal index keeps track of every event that occurs and makes it easily accessible.
Take a look at these raw events, as shown in the following screenshot. You will see fields listed on the left side of the screen. The important Selected Fields are host, source, and sourcetype. We will go into more detail about these later, but suffice it to say that you will frequently search on one of these, as we have done here. As you can see from the highlighted fields, we indicated that we were looking for events where sourcetype=splunkd. Underneath Selected Fields, you will see Interesting Fields. As you can tell, the purposes of many of these fields are easy to guess:
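As a mental model of what a field=value search does, the following toy sketch filters a handful of made-up events. It is not how Splunk implements search. The events, their field values, and the `search` helper are all hypothetical, chosen only to mirror the query used above.

```python
# Hypothetical events; real Splunk events carry many more fields.
EVENTS = [
    {"index": "_internal", "sourcetype": "splunkd", "source": "metrics.log"},
    {"index": "_internal", "sourcetype": "splunkd_access", "source": "splunkd_access.log"},
    {"index": "main", "sourcetype": "access_combined", "source": "access.log"},
]


def search(events, **criteria):
    """Return events whose fields equal every field=value pair in criteria."""
    return [
        event for event in events
        if all(event.get(field) == value for field, value in criteria.items())
    ]


# Mirrors: index=_internal sourcetype=splunkd
matches = search(EVENTS, index="_internal", sourcetype="splunkd")

# Dropping the underscore names an index that does not exist here.
no_underscore = search(EVENTS, index="internal", sourcetype="splunkd")
```

Here `matches` holds the single splunkd event, while `no_underscore` is empty, which parallels the note about omitting the underscore.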
It is good practice to create a custom Splunk app to isolate all the changes you make in Splunk. You may never have created an app before, but you will quickly see it is not very difficult. Here we will create a basic app called Destinations that we will use throughout this book:
Let's access the Manage Apps page. There are two ways to do this; you may either click on the Apps icon at the home page as shown in the following screenshot:
Or select Manage Apps from the app dropdown in the top navigation bar of the Search & Reporting app:
At the Manage Apps page, click on the Create app icon as shown in the following screenshot:
Finally, populate the forms with the following information to complete the app creation. When you are done, click on the Save button to create your first Splunk app:
You have just created your very first Splunk app. Notice that it now appears in the list of apps and it has a status of Enabled, meaning it is ready to be used:
We will use this bare bones app to complete the exercises in this book, but first we need to make a few important changes:
Click the Permissions link as shown in the preceding screenshot.
In the next window, under the Sharing for config file-only objects section, select All apps.
These steps will ensure that the application will be accessible to the Eventgen add-on that will be installed later in the chapter. Use the following screenshot as a guide:
Splunk permissions are always composed of three columns: Roles, Read, and Write. A role refers to certain authorizations or permissions that can be taken on by a user. Selecting Read for a particular role grants the set of users in the role permission to view the object. Selecting Write will allow the set of users to modify the object. In the preceding screenshot, everyone (all users) will have access to view the Destinations app, but only the admin (you) and a power user can modify it.
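The Roles, Read, and Write columns can be pictured as a small lookup table. The sketch below models the Destinations example from the text: everyone can read, and only admin and power can write. The data layout, the `"*"` marker for all roles, and the `is_allowed` helper are assumptions made for illustration, not Splunk's internal representation.

```python
# Permissions as described for the Destinations app in the text.
EVERYONE = "*"  # hypothetical marker meaning "all roles"
DESTINATIONS_PERMISSIONS = {
    "read": {EVERYONE},
    "write": {"admin", "power"},
}


def is_allowed(permissions, role, action):
    """Return True if the role may perform the action ("read" or "write")."""
    granted = permissions.get(action, set())
    return EVERYONE in granted or role in granted
```

For example, `is_allowed(DESTINATIONS_PERMISSIONS, "user", "write")` is false for an ordinary role (here a hypothetical role named user), while the same check with "power" succeeds.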
Machine data is the information produced by the many functions carried out by computers and other mechanical machines. If you work in an environment that is rich in machine data, you will most likely have many sources of readily-available machine inputs for Splunk. However, to facilitate learning in this book, we will use a Splunk add-on called the Splunk Eventgen to easily build real-time and randomized web log data. This is the type of data that would be produced by a web-based e-commerce company.
If you need more detailed information about Eventgen, you can follow the project's GitHub repository at https://github.com/splunk/eventgen/.
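To give a feel for what randomized web log data means, here is a minimal sketch that produces made-up log lines. It is not Eventgen's code and does not follow its output format. The line layout, the URI list (apart from /booking/confirmation, which is searched for later in this chapter), and the status codes are all assumptions.

```python
import random
from datetime import datetime, timedelta

# Hypothetical values; only /booking/confirmation appears in this chapter.
URIS = ["/home", "/destination/details", "/booking/confirmation"]
STATUSES = [200, 200, 200, 404, 500]


def generate_events(count, start, seed=None):
    """Return `count` made-up web log lines, one second apart, from `start`."""
    rng = random.Random(seed)  # a seed makes the output repeatable
    lines = []
    for offset in range(count):
        timestamp = start + timedelta(seconds=offset)
        lines.append(
            f"{timestamp.isoformat()} GET {rng.choice(URIS)} {rng.choice(STATUSES)}"
        )
    return lines


sample = generate_events(5, datetime(2016, 1, 1, 12, 0, 0), seed=1)
```

A real generator also has to keep producing events continuously with current timestamps, which is the part that makes the data look live.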
Here's an important tip. Make it a habit to always launch your command prompt in Administrator mode. This allows you to use commands that are unhindered by Windows security:
Right-click on the Windows Start menu icon and select Search. In Windows 7, you can click on the Windows icon and the search window will be directly above it. In Windows 10, there is a search bar named Cortana next to the Windows icon that you can type into. They both have the same underlying function.
In the search bar, type
In the search results, look for command.exe (Windows 7) or a command prompt (Windows 10), right-click on it, then select Run as administrator.
A Splunk add-on extends and enhances the base functionality of Splunk. Add-ons also typically enrich data from the source for easier analysis. In this section, you will install your first add-on, called Splunk Eventgen, which will help us pre-populate Splunk with real-time simulated web data:
First we need to install the Eventgen add-on. If you have Git (https://git-scm.com) installed on your machine, you may clone the entire project onto your machine with the following command:
C:\> git clone https://github.com/splunk/eventgen.git
You may also download the ZIP file from Eventgen's public repository, http://github.com/splunk/eventgen, and extract it onto your machine. The download ZIP button is in the lower-right corner of the GitHub repository page.
After extracting the ZIP file, copy the entire eventgen directory into the $SPLUNK_HOME/etc/apps/ folder. You may need to rename it from SA-EventGen if you manually downloaded the ZIP file. The trailing slashes are important. Now open an administrator command prompt and execute the following command:
C:\> xcopy eventgen c:\Splunk\etc\apps\SA-Eventgen /O /X /E /H /K
In the prompt, type D. Verify the contents of the folder using the following command:
C:\> dir c:\Splunk\etc\apps\SA-Eventgen
These are the contents of the recently copied SA-Eventgen folder as shown in the following screenshot:
Restart Splunk by selecting the Settings dropdown, and under the SYSTEM section, click on Server controls:
On the Server controls page, click on the Restart Splunk button as shown in the following screenshot. Click OK when asked to confirm the restart:
The web interface will first notify you that Splunk is restarting in the background, then it will tell you that the restart has been successful. Every time Splunk is restarted, you will be prompted to log in with your credentials. Go ahead and log in.
Go to the Manage Apps page and confirm that the SA-EventGen application is installed:
You have successfully installed a Splunk add-on.
There are several different ways to stop, start, or restart Splunk. The easiest way is to do it from the web interface, as demonstrated in the preceding section. The web interface, however, only allows you to restart your Splunk instance. It does not offer any other control options.
In Windows, you can also control Splunk through the Splunkd Service as shown in the following screenshot. The d in the service name, denoting daemon, means a background process. Note that the second service, splunkweb, is not running. Do not try to start splunkweb as it is deprecated and is only there for legacy purposes. The Splunk web application is now bundled in Splunkd Service:
The best way to control Splunk is by using the command-line interface (CLI). It may require a little effort to do it, but using the CLI is an essential skill to learn. Remember to always use command prompts in Administrator mode.
In the console or command prompt, type in the following command and hit Enter on your keyboard:
C:\> cd \Splunk\bin
cd is a command that means change directory.
While in the C:\Splunk\bin directory, issue the following command to restart Splunk:
C:\Splunk\bin> splunk restart
After issuing this command, splunkd will go through its restart process. Here are the other basic parameters that you can pass to the Splunk application to control Splunk:
splunk status: Tells you if splunkd is running or not
splunk stop: Stops splunkd and all its processes
splunk start: Starts splunkd and all its processes
splunk restart: Restarts splunkd and all its processes
Doing this in the console gives the added benefit of verbose messages. A verbose message is a message with a lot of information in it. Such messages can be useful for making sure the system is working correctly or troubleshooting any errors.
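If you later want to script these commands, one possible approach is to build the command line in code and check the action before running anything. The sketch below only constructs the argument list. The `splunk_command` helper is hypothetical, and the `C:\Splunk` default and the `splunk.exe` file name are assumptions based on the installation path chosen in this chapter.

```python
import ntpath

# The four actions listed in the text.
ACTIONS = ("status", "stop", "start", "restart")


def splunk_command(action, splunk_home=r"C:\Splunk"):
    """Build the argument list for a Splunk CLI call without running it."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # ntpath joins with Windows separators on any platform.
    executable = ntpath.join(splunk_home, "bin", "splunk.exe")
    return [executable, action]
```

The resulting list can be handed to `subprocess.run(splunk_command("restart"), check=True)` from a process running with administrator rights.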
A successful restart of splunkd has the following output (which may vary):
We are almost there. Proceed by first downloading the exercise materials that will be used in this book. Open an Administrator command prompt and make sure you are in the root of the C: drive. If you are using Git, clone the entire project with this Git command:
C:\> git clone https://github.com/ericksond/splunk-essentials.git
You can alternatively just download the ZIP file and extract it in your computer using https://github.com/ericksond/splunk-essentials/archive/master.zip.
The Eventgen configuration you will need for the exercises in this book has been packaged and is ready to go. We are not going into the details of how to configure Eventgen. If you are interested in learning more about Eventgen, visit the project page at http://github.com/splunk/eventgen.
Follow these instructions to proceed:
Extract the project ZIP file into your local machine. Open an administrator console and CD into the directory where you extracted the file.
Create a new samples directory in the Destinations Splunk app. The path of this new directory will be
C:\> mkdir c:\splunk\etc\apps\destinations\samples
Copy all the *.sample files from /labs/chapter01/eventgen of the extracted project directory into the newly created samples directory. You can also copy and paste using the GUI if you prefer it:
C:\> copy splunk-essentials\labs\chapter01\eventgen\*.sample c:\Splunk\etc\apps\destinations\samples\
Now copy the eventgen.conf file into the $SPLUNK_HOME/etc/apps/destinations/local directory. You can also copy and paste using the GUI if you prefer it:
C:\> copy splunk-essentials\labs\chapter01\eventgen\eventgen.conf c:\Splunk\etc\apps\destinations\local\
Grant the SYSTEM account full access permissions to the eventgen.conf file. This is a very important step. You can either do it using the following icacls command or change it using the Windows GUI:
C:\> icacls c:\Splunk\etc\apps\destinations\local\eventgen.conf /grant SYSTEM:F
A successful output of this command will look like this:
processed file: c:\Splunk\etc\apps\destinations\local\eventgen.conf
Successfully processed 1 files; Failed processing 0 files
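The directory creation and copy steps above could also be scripted. The following is a minimal sketch under stated assumptions: `stage_eventgen` is a hypothetical helper, and it covers only creating the folders and copying the files, not the icacls permission change.

```python
import shutil
from pathlib import Path


def stage_eventgen(project_dir, app_dir):
    """Copy Eventgen samples and eventgen.conf into a Splunk app directory.

    project_dir: root of the extracted splunk-essentials project.
    app_dir: the app folder, for example the destinations app under etc/apps.
    Returns the list of files that were written.
    """
    source = Path(project_dir) / "labs" / "chapter01" / "eventgen"
    samples = Path(app_dir) / "samples"
    local = Path(app_dir) / "local"
    # Create both target folders if they do not exist yet.
    samples.mkdir(parents=True, exist_ok=True)
    local.mkdir(parents=True, exist_ok=True)

    written = []
    for sample in sorted(source.glob("*.sample")):
        written.append(Path(shutil.copy2(sample, samples)))
    written.append(Path(shutil.copy2(source / "eventgen.conf", local)))
    return written
```

On the setup used in this chapter, the call would be `stage_eventgen(r"C:\splunk-essentials", r"C:\Splunk\etc\apps\destinations")`, run from an administrator session.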
Next we will see our Destinations app in action! Remember that we have configured it to draw events from a prototype web company. That is what we did when we set it up to work with Eventgen. Now let's look at some of our data:
After a successful restart, log back in to Splunk and proceed to your new Destinations app:
In the Search field, type this search query and select Enter:
Examine the event data that your new app is enabling to come into Splunk. You will see a lot of references to browsers, systems, and so forth: the kinds of information that make a web-based e-commerce company run.
Try changing the time range to Real-time (5 minute window) to see the data flow in before your eyes:
Congratulations! You now have real-time web log data that we can use in subsequent chapters.
Now that we have data ingested, it is time to use it in order to derive something meaningful out of it. You are still in the Destinations app, correct? We will show you the basic routine when creating new dashboards and dashboard panels.
Copy and paste the following search query in the Search Field, then hit Enter:
SPL> index=main /booking/confirmation earliest=-24h@h | timechart count span=15m
After the search results render, click on the Visualization tab. This will switch your view into visualization so you can readily see how your data will look. By default, it should already be using the Column Chart as shown in the following screenshot. If it does not, then use the screenshot as a guide on how to set it:
Now that you can see your Column Chart, it is time to save it as a dashboard. Click on Save As in the upper-right corner of the page, then select Dashboard Panel as shown in the following screenshot:
Now let's fill in the dashboard panel information, as seen in the following screenshot. Make sure to select Shared in App in the Dashboard Permissions section:
Finish up by clicking View Dashboard in the next prompt:
You have created your very first Splunk dashboard with a panel that tells you the number of confirmed bookings in the last 24 hours at 15-minute intervals. Time to show it to your boss!
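To make the 15-minute intervals concrete, the sketch below counts timestamps per interval in plain Python. It is a conceptual illustration of what the search above computes, not Splunk's implementation. The helper names are hypothetical, and unlike a real chart it leaves out intervals that contain no events.

```python
from collections import Counter
from datetime import datetime, timedelta


def floor_to_span(timestamp, span_minutes=15):
    """Round a datetime down to the start of its span (span must divide 60)."""
    minute = timestamp.minute - timestamp.minute % span_minutes
    return timestamp.replace(minute=minute, second=0, microsecond=0)


def timechart_count(timestamps, now, span_minutes=15, hours_back=24):
    """Count timestamps per span, starting `hours_back` hours before the
    top of the current hour and ending at `now`."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    earliest = top_of_hour - timedelta(hours=hours_back)
    counts = Counter(
        floor_to_span(ts, span_minutes)
        for ts in timestamps
        if earliest <= ts <= now
    )
    return dict(sorted(counts.items()))
```

Each key in the returned dictionary is the start of a 15-minute interval, and each value is the number of events that fell inside it, which corresponds to one column in the chart.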
Take that well-deserved coffee break. You now have a fully functional Splunk installation with live data. Leave Splunk running for 2 hours or so. After that, if you need to rest for a bit, you can stop Splunk to pause indexing and restart it when you're ready to proceed to the next chapters. Do you recall how to control Splunk from the command line?
C:\Splunk\bin> splunk stop
C:\Splunk\bin> splunk start
In this chapter, you learned a number of basic Splunk concepts that you need to get started with this powerful tool. You learned how to install Splunk and configure a new Splunk app. You ran a simple search to ensure that the application is functional. You then installed a Splunk add-on called Eventgen, which you used to populate dummy data into Splunk in real time. You were shown how to control Splunk using the web user interface and the command-line interface. Finally, you created your very first Splunk dashboard. Now we will go on in Chapter 2, Bringing in Data, to learn more about how to input data.