You're reading from Splunk Essentials - Second Edition

Product type: Book
Published in: Sep 2016
ISBN-13: 9781785889462
Edition: 2nd Edition

Authors (3):

Betsy Page Sigman

Betsy Page Sigman is a distinguished professor at the McDonough School of Business at Georgetown University in Washington, D.C. She has taught courses in statistics, project management, databases, and electronic commerce for the last 16 years, and has been recognized with awards for teaching and service. She has also worked at George Mason University in the past. Her recent publications include a Harvard Business case study and a Harvard Business review article. Additionally, she is a frequent media commentator on technological issues and big data.

Somesh Soni

Somesh Soni is a Splunk Consultant with over 11 years of IT experience. He has a bachelor's degree in Computer Science (Hons.) and has been interested in exploring and learning new technologies throughout his life. He has extensive experience in consulting, architecture, administration, and development in Splunk, and is proficient in various programming languages and tools, including C#.NET/VB.NET, SSIS, and SQL Server. Somesh is currently working as a Splunk Master with Randstad Technologies, focusing on consulting, implementation, administration, architecture, and support activities for Splunk. He started his career with one of the top three Indian IT giants and has executed projects for major Fortune 500 companies such as Coca-Cola, Wells Fargo, Microsoft, and Capital Group. He has served in various capacities, including Technical Architect, Technical Lead, Onsite Coordinator, and Technology Analyst. Somesh has been a great contributor to the Splunk community and has consistently been at the top of the list: he is a member of Splunk Trust 2015-16 and one of the top contributors to the Splunk Answers community. Acknowledgement: I would like to thank my family and colleagues, who have always encouraged and supported me in following my dreams, and my friends, who put up with all my crazy antics while I went on a Splunk exploratory journey and listened with patience to all the tips and tricks of Splunk that I shared with them. Last but not least, I would like to express my gratitude to the entire team at Packt Publishing Ltd for giving me this opportunity.

Erickson Delgado

Erickson Delgado is an enterprise architect who loves to mine and analyze data. He began using Splunk in version 4.0 and has pioneered the use of the application in his current work. In the earlier parts of his career, he worked with start-up companies in the Philippines to help build their open source infrastructure. He then worked in the cruise industry as a shipboard IT manager, and he loved it. From there, he was recruited to work at the company's headquarters as a software engineer.

Chapter 9. Best Practices and Advanced Queries

As we bring this book to a close, we want to leave you with a few extra skills in your Splunk toolkit. Throughout the book, you have gained the essential skills required to use Splunk effectively. In this chapter, we will look at some best practices that you can incorporate into your daily Splunk work. These include the following:

  • Temporary indexes and oneshot indexing

  • Searching within an index

  • Searching within a limited time frame

  • How to do quick searches via fast mode

  • How to use event sampling

  • Using the universal forwarder

We will also list some advanced SPL queries that you can use as templates when the need arises. These include:

  • Doing a subsearch, or a search within a search

  • Using append and join

  • Using eval with if

  • Using eval with match
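As a quick preview of those queries, here are one-line sketches in SPL (all index names, field names, and values here are illustrative, not taken from a real dataset):

index=web_logs | append [ search index=app_logs ]
index=web_logs | join host [ search index=perf_logs | stats avg(cpu) AS avg_cpu by host ]
... | eval level=if(status>=500, "error", "ok")
... | eval environment=if(match(host, "^prod"), "production", "non-production")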

Throughout this book, we have seen how logs can be used to improve applications and to troubleshoot problems. Since logs are such an important component of using data with Splunk, we end the chapter with a few basics...

Temporary indexes and oneshot indexing


When you need to index new data and you are unfamiliar with its format, it is always a best practice to use a temporary index. You should begin by creating a temporary index just for this purpose. Once you have this temporary index, you can use a Splunk command to add the file once. This process is called oneshot indexing. This is crucial when you know you have to transform the data prior to indexing, for instance when using props.conf and transforms.conf. A nice feature of oneshot indexing is that there is no need for any kind of configuration before uploading.

Here is how you perform oneshot indexing using the CLI:

C:\> c:\splunk\bin\splunk add oneshot TestFile.log -index TempIndex -sourcetype TempSourceType
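If you have not yet created the temporary index, it can be added from the same CLI beforehand (a sketch: TempIndex is the index name used above, and the Windows install path is assumed to match your environment):

C:\> c:\splunk\bin\splunk add index TempIndex

Once you have finished experimenting, the temporary index can simply be removed without affecting your production indexes.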

You can also do this from the UI by going to Settings | Data inputs | Files and Directories | Add new. Then browse for the file and click on Index Once.

These methods will only work when Splunk is running. It will warn you if it is...

Searching within an index


Always remember to filter your searches by index. By doing so, you can dramatically speed up your searches. If you don't restrict your search to a specific index, it means Splunk has to go through all available indexes and execute the search against them, thus consuming unnecessary time.

When designing your Splunk implementation, the partitioning of indexes is also crucial. Careful thought needs to be given to planning the indexes and their partitioning. In my experience, it is best to create an index for every type of source included in your incoming data.

For example, all web server logs for the same application should be placed in one index. You may then split the log types by source type, but keep them within the same index. This will give you a generally favorable search speed even if you have to search between two different source types.
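For instance, a search scoped to a single index and source type might look like this (the index and source type names are illustrative, following the scheme just described):

index=app1 sourcetype=Logs.Error "timeout"
| stats count by host

Because Splunk only has to read the buckets of the app1 index, this avoids scanning every other index in the deployment.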

Here are some examples (collected in the Index name and Source type table at the end of this excerpt):

Search within a limited time frame


By default, the Search and Reporting app's time range is set to All Time. Searches done using this time frame will have a negative performance impact on your Splunk instance. This is heightened when there are concurrent users doing the same thing. Although you can train your users to always select a limited time range, not everybody will remember to do this.

The solution for this problem is fairly simple. You can simply change the default time range for the drop-down menu. We will do this by modifying the ui-prefs.conf file in an administrative command prompt.

Go ahead and execute the following command:

C:\> notepad c:\Splunk\etc\system\local\ui-prefs.conf

Copy and paste the following into the file:

[search]
dispatch.earliest_time = -4h
dispatch.latest_time = now

[default]
dispatch.earliest_time = -4h
dispatch.latest_time = now

Save the file and restart Splunk. Go back to the Search and Reporting app and...

Quick searches via fast mode


There are three search modes available in Splunk: Fast Mode, Smart Mode, and Verbose Mode:

  • Fast Mode: If you want your searches to be faster, use Fast Mode. Unlike the default Smart Mode, it does not attempt to discover fields at search time, which makes it a good choice when you already know which fields you are looking for.

  • Smart Mode: This is the default. Smart Mode looks for transforming commands in your searches. If it finds them, it acts like Fast Mode; if it doesn't, it acts like Verbose Mode.

  • Verbose Mode: The search provides as much information as possible, even though this may result in significantly slower searches.

Using event sampling


New in version 6.4 is event sampling. Just as a single drop of blood is enough to test sugar and sodium levels, you often need only a small portion of your dataset to draw conclusions about the whole. Event sampling is a particularly useful addition to the Splunk toolset because there is often a great deal of data available, when what you really need is to take quick measurements from it.

Event sampling uses a sample ratio value that reduces the number of results. If a typical search returns 1,000 events, a 1:10 event sampling ratio will return 100 of them. These ratios can significantly cut the amount of data a search has to retrieve, and can range from a fairly large ratio (which can be set using the Custom setting) to one as small as 1:100,000 (or even smaller, again using the Custom setting).
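As a sketch of how to reason about sampled results (the index, source type, and ratio here are illustrative): with a 1:10 sample ratio selected, you can estimate the true total by scaling the sampled count back up:

index=app1 sourcetype=Logs.Error
| stats count AS sampled_count
| eval estimated_total=sampled_count*10

The result is a statistical estimate, which is precisely why sampling should not be used where exact counts matter.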

This is not suitable for saved searches for which you need accurate counts. This...

Splunk Universal Forwarders


Although detailed descriptions of Splunk Universal Forwarders are beyond the scope of this book, it is worth mentioning that in large-scale Splunk implementations, data gathering should, as much as possible, be done with them. Their usefulness lies in the fact that they are lightweight applications that run on many different operating systems and can quickly and easily forward data to the Splunk indexer.

Throughout this book, we have indexed files locally on your machine. In production environments, with many different types of deployment and using many different machines, each machine where data resides will have a Universal Forwarder.

When the implementation is large and includes many different machines, Universal Forwarders can be managed centrally through Splunk's Forwarder Management interface.

These forwarders, and the ability to manage them easily, are among the reasons for Splunk's growing popularity. Sizeable organizations find it much easier to be able to bring in, understand, and use...

Advanced queries


There are various kinds of advanced queries that you may want to consider as you plan how to create searches and dashboards for your data. Consider the ones we present here, as they will help you design queries that are more efficient and cost-effective.

Subsearch

A subsearch is a search within a search. If your main search requires data that is itself the result of another search, you can use Splunk's subsearch capability to achieve this. Say you want to find statistics about the server that generates the most 500 errors. A 500 error is a general HTTP status code indicating that something has gone wrong on the server side, without specifying exactly what. Obviously, if you are responsible for running a website, you want to pay close attention to 500 errors and where they are coming from. You can achieve your goal of finding the culprit server with two searches.

The first search, shown next, will return the server address with the most 500...
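A hedged sketch of the two-step pattern combined into a single query (the index, source type, and field names such as server_ip are assumptions for illustration, not the book's actual dataset): the subsearch in square brackets runs first, returns the server with the most 500 errors, and its result becomes a filter for the outer search:

index=main sourcetype=access_combined status=500
    [ search index=main sourcetype=access_combined status=500
      | top limit=1 server_ip
      | fields server_ip ]
| stats count by uri

Note that the fields command keeps only server_ip, so the subsearch contributes exactly one filtering condition to the outer search.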

How to improve logs


Throughout this book, we have seen examples of how logs can be used to make applications more effective. We have also talked about how logs can be used to troubleshoot problems. In this last section, we will discuss some basics, recommended by Splunk, that should be considered when creating logs.

Including clear key-value pairs

It is important to structure your data using clear key-value pairs. Doing so helps Splunk carry out automatic field extraction as intended, and to do so faster and more efficiently. Remember that automatic field extraction is one of the most useful features of Splunk!

A model for doing this is shown here:

key1=value1, key2=value2, ...

As you do this, remember that if you need to include spaces in the values (in text fields, for example), you should surround the value with quotes:

key1="value1" or user="Matt Nguyen" 

Although you may find this method is lengthier and more verbose, it conveys...

Summary


In this chapter, you have learned some best practices to employ when using Splunk. You were also shown complex queries that can further enhance your result set.

This brings our book to a close. We hope that you have enjoyed this adventure with Splunk. If you have completed (or even mostly completed) the steps in this book, you should now have a strong working knowledge of this important software. Splunk appears, as we write, to be growing more and more successful in the marketplace. It is positioned to become even more important as the Internet of Things (IoT) continues its growing influence on the daily lives of individuals and businesses. Splunk is a skill that will help you as you navigate the exciting world of data and all the advantages it will bring to our future.


Authors (3)

author image
Betsy Page Sigman

Betsy Page Sigman is a distinguished professor at the McDonough School of Business at Georgetown University in Washington, D.C. She has taught courses in statistics, project management, databases, and electronic commerce for the last 16 years, and has been recognized with awards for teaching and service. She has also worked at George Mason University in the past. Her recent publications include a Harvard Business case study and a Harvard Business review article. Additionally, she is a frequent media commentator on technological issues and big data.
Read more about Betsy Page Sigman

author image
Somesh Soni

Somesh Soni is a Splunk Consultant with over 11 years of IT experience. He has bachelor degree in Computer Science (Hons.) and has been a interested in exploring and learning new technologies throughout his whole life. He has extensive experience in Consulting, Architecture, Administration and Development in Splunk. He's proficient in various programming languages and tools including C#.NET/VB.NET, SSIS, and SQL Server etc. Somesh is currently working as a Splunk Master with Randstad Technologies. His activities are focused on Consulting, Implementation, Admin, Architecture and support related activities for Splunk. He started his career with the one of the Top 3 Indian IT giant He has executed projects for major fortune 500 companies like Coca-Cola, Wells Fargo, Microsoft, Capital Group etc. He has performed in various capacities of Technical Architect, Technical Lead, Onsite Coordinator, Technology Analyst etc. Somesh has been a great contributor in the Splunk Community work and has consistently been on the top of the list. He is a member of Splunk Trust 2015-16 and overall one of the topmost contributor to Splunk Answers community. Acknowledgement: I would like to thank my family and colleagues who have always encouraged and supported me to follow my dreams, my friends who put up with all my crazy antics while I went on a Splunk exploratory Journey and listened with patience on all the tips and tricks of Splunk which I shared with them. Last but not the least I would like to express my gratitude to the entire team of Packt Publishing Ltd for giving me this opportunity.
Read more about Somesh Soni

author image
Erickson Delgado

Erickson Delgado is an enterprise architect who loves to mine and analyze data. He began using Splunk in version 4.0 and has pioneered the use of the application in his current work. In the earlier parts of his career, he worked with start-up companies in the Philippines to help build their open source infrastructure. He then worked in the cruise industry as a shipboard IT manager, and he loved it. From there, he was recruited to work at the company's headquarters as a software engineer.
Read more about Erickson Delgado

Example index and source type pairings, as referred to in the Searching within an index section:

Index name      Source type
App1            Logs.Error
App1            Logs.Info
App1            Logs.Warning
App2            Logs.Error...