In this chapter, we will work through a few advanced search examples in great detail. The examples and data shown are fictitious, but will hopefully spark some ideas that you can apply to your own data. For a huge collection of examples and help topics, check out Splunk Answers at http://answers.splunk.com.
The transaction command lets you group events based on their proximity to other events. This proximity is determined either by ranges of time, or by specifying the text contained in the first and/or last event in a transaction. This is an expensive process, but is sometimes the best way to group certain events. Unlike other transforming commands, transaction maintains the original events and groups them together into multivalued events.
Some rules of thumb for the usage of transaction are as follows:
If the question can be answered using stats, it will almost always be more efficient.
All of the events needed for the transaction have to be found in one search.
When grouping is based on field values, an event can be considered part of the transaction if it has at least one field in common with at least one other event. This doesn't mean that every event must have the same field, but that all events should have some field from the...
Determining the number of users currently using a system is difficult, particularly if the log does not contain events for both the beginning and end of a transaction. With web server logs in particular, it is not really possible to know when a user has left a site. Let's investigate a couple of strategies for answering this question.
If the question you are trying to answer is "how many transactions were happening at a time?", you can use transaction to combine related events and calculate the duration of each transaction. We will then use the concurrency command to increase a counter when the events start, and decrease when the time has expired for each transaction. Let's start with our searches from the previous section:
sourcetype="impl_splunk_web" | transaction maxpause=5m uid
This will return a transaction for every uid, assuming that if no requests were made for five minutes, the session is complete. This provides results as...
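To make the grouping and counting logic concrete, here is a minimal Python sketch of the idea, not of Splunk's implementation. It splits each uid's events into sessions whenever the gap exceeds five minutes, then counts how many sessions are open at the start of each one. The function names, the sample data, and the handling of the exact boundary cases (a gap of exactly five minutes, a session ending at the same instant another starts) are assumptions made for illustration.

```python
from collections import defaultdict

MAXPAUSE = 5 * 60  # seconds; mirrors maxpause=5m in the search above


def group_sessions(events, maxpause=MAXPAUSE):
    """Group (timestamp, uid) pairs into per-uid sessions.

    A new session begins when the gap since that uid's previous event
    is greater than maxpause (boundary behavior is an assumption).
    """
    by_uid = defaultdict(list)
    for ts, uid in events:
        by_uid[uid].append(ts)

    sessions = []
    for uid, stamps in by_uid.items():
        stamps.sort()
        start = prev = stamps[0]
        count = 1
        for ts in stamps[1:]:
            if ts - prev > maxpause:
                # Gap too long: close the current session, open a new one.
                sessions.append({"uid": uid, "start": start,
                                 "duration": prev - start, "eventcount": count})
                start, count = ts, 0
            prev = ts
            count += 1
        sessions.append({"uid": uid, "start": start,
                         "duration": prev - start, "eventcount": count})

    sessions.sort(key=lambda s: (s["start"], s["uid"]))
    return sessions


def add_concurrency(sessions):
    """Annotate each session with the number of sessions open at its start.

    A session counts itself, and the end time is treated as inclusive
    (an assumption for this sketch).
    """
    for s in sessions:
        s["concurrency"] = sum(
            1 for other in sessions
            if other["start"] <= s["start"] <= other["start"] + other["duration"]
        )
    return sessions


# Hypothetical sample data: (epoch seconds, uid)
sample = [(0, "A"), (60, "A"), (100, "B"), (120, "A"), (130, "B"), (1000, "A")]
for row in add_concurrency(group_sessions(sample)):
    print(row)
```

In the sample, uid A produces two sessions because of the long gap before its last event, and the session for uid B starts while the first session for A is still open.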
There are a number of ways to calculate events per some period of time. All of these techniques rely on rounding _time down to some period of time, and then grouping the results by the rounded "buckets" of _time.
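The rounding step can be sketched outside Splunk in a few lines of Python. This is only an illustration of the technique, assuming timestamps are epoch seconds; floor_to_span and count_per_bucket are hypothetical helper names, not Splunk functions.

```python
from collections import Counter


def floor_to_span(epoch_seconds, span_seconds):
    """Round a timestamp down to the start of the bucket that contains it."""
    return epoch_seconds - (epoch_seconds % span_seconds)


def count_per_bucket(timestamps, span_seconds=60):
    """Count events per bucket; buckets without events do not appear."""
    return Counter(floor_to_span(ts, span_seconds) for ts in timestamps)


# Hypothetical event times in epoch seconds
print(sorted(count_per_bucket([0, 59, 60, 125, 130]).items()))
```

Every event in the same minute maps to the same rounded value, so grouping by that value yields one count per minute.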
The simplest approach to counting events over time is to use timechart, like this:
sourcetype=impl_splunk_gen | timechart span=1m count
In table view, we see:
Looking at a 24-hour period, we are presented with 1,440 rows, one per minute.
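The row count follows from producing one row per span across the whole time range, whether or not any events fell in that span. A rough Python sketch of that behavior, assuming epoch-second timestamps and a hypothetical dense_counts helper, looks like this:

```python
def dense_counts(timestamps, start, end, span_seconds=60):
    """Return one (bucket_start, count) row per span in [start, end).

    Spans with no events are kept with a count of 0.
    """
    counts = {}
    for ts in timestamps:
        if start <= ts < end:
            bucket = ts - (ts % span_seconds)
            counts[bucket] = counts.get(bucket, 0) + 1
    first = start - (start % span_seconds)
    return [(b, counts.get(b, 0)) for b in range(first, end, span_seconds)]


# Hypothetical data: three events in the first two minutes of a 24-hour range
rows = dense_counts([30, 90, 95], start=0, end=24 * 60 * 60)
print(len(rows), rows[:3])
```

With a one-minute span over 24 hours, the list always has 1,440 entries, most of them zero when the data is sparse.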
Note
Charts in Splunk do not attempt to show more points than the pixels present on the screen. The user is instead expected to change the number of points to graph, using the bins or span attributes. The Calculating average events per minute, per hour section shows another way of dealing with this behavior.
If we only wanted to know about minutes that actually had events, instead of every minute of the day, we could use bucket and stats, like this:
sourcetype=impl_splunk_gen | bucket span=1m _time | stats...
The top command is very simple to use, but is actually doing a fair amount of interesting work. I often start with top, then switch to stats count, but then wish for something that top provides automatically. This exercise will show you how to recreate all of the elements, so that you might pick and choose what you need.
Let's recreate the top command by using other commands.
Here is the query that we will replicate:
sourcetype="impl_splunk_gen" error | top useother=t limit=5 logger user
The output looks like this:
To build count, we can use stats like this:
sourcetype="impl_splunk_gen" error
| stats count by logger user
This gets us most of the way to our end goal:
To calculate the percentage that top includes, we will first need the total number of events. The eventstats command lets us add statistics to every row, without replacing the rows.
sourcetype="impl_splunk_gen" error
| stats count by logger user
| eventstats sum(count) as totalcount
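The difference between the two commands in this search can be modeled in Python. This is a sketch of the idea only: count_by collapses events into one row per combination of fields, while add_total keeps every row and attaches the overall sum to each. The helper names and the sample logger and user values are hypothetical, and the add_percent step is an assumed illustration of how the total could then be used, since a percentage is simply count divided by totalcount.

```python
from collections import Counter


def count_by(events, fields):
    """Model of stats count by <fields>: one row per distinct combination."""
    counts = Counter(tuple(e[f] for f in fields) for e in events)
    return [dict(zip(fields, key), count=n) for key, n in counts.items()]


def add_total(rows):
    """Model of eventstats sum(count) as totalcount: rows are kept, total is added."""
    total = sum(r["count"] for r in rows)
    return [{**r, "totalcount": total} for r in rows]


def add_percent(rows):
    """Assumed follow-up step: percentage of the total for each row."""
    return [{**r, "percent": 100.0 * r["count"] / r["totalcount"]} for r in rows]


# Hypothetical events with made-up field values
events = [
    {"logger": "AuthClass", "user": "mary"},
    {"logger": "AuthClass", "user": "mary"},
    {"logger": "AuthClass", "user": "bob"},
    {"logger": "BarClass", "user": "mary"},
]
for row in add_percent(add_total(count_by(events, ["logger", "user"]))):
    print(row)
```

Because the total is attached to every row rather than replacing the rows, each row carries everything needed for later per-row calculations.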
I hope this chapter was enlightening, and has sparked some ideas that you can apply to your own data. As stated in the introduction, Splunk Answers (http://answers.splunk.com) is a fantastic place to find examples and general help. You can ask your questions there, and contribute answers back to the community.
In the next chapter, we will use more advanced features of Splunk to help extend the search language, and enrich data at search time.