You're reading from Advanced Splunk

Product type: Book
Published in: Jun 2016
ISBN-13: 9781785884351
Edition: 1st Edition
Author (1)
Ashish Kumar Tulsiram Yadav

Ashish Kumar Tulsiram Yadav is a BE in computers and has around four and a half years of experience in software development, data analytics, and information security, and around four years of experience in Splunk application development and administration. He has experience of creating Splunk applications and add-ons, managing Splunk deployments, machine learning using R and Python, and analytics and visualization using various tools, such as Tableau and QlikView. He is currently working with the information security operations team, handling the Splunk Enterprise security and cyber security of the organization. He has worked as a senior software engineer at Larsen & Toubro Technology Services in the telecom consumer electronics and semicon unit providing data analytics on a wide variety of domains, such as mobile devices, telecom infrastructure, embedded devices, Internet of Things (IOT), Machine to Machine (M2M), entertainment devices, and network and storage devices. He has also worked in the area of information, network, and cyber security in his previous organization. He has experience in OMA LWM2M for device management and remote monitoring of IOT and M2M devices and is well versed in big data and the Hadoop ecosystem. He is a passionate ethical hacker, security enthusiast, and Linux expert and has knowledge of Python, R, .NET, HTML5, CSS, and the C language. He is an avid blogger and writes about ethical hacking and cyber security on his blogs in his free time. He is a gadget freak and keeps on writing reviews on various gadgets he owns. He has participated in and has been a winner of hackathons, technical paper presentations, white papers, and so on.

Chapter 5. Advanced Data Analytics

This chapter takes you through important advanced data analytics commands used to create reports, detect anomalies, and correlate data. You will also go through the commands for prediction, trending, and machine learning on Splunk. The chapter illustrates, with examples, how to use these advanced analytics commands on Splunk to gain detailed insight into the data.

In this chapter, we will cover the following topics:

  • Reports

  • Geography and location

  • Anomalies

  • Prediction and trending

  • Correlation

  • Machine learning

Reports


You will now learn the reporting commands that are used to format data so that it can be displayed using the various visualizations available on Splunk. Reporting commands are transforming commands: they transform the event data returned by searches into tables that can be used for visualizations.

The makecontinuous command

The Splunk makecontinuous command is used to make the x-axis field continuous so that it can be plotted in a visualization. The command adds empty buckets for the periods in which no data is available. Once the specified field is made continuous, the chart, stats, and timechart commands can be used for graphical visualizations.

The syntax for the makecontinuous command is as follows:

… | makecontinuous
    Field_name
    bin_options

The parameter description of the makecontinuous command is as follows:

  • Field_name: The name of the field that is to be plotted on the x axis.

  • bin_options: This parameter is used to specify the options for discretization. This is a required parameter and...
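
As a minimal sketch of one commonly used pattern, the following search counts events in 10-minute buckets and then fills in the buckets that have no events. The sourcetype access_combined and the 10-minute span are assumptions chosen for illustration; they do not come from the original text.

```spl
sourcetype=access_combined
| bin _time span=10m
| stats count by _time
| makecontinuous _time span=10m
| fillnull value=0 count
```

Here, span is the bin option that sets the bucket size. The rows added by makecontinuous have no count value, so fillnull is used to show them as 0 on the chart.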

Geography and location


Here, you will learn how to add geographical information to the current dataset by looking up IP addresses or, if the data already contains location information, how to make that data ready for visualization on a world map.

The iplocation command

The Splunk iplocation command is a powerful command that extracts location information such as city, country, continent, latitude, longitude, region, zip code, time zone, and so on from an IP address. The extracted fields can then be used to filter events and to create statistical analytics based on location. Let's suppose we have data with the IP addresses of users making transactions on a website. Using the iplocation command, we can find the approximate location of each user and derive analytics such as which state or continent the highest number of transactions comes from, or in which location an e-commerce site is most popular. Such kind of location-based...
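
The transaction scenario above can be sketched as follows. This is an illustrative search, not one taken from the book: the sourcetype access_combined and the field name clientip are assumptions about how the web data is indexed.

```spl
sourcetype=access_combined
| iplocation clientip
| stats count by Country
| sort -count
```

The iplocation command adds fields such as City, Country, Region, lat, and lon to each event that has a resolvable IP address, so the stats command can group on Country to rank locations by the number of events.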

Anomalies


Anomaly detection, also known as outlier detection, is a branch of data mining that deals with the identification of events, items, observations, or patterns that do not conform to a set of expected events or patterns. Basically, different (anomalous) behavior is a sign of an issue that could be arising in the given dataset. Splunk provides commands to detect anomalies in real time, and this can be useful in detecting fraudulent credit card transactions, network and IT security fraud, hacking activity, and so on. There is also a Splunk app named Prelert Anomaly Detective App for Splunk on the app store, which can be used to mine the data for anomaly detection. The following commands can be used either to group similar events or to create a cluster of anomalous or outlier events.

The anomalies command

The Splunk anomalies command is used to measure how unexpected the events in the given data are. This command assigns a score to...
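
As a hedged sketch of how the score can be inspected, the following search keeps every event, labels each one with its score, and lists the most unexpected events first. The sourcetype access_combined is an assumption made for illustration.

```spl
sourcetype=access_combined
| anomalies labelonly=true
| sort -unexpectedness
| table _time unexpectedness _raw
```

The score is written to a field named unexpectedness. With labelonly=true, no events are removed; without it, events that score below the command's threshold are dropped from the results.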

Correlation


The following commands belong to the Correlation category of Splunk and are used to generate insight from the given dataset by correlating various data points from one or more data sources. In simple terms, correlation means a connection or relationship between two or more things. The set of commands includes associate, contingency, correlate, and so on.

The correlate command

The Splunk correlate command is used to calculate the correlation between different fields of the events. In simpler terms, this command returns an output that shows the co-occurrence between the different fields of the given dataset. Let's say we have a dataset that has information about web server failures. Using the correlate command, a user can find out which other fields are also present most of the time whenever there is a failure. So, insight can be generated to show that whenever X set of events occurs, Y also occurs, and hence, failures can be detected...
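
A minimal sketch of the command, assuming the web server events are indexed under a hypothetical sourcetype named access_combined:

```spl
sourcetype=access_combined
| correlate
```

The result is a matrix with one row and one column per field. Each cell holds a value between 0 and 1.0 that indicates how often the two fields appear together in the same events, with 1.0 meaning that they always co-occur.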

Machine learning


Machine learning is a branch of computer science that deals with pattern recognition to develop artificial intelligence. It studies and builds algorithms that can be used to make predictions on a given dataset. For example, machine learning can be used to analyze public interest from social media data and to make pricing decisions for an e-commerce website using data-driven statistics. Thus, machine learning can be very useful for tracking the given data and reaching conclusions for business decisions. It can be effectively applied to data from financial services, media, retail, pharmaceuticals, telecom, security, and so on.

You already know the Splunk commands for predicting and trending; now, you will learn how machine learning can be effectively applied to the data using Splunk and apps from the Splunk app store.

The process of machine learning is explained in the following diagram:

The preceding flowchart can be explained and...

Summary


In this chapter, you learned about various advanced Splunk commands that can be used for reporting and visualizations. You also learned how to detect anomalies, correlate data, and use the prediction and trending commands. This chapter also explained the capabilities of the Machine Learning Toolkit and how they can be used in implementing artificial intelligence (AI) for efficient prediction, thus enabling users to make informed business decisions well in advance. Next, you will learn about the various visualizations available in Splunk and where and how they can be used to make data visualization more useful.
