Chapter 10. Advanced Data Routing
Data routing is becoming more commonplace in the enterprise. As more people use big data platforms like Splunk to move data around their networks, things such as firewalls, data stream loss, and sourcetype renaming by environment can become administratively expensive. There are easier ways to get data to another data center or a different environment using only Splunk and some of its more advanced features. In this chapter, we will delve a little deeper into the architecture in order to get around some firewall and license restrictions.
In this chapter, we will learn about:
At the enterprise level, it is rare to deal with a distributed deployment rather than a clustered one (and, depending on the scale of your systems, the cluster and Disaster Recovery (DR) / High Availability (HA) components of Splunk will be pretty large). It's usually a good idea to use DNS addresses, hardware load balancing, and clustering (both search tier and indexing tier clusters) in order to meet all of the enterprise-level DR/HA policies. An enterprise network has plenty of security restrictions that won't allow data to flow freely to Splunk from every source. For that case, I am going to give some insight into, and an example of, an approach that has been used previously and does work to distribute data to different environments within an enterprise deployment. There are far too many aspects of Splunk architecture to cover in a single chapter, so I will cover only those that are relevant to the concept of a data router.
To understand the challenges of data routing, we will familiarize ourselves with the different network segments that exist within an enterprise network for the life cycle of software. We will use the idealized version of network segmentation, as this gives us the most complete view, though all of these segments rarely exist together at one enterprise. There will be a semblance of these network segments at each enterprise, though due to differing policies and cultures, the segments can vary in existence as well as in name. These segments are important because each one is usually protected by a series of firewall rules. Sometimes these rules can bend, sometimes they can break, and other times they are immovable objects. Either way, these rules pose challenges to getting Splunk data from the forwarders to the indexers.
For those of you unfamiliar with what a network segment is, a network segment is usually an IP address space, a VLAN, or a series of both that all machines...
The Splunk data router is more of a concept than an actual thing. It is basically a series of heavy forwarders sitting in a global location (preferably a DMZ) that route data to either a single indexer cluster or a series of them, depending on your license. I have used the data router successfully in a previous life, and it gives developers, security, and auditors a single place to order data from.
I use the word order because you can literally make what I call a menu (a list of the data types) and allow different departments to pick what data they want. Just be sure to get approval from leadership, for security reasons.
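To make the menu idea a little more concrete, here is a minimal sketch in Python that turns a menu into the kind of routing configuration a heavy forwarder could use. Everything named here is an assumption for illustration: the departments, the sourcetypes, the indexer host names, and the transform names are all hypothetical. The only Splunk-specific pieces it relies on are the `[tcpout:<group>]` stanzas in outputs.conf and a transform that sets the `_TCP_ROUTING` key, which is one common way to route by sourcetype rather than the only one.

```python
from collections import defaultdict

# Hypothetical menu: each department lists the sourcetypes it has "ordered".
MENU = {
    "security": ["firewall", "wineventlog"],
    "development": ["app_logs"],
    "audit": ["firewall", "app_logs"],
}

# Hypothetical indexer addresses for each department's indexer cluster.
INDEXERS = {
    "security": ["sec-idx1.example.com:9997", "sec-idx2.example.com:9997"],
    "development": ["dev-idx1.example.com:9997"],
    "audit": ["audit-idx1.example.com:9997"],
}


def groups_by_sourcetype(menu):
    """Invert the menu: sourcetype -> sorted list of departments that ordered it."""
    routes = defaultdict(set)
    for department, sourcetypes in menu.items():
        for sourcetype in sourcetypes:
            routes[sourcetype].add(department)
    return {st: sorted(depts) for st, depts in sorted(routes.items())}


def render_outputs_conf(indexers):
    """One tcpout target group per department."""
    lines = []
    for group, servers in sorted(indexers.items()):
        lines.append(f"[tcpout:{group}]")
        lines.append("server = " + ", ".join(servers))
        lines.append("")
    return "\n".join(lines)


def render_props_conf(routes):
    """Attach a routing transform to each sourcetype on the menu."""
    lines = []
    for sourcetype in routes:
        lines.append(f"[{sourcetype}]")
        lines.append(f"TRANSFORMS-routing = route_{sourcetype}")
        lines.append("")
    return "\n".join(lines)


def render_transforms_conf(routes):
    """Send every event of a sourcetype to the groups that ordered it."""
    lines = []
    for sourcetype, groups in routes.items():
        lines.append(f"[route_{sourcetype}]")
        lines.append("REGEX = .")
        lines.append("DEST_KEY = _TCP_ROUTING")
        lines.append("FORMAT = " + ",".join(groups))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    routes = groups_by_sourcetype(MENU)
    print("# outputs.conf\n" + render_outputs_conf(INDEXERS))
    print("# props.conf\n" + render_props_conf(routes))
    print("# transforms.conf\n" + render_transforms_conf(routes))
```

Keeping the menu as plain data means a department's order can be reviewed and approved before any configuration is generated from it.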
The following diagram is a realistic representation of how the network segments that we spoke of earlier relate to each other:
As you can see in the preceding diagram, the DMZ is a great place to put the data router, so we will use this for our example.
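One way to reason about why the DMZ works well is to treat the firewall rules as a list of allowed hops between segments and ask whether a forwarder's segment can reach the indexers at all. The following is only a sketch in Python under stated assumptions: the segment names (`dev`, `test`, `prod`, `dmz`, `indexers`) and the allowed hops are hypothetical and are not taken from the diagram.

```python
from collections import deque

# Hypothetical firewall policy: (source segment, destination segment) pairs
# that are allowed to talk. No segment may send to the indexers directly;
# every segment may send to the DMZ, and the DMZ may send to the indexers.
ALLOWED = {
    ("dev", "dmz"),
    ("test", "dmz"),
    ("prod", "dmz"),
    ("dmz", "indexers"),
}


def find_path(allowed, source, destination):
    """Breadth-first search for the shortest chain of allowed hops.

    Returns the list of segments from source to destination,
    or None if the firewall rules allow no such chain.
    """
    graph = {}
    for src, dst in allowed:
        graph.setdefault(src, []).append(dst)

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for nxt in sorted(graph.get(path[-1], [])):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    for segment in ("dev", "test", "prod"):
        print(segment, "->", find_path(ALLOWED, segment, "indexers"))
    # A segment that is not allowed to reach another gets None.
    print("dev -> prod:", find_path(ALLOWED, "dev", "prod"))
```

Under this hypothetical policy, every segment reaches the indexers only by way of the DMZ, which is exactly the hop where the data router would sit.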
Let's assume each of these segments has 200+ forwarders in each of...
In this book, you've learned about some techniques that are moderately complicated to implement, though all of them can save a Splunk administrator time. Many of these techniques have been used at both small and large companies, as well as at enterprise and government facilities, in areas ranging from DevOps to security.
My hope for you is that you glean something that is useful to your day-to-day activities, and leverage it to succeed the way only you know you can. There's a lot of good information within this book, from dashboards, to searching, to advanced data routing, and data model powered panels.
All of these are separate, yet when you pick up these techniques the way you pick up a wrench from a workbench and put them to use, you will have many more tools in your belt to help you look like a rock star to the next person who asks for the next impossible thing.
There are a lot of assumptions made in this book about the skill level of the reader, and because of that it may seem like...