Logstash Input Plugins

In this article, Saurabh Chhajed, author of the book Learning ELK Stack, covers Logstash input plugins. Logstash provides a variety of plugins to help integrate it with a range of input and output sources. Let's explore the various plugins available.


Listing all plugins in Logstash

You can execute the following command to list all available plugins in your installed Logstash version:

bin/plugin list

Also, you can list all plugins containing a name fragment by executing this command:

bin/plugin list <namefragment>

To list all plugins belonging to a group (input, output, or filter), we can execute this command:

bin/plugin list --group <group name>
bin/plugin list --group output

Before exploring various plugin configurations, let's take a look at the data types and conditional expressions used in various Logstash configurations.

Data types for plugin properties

A Logstash plugin requires certain settings or properties to be set. These properties take values that belong to one of the following important data types.

Array

An array is a collection of values for a property.

An example can be seen as follows:

path => ["value1","value2"]

The => sign is the assignment operator used for all property values in Logstash configurations.

Boolean

A boolean value is either true or false (without quotes).

An example can be seen as follows:

periodic_flush => false

Codec

Codec is actually not a data type but a way to encode or decode data at input or output.

An example can be seen as follows:

codec => "json"

This specifies that the codec, at the output, will encode all output in JSON format.
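For instance, a codec can be attached to a stdout output so that events are printed as JSON (a minimal sketch; stdout is a standard Logstash output plugin):

output {
  stdout {
    codec => "json"
  }
}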

Hash

A hash is basically a collection of key-value pairs. Each pair is specified as "key" => "value", and multiple pairs in a collection are separated by a space.

An example can be seen as follows:

match => {
  "key1" => "value1"
  "key2" => "value2"
}

String

String represents a sequence of characters enclosed in quotes.

An example can be seen as follows:

value => "Welcome to ELK"

Comments

Comments begin with the # character.

An example can be seen as follows:

#this represents a comment

Field references

Fields can be referred to using [field_name] or nested fields using [level1][level2].
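For example, assuming an event that carries a nested field (the field names here are hypothetical), a conditional can reference it as follows:

if [response][status] == 404 {
  #some statements here.
}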

Logstash conditionals

Logstash conditionals are used to filter events or log lines under certain conditions. Conditionals in Logstash behave as in other programming languages and work with if, else if, and else statements. Multiple if-else blocks can be nested.

Syntax for conditionals is as follows:

if <conditional expression1>{
#some statements here.
}
else if <conditional expression2>{
#some statements here.
}
else{
#some statements here.
}

Conditionals work with comparison operators, boolean operators, and unary operators:

  • Equality operators: ==, !=, <, >, <=, >=
  • Regular expressions: =~, !~
  • Inclusion: in, not in
  • Boolean operators: and, or, nand, xor
  • Unary operator: !

Let's take a look at this with an example:

filter {
  if [action] == "login" {
    mutate { remove_field => ["password"] }
  }
}

Multiple expressions can be specified in a single statement using boolean operators.

An example can be seen as follows:

output {
  # Send email on production errors
  if [loglevel] == "ERROR" and [deployment] == "production" {
    email {

    }
  }
}

Types of Logstash plugins

The following are types of Logstash plugins:

  • Input
  • Filter
  • Output
  • Codec

Now let's take a look at some of the most important input, output, filter and codec plugins, which will be useful for building most of the log analysis pipeline use cases.

Input plugins

An input plugin is used to configure a set of events to be fed to Logstash. Some of the most important input plugins are:

file

The file plugin is used to stream events and log lines from files to Logstash. It automatically detects file rotation and reads from the point where it last left off.

The Logstash file plugin maintains sincedb files to track the current positions in the files being monitored. By default, it writes sincedb files at the $HOME/.sincedb* path. The location and write frequency can be altered using the sincedb_path and sincedb_write_interval properties of the plugin.

A most basic file configuration looks like this:

input {
  file {
    path => "/path/to/logfiles"
  }
}

The only required configuration property is the path to the files. Let's look at how we can make use of some of the configuration properties of the file plugin to read different types of files.

Configuration options

The following configuration options are available for the file input plugin:

add_field

It is used to add a field to incoming events. Its value type is hash, and its default value is {}.

Let's take the following instance as an example:

add_field => { "input_time" => "%{@timestamp}" }

codec

It is used to specify a codec, which can decode a specific type of input.

For example: codec => "json" is used to decode the json type of input.

The default value of codec is "plain".

delimiter

It is used to specify a delimiter, which identifies separate lines. By default, it is "\n".

exclude

It is used to exclude certain types of files from the input path. Its data type is array.

Let's take the following instance as an example:

path => ["/app/packtpub/logs/*"]
exclude => "*.gz"

This will exclude all gzip files from input.

path

This is the only required configuration for the file plugin. It specifies an array of path locations from where to read logs and events.

sincedb_path

It specifies the location where the sincedb files, which keep track of the current position of the files being monitored, are written. The default is $HOME/.sincedb*.

sincedb_write_interval

It specifies how often (in seconds) the sincedb files, which keep track of the current position of monitored files, are written. The default is 15 seconds.
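For example, both sincedb options can be set explicitly on a file input (the paths shown here are hypothetical):

input {
  file {
    path => ["/var/log/app/*.log"]
    sincedb_path => "/var/lib/logstash/sincedb"
    sincedb_write_interval => 30
  }
}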

start_position

It has two values: "beginning" and "end". It specifies where to start reading incoming files from. The default value is "end", as in most situations the plugin is used for live streaming data. However, if you are working on old data, it can be set to "beginning".

This option has an impact only when a file is read for the first time, called "first contact", since the current position is then maintained in the sincedb file. On subsequent runs, this option has no effect unless you decide to remove the sincedb files.
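For instance, to ingest an existing log file from its first line (the path is hypothetical), the option can be set as follows:

input {
  file {
    path => ["/var/log/app/old.log"]
    start_position => "beginning"
  }
}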

tags

It specifies the array of tags that can be added to incoming events. Adding tags to your incoming events helps with processing later, when using conditionals. It is often helpful to tag certain data as "processed" and use those tags to decide a future course of action.

For example, if we specify "processed" in tags:

tags =>["processed"]

In filter, we can check in conditionals:

filter {
  if "processed" in [tags] {

  }
}

type

The type option is really helpful for processing different types of incoming streams using Logstash. You can configure multiple input paths for different types of events; just give each a type name, and then you can filter and process them separately.

Let's take the following instance as an example:

input {
  file {
    path => ["/var/log/syslog/*"]
    type => "syslog"
  }
  file {
    path => ["/var/log/apache/*"]
    type => "apache"
  }
}

In filter, we can filter based on type:

filter {
  if [type] == "syslog" {
    grok {

    }
  }
  if [type] == "apache" {
    grok {

    }
  }
}

As in the preceding example, we have configured separate types, "syslog" and "apache", for the incoming files. Later, while filtering the stream, we can specify conditionals based on this type.

stdin

The stdin plugin is used to stream events and log lines from standard input.

A basic configuration for stdin looks like this:

stdin {

}

When we configure stdin like this, whatever we type in the console will go as input to the Logstash event pipeline. This is mostly used as the first level of testing of configuration before plugging in the actual file or event input.
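For such a test, the stdin input can be paired with a stdout output in one configuration (a minimal sketch; rubydebug is a standard codec for readable console output):

input {
  stdin {

  }
}
output {
  stdout {
    codec => "rubydebug"
  }
}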

Configuration options

The following configuration options are available for the stdin input plugin:

add_field

The add_field configuration for stdin is the same as add_field in the file input plugin and is used for similar purposes.

codec

It is used to decode incoming data before passing it on to the data pipeline. The default value is "line".

tags

The tags configuration for stdin is the same as tags in the file input plugin and is used for similar purposes.

type

The type configuration for stdin is the same as type in the file input plugin and is used for similar purposes.

twitter

You may need to analyze a Twitter stream based on a topic of interest for various purposes, such as sentiment analysis, trending topic analysis, and so on. The twitter plugin is helpful for reading events from the Twitter streaming API. It requires a consumer key, consumer secret, keywords, oauth token, and oauth token secret to work.

These details can be obtained by registering an application on the Twitter developer API page (https://dev.twitter.com/apps/new):

twitter {
   consumer_key => "your consumer key here"
   keywords => "keywords which you want to filter on streams"
   consumer_secret => "your consumer secret here"
   oauth_token => "your oauth token here"
   oauth_token_secret => "your oauth token secret here"
}

Configuration options

The following configuration options are available for the twitter input plugin:

add_field

The add_field configuration for the twitter plugin is the same as add_field in the file input plugin and is used for similar purposes.

codec

The codec configuration for twitter is the same as the codec plugin in the file input plugin and is used for similar purposes.

consumer_key

This is a required configuration with no default value. Its value is of the String type and can be obtained from the Twitter app registration page.

consumer_secret

The same as consumer_key, its value can be obtained from the Twitter dev app registration page.

full_tweet

This is a boolean configuration with the default value false. It specifies whether to record the full tweet object obtained from the Twitter streaming API.
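For example, to capture the entire tweet object rather than the trimmed default, the option can be enabled as follows:

full_tweet => true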

keywords

This is a required configuration of the array type, with no default value. It specifies a set of keywords to track from the Twitter stream.

An example can be seen as follows:

keywords => ["elk","packtpub"]

oauth_token

The oauth_token option is also obtained from the Twitter dev API page.

After you get your consumer key and consumer secret, click on Create My Access Token to create your oauth token and oauth token secret.

oauth_token_secret

The oauth_token_secret option is obtained from the Twitter dev API page.

tags

The tags configuration for the twitter input plugin is the same as tags in the file input plugin and is used for similar purposes.

type

The type configuration for the twitter input plugin is the same as type in the file input plugin and is used for similar purposes.

lumberjack

The lumberjack plugin is useful for receiving events via the lumberjack protocol, which is used by Logstash forwarder.

The basic required configuration option for the lumberjack plugin looks like this:

lumberjack {
   port =>
   ssl_certificate =>
   ssl_key =>
}
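A filled-in configuration might look like the following (the port number is hypothetical, and the certificate and key paths must match your own setup):

lumberjack {
  port => 5043
  ssl_certificate => "/etc/ssl/logstash.pub"
  ssl_key => "/etc/ssl/logstash.key"
}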

Lumberjack, or Logstash forwarder, is a lightweight log shipper used to ship log events from source systems. Logstash is quite a memory-consuming process, so installing it on every node from which you want to ship data is not recommended. Logstash forwarder is a lightweight version of Logstash that provides low latency, secure and reliable transfer, and low resource usage.

More details about Lumberjack or Logstash forwarder can be found here:
https://github.com/elastic/logstash-forwarder

Configuration options

The following configuration options are available for the lumberjack input plugin:

add_field

The add_field configuration for the lumberjack plugin is the same as add_field in the file input plugin and is used for similar purposes.

codec

The codec configuration for the lumberjack plugin is the same as the codec plugin in the file input plugin and is used for similar purposes.

host

It specifies the host to listen on. The default value is "0.0.0.0".

port

This is a required configuration of the number type; it specifies the port to listen on. There is no default value.

ssl_certificate

It specifies the path to the SSL certificate to be used for the connection. It is a required setting.

An example is as follows:

ssl_certificate => "/etc/ssl/logstash.pub"

ssl_key

It specifies the path to the SSL key that has to be used for the connection. It is also a required setting.

An example is as follows:

ssl_key => "/etc/ssl/logstash.key"

ssl_key_passphrase

It specifies the SSL key passphrase that has to be used for the connection.

tags

The tags configuration for the lumberjack input plugin is the same as tags in the file input plugin and is used for similar purposes.

type

The type configuration for the lumberjack input plugins is the same as type in the file input plugin and is used for similar purposes.

redis

The redis plugin is used to read events and logs from a redis instance.

Redis is often used in the ELK Stack as a broker for incoming log data from Logstash forwarders; it queues the data until the indexer is ready to ingest it. This helps to keep the system in check under heavy load.

The basic configuration of the redis input plugin looks like this:

redis {
}

Configuration options

The following configuration options are available for the redis input plugin:

add_field

The add_field configuration for redis is the same as add_field in the file input plugin and is used for similar purposes.

codec

The codec configuration for redis is the same as codec in the file input plugin and is used for similar purposes.

data_type

The data_type option can have a value of either "list", "channel", or "pattern_channel".

From the Logstash documentation for the redis plugin (https://www.elastic.co/guide/en/logstash/current/plugins-inputs-redis.html):

"If redis_type is list, then we will BLPOP the key. If redis_type is channel, then we will SUBSCRIBE to the key. If redis_type is pattern_channel, then we will PSUBSCRIBE to the key."

While using redis between a publisher and a consumer, key and data_type should be the same on both sides.
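For instance, a consumer reading from a redis list could be configured as follows (the key name "logstash" is hypothetical and must match the publisher's configuration):

redis {
  host => "127.0.0.1"
  port => 6379
  data_type => "list"
  key => "logstash"
}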

host

It specifies the hostname of the redis server. The default value is "127.0.0.1".

key

It specifies the name of the redis list or channel, depending on the data_type setting.

password

This is a password type configuration that specifies the password to be used for the connection.

port

It specifies the port on which the redis instance is running. The default is 6379.

An extensive list and latest documentation on all available Logstash input plugins is available at https://www.elastic.co/guide/en/logstash/current/input-plugins.html.

Now that we have seen some of the most important input plugins for Logstash, let's have a look at some output plugins.

Summary

In this article, we saw various configuration options for Logstash input plugins.

You've been reading an excerpt of:

Learning ELK Stack
