Linux Shell Script: Logging Tasks

by Sarath Lakshman | January 2011

In the previous article, Linux Shell Script: Monitoring Activities, we looked at commands for monitoring various activities on a Linux system.

In this article by Sarath Lakshman, author of Linux Shell Scripting Cookbook, we will use a technique called logging, by which important information is written to a file while an application is running. By reading this file, we can understand the timeline of the operations performed by a particular piece of software or a daemon. If an application or a service crashes, this information helps us to debug and fix the issue. Logging and monitoring also help in gathering information from a pool of data, and both are important tasks for ensuring security in the operating system and for debugging purposes. The recipes covered in this article are:

  • Information about logged-in users, boot logs, and boot failures
  • Logging access to files and directories
  • Logfile management with logrotate
  • Logging with syslog
  • Monitoring user logins to find intruders
  • Finding out active user hours on a system

 


Information about logged-in users, boot logs, and boot failures

Collecting information about the operating environment, logged-in users, how long the system has been powered on, and any boot failures is very helpful. This recipe goes through a few commands used to gather information about a live machine.

Getting ready

This recipe will introduce the commands who, w, users, uptime, last, and lastb.

How to do it...

To obtain information about users currently logged in to the machine, use:

$ who
slynux   pts/0   2010-09-29 05:24 (slynuxs-macbook-pro.local)
slynux   tty7    2010-09-29 07:08 (:0)

Or:

$ w
 07:09:05 up  1:45,  2 users,  load average: 0.12, 0.06, 0.02
USER     TTY     FROM    LOGIN@   IDLE  JCPU PCPU WHAT
slynux   pts/0   slynuxs 05:24  0.00s  0.65s 0.11s sshd: slynux
slynux   tty7    :0      07:08  1:45m  3.28s 0.26s gnome-session

These commands provide information about logged-in users, the pseudo terminal (TTY) used by each user, the command currently executing from that terminal, and the IP address from which each user has logged in. For local logins, the hostname is shown instead. who and w format their output slightly differently; the w command provides more detail than who.

TTY is the device file associated with a text terminal. When a user spawns a new terminal, a corresponding device is created in /dev/ (for example, /dev/pts/3). The device path for the current terminal can be found by running the tty command.
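For example (the exact device path will vary with the terminal in use):

$ tty
/dev/pts/0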

In order to list the users currently logged in to the machine, use:

$ users
slynux slynux slynux hacker

If a user has opened multiple pseudo terminals, there will be that many entries for the same user. In the above output, the user slynux has opened three pseudo terminals. The easiest way to print unique users is to filter with sort and uniq as follows:

$ users | tr ' ' '\n' | sort | uniq
hacker
slynux

We have used tr to replace each space with a newline, so that each login session appears on its own line. The combination of sort and uniq then reduces the list to one entry per user.
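If you also want to know how many sessions each user has open, uniq -c prefixes each entry with a count:

$ users | tr ' ' '\n' | sort | uniq -c
      1 hacker
      3 slynux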

In order to see how long the system has been powered on, use:

$ uptime
 21:44:33 up  3:17,  8 users,  load average: 0.09, 0.14, 0.09

The time that follows the word up indicates how long the system has been powered on. We can write a simple one-liner to extract the uptime only.
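For example, a sketch using sed (this assumes the "up HH:MM," form shown above; the format changes once the system has been up for more than a day, and newer versions of uptime also offer a -p option that prints the uptime in a human-readable form):

$ uptime | sed 's/.*up \+\([^,]*\),.*/\1/'
3:17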

The load average in uptime's output indicates system load; the three values are averages over the last 1, 5, and 15 minutes. In order to get information about previous boots and user login sessions, use:

$ last
slynux tty7         :0              Tue Sep 28 18:27   still logged in
reboot system boot 2.6.32-21-generi Tue Sep 28 18:10 - 21:46 (03:35)
slynux pts/0      :0.0          Tue Sep 28 05:31 - crash (12:39)

The last command will provide information about logged-in sessions. It is actually a log of system logins, consisting of information such as the tty from which the user logged in, the login time, session status, and so on.

The last command uses the log file /var/log/wtmp for its input log data. It is also possible to explicitly specify the log file for the last command using the -f option. For example:

$ last -f /var/log/wtmp

In order to obtain information about login sessions for a single user, use:

$ last USER

Get information about reboot sessions as follows:

$ last reboot
reboot system boot 2.6.32-21-generi Tue Sep 28 18:10 - 21:48 (03:37)
reboot system boot 2.6.32-21-generi Tue Sep 28 05:14 - 21:48 (16:33)

In order to get information about failed user login sessions, use:

# lastb
test     tty8    :0          Wed Dec 15 03:56 - 03:56  (00:00)   
slynux tty8    :0          Wed Dec 15 03:55 - 03:55  (00:00)

You should run lastb as the root user.

Logging access to files and directories

Logging file and directory access is very helpful for keeping track of changes happening to files and folders. This recipe describes how to log user accesses.

Getting ready

The inotifywait command can be used to gather information about file accesses. It doesn't come by default with every Linux distro; you have to install the inotify-tools package using a package manager. It also requires the Linux kernel to be compiled with inotify support, but most recent GNU/Linux distributions come with inotify enabled in the kernel.
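For example, on a Debian- or Ubuntu-based system:

# apt-get install inotify-tools

On Red Hat-based systems, the equivalent is yum install inotify-tools.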

How to do it...

Let's walk through the shell script that monitors directory access:

#!/bin/bash
#Filename: watchdir.sh
#Description: Watch directory access
path=$1
#Provide path of directory or file as argument to script
inotifywait -m -r -e create,move,delete "$path" -q

A sample output is as follows:

$ ./watchdir.sh .
./ CREATE new
./ MOVED_FROM new
./ MOVED_TO news
./ DELETE news

How it works...

The previous script will log create, move, and delete events for files and folders under the given path. The -m option makes inotifywait monitor changes continuously rather than exit after an event happens. -r enables a recursive watch of the directories. -e specifies the list of events to be watched, and -q reduces the verbose messages and prints only the required ones. This output can be redirected to a log file.
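For example (/tmp/watchdir.log is a hypothetical filename; the trailing & keeps the watch running in the background):

$ ./watchdir.sh /home/slynux > /tmp/watchdir.log &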

We can add events to or remove events from this list. The important events available are as follows:

  • access: When a read happens on a file
  • modify: When file contents are modified
  • attrib: When metadata is changed
  • move: When a file undergoes a move operation
  • create: When a new file is created
  • open: When a file undergoes an open operation
  • close: When a file undergoes a close operation
  • delete: When a file is removed

Logfile management with logrotate

Logfiles are essential components of Linux system maintenance. They help to keep track of events happening on different services on the system, which helps the sysadmin debug issues and also provides statistics on events happening on the live machine. Management of logfiles is required because, as time passes, a logfile keeps growing. Therefore, we use a technique called rotation to limit the size of the logfile: when the logfile grows beyond the limit, its older entries are moved aside and stored in an archive, so older logs can be kept for future reference. Let's see how to rotate logs and store them.

Getting ready

logrotate is a command every Linux system admin should know. It helps to restrict the size of a logfile to a given SIZE. The logger appends information to a logfile, so the most recent information appears at the bottom. logrotate scans specific logfiles according to its configuration file. It keeps, say, the last 100 kilobytes (for a specified SIZE = 100k) in the logfile and moves the rest of the data (the older log entries) to a new file, logfile_name.1. When the logfile again exceeds SIZE, the recent entries are kept in the logfile, the older ones are moved to logfile_name.1, and the previous logfile_name.1 becomes logfile_name.2, and so on. This process can easily be configured with logrotate. logrotate can also compress the older logs as logfile_name.1.gz, logfile_name.2.gz, and so on; whether older logfiles are compressed is an option in the logrotate configuration.

How to do it...

logrotate has its configuration directory at /etc/logrotate.d. If you list the contents of this directory, you will find configurations for many other logfiles.

We can write our custom configuration for our logfile (say /var/log/program.log) as follows:

$ cat /etc/logrotate.d/program
/var/log/program.log {
  missingok
  notifempty
  size 30k
  compress
  weekly
  rotate 5
  create 0600 root root
}

Now the configuration is complete. /var/log/program.log in the configuration specifies the logfile path; old logs will be archived in the same directory. Let's see what each of these parameters means:

  • missingok: Ignore it if the logfile is missing, and return without rotating the log
  • notifempty: Rotate the log only if the source logfile is not empty
  • size 30k: Limit the size of the logfile at which rotation is performed (it can also be, for example, 1M)
  • compress: Enable compression of the older logs with gzip
  • weekly: Specify the interval at which rotation is performed (it can be weekly, daily, monthly, or yearly)
  • rotate 5: Keep this many old copies of the logfile archive (here, up to program.log.5.gz)
  • create 0600 root root: Specify the mode, user, and group for the newly created logfile

These options are optional; we can specify only the required options in the logrotate configuration file. There are numerous options available with logrotate; please refer to the man page (http://linux.die.net/man/8/logrotate) for more information.
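To try out a configuration without waiting for the rotation interval, logrotate can also be invoked by hand; a sketch (-d performs a dry run that only reports what would be done, while -f forces an immediate rotation):

# logrotate -d /etc/logrotate.d/program
# logrotate -f /etc/logrotate.d/program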

 


Logging with syslog

Logfiles are an important component of applications that provide services to users. An application writes status information to its logfile while it is running. If a crash occurs, or we need to enquire about some information regarding the service, we look into the logfile. You can find lots of logfiles related to different daemons and applications in the /var/log directory; it is the common directory for storing logfiles. If you read through a few lines of the logfiles, you can see that the lines follow a common format. In Linux, creating and writing log information to logfiles at /var/log is handled by a protocol called syslog, implemented by the syslogd daemon. Every standard application makes use of syslog for logging information. In this recipe, we will discuss how to make use of syslogd for logging information from a shell script.

Getting ready

Logfiles are useful for helping you deduce what is going wrong with a system. Hence, while writing critical applications, it is always good practice to log the progress of the application into a logfile. We will learn to use the logger command to log into logfiles with syslogd. Before getting to know how to write into logfiles, let's go through a list of important logfiles used in Linux:

  • /var/log/boot.log: Boot log information
  • /var/log/httpd: Apache web server log
  • /var/log/messages: Post-boot kernel and general system information
  • /var/log/auth.log: User authentication log
  • /var/log/dmesg: System boot-up messages
  • /var/log/mail.log: Mail server log
  • /var/log/Xorg.0.log: X Server log

How to do it...

In order to log to the syslog file /var/log/messages, use:

$ logger LOG_MESSAGE

For example:

$ logger This is a test log line
$ tail -n 1 /var/log/messages
Sep 29 07:47:44 slynux-laptop slynux: This is a test log line

The logfile /var/log/messages is a general purpose logfile. When the logger command is used, it logs to /var/log/messages by default. In order to log to the syslog with a specified tag, use:

$ logger -t TAG This is a message
$ tail -n 1 /var/log/messages
Sep 29 07:48:42 slynux-laptop TAG: This is a message

syslog handles a number of logfiles in /var/log. When logger sends a message, syslogd uses the tag string associated with the message to determine the logfile it should be written to. You can see the tag strings and associated logfiles in the configuration files located in the /etc/rsyslog.d/ directory.
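For example, a minimal rsyslog rule that routes messages bearing a particular tag to their own logfile might look like the following (myapp.conf, the myapp tag, and /var/log/myapp.log are hypothetical names; the property-based filter syntax is rsyslog's):

$ cat /etc/rsyslog.d/myapp.conf
:syslogtag, isequal, "myapp:" /var/log/myapp.log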

In order to log the contents of another file to the system log, use the -f option:

$ logger -f /var/log/source.log

Monitoring user logins to find intruders

Logfiles can be used to gather details about the state of the system. Here is an interesting scripting problem statement:

We have a system connected to the Internet with SSH enabled. Many attackers are trying to log in to the system. We need to design an intrusion detection system by writing a shell script. Intruders are defined as users who are trying to log in with multiple attempts for more than two minutes and whose attempts are all failing. Such users are to be detected and a report should be generated with the following details:

  • User account to which a login is attempted
  • Number of attempts
  • IP address of the attacker
  • Host mapping for IP address
  • Time range during which the login attempts were performed

Getting ready

We can write a shell script to scan through the logfiles and gather the required information from them. Here, we are dealing with SSH login failures. The user authentication session log is written to the logfile /var/log/auth.log. The script should scan the logfile to detect failed login attempts and perform different checks on the log to infer the data. We can use the host command to find out the host mapping for an IP address.

How to do it…

Let's write the intruder_detect.sh script, which generates a report of intruders by using the authentication logfile.
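The script itself is not included in this excerpt, so here is a sketch of intruder_detect.sh reconstructed from the walkthrough in the next section (the assumed log format is the classic OpenSSH "Failed password for USER from IP port PORT ssh2" line; the report layout and temporary filenames are assumptions):

#!/bin/bash
#Filename: intruder_detect.sh
#Description: Intruder reporting tool, with auth.log as input
AUTHLOG=/var/log/auth.log
if [[ -n $1 ]];
then
  AUTHLOG=$1
fi
# Keep only failed-password lines for valid user names
LOG=/tmp/failed.$$.log
grep -v "invalid" $AUTHLOG | grep "Failed password" > $LOG
# Extract the list of unique attacker IP addresses
IP_LIST="$(egrep -o "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" $LOG | sort | uniq)"
printf "%-5s %-10s %-10s %-15s %-25s %s\n" "Sr#" "User" "Attempts" "IP address" "Host" "Time range"
ucount=0
for ip in $IP_LIST;
do
  # Collect the log lines corresponding to this IP address
  grep $ip $LOG > /tmp/temp.$$.log
  # The user name is the sixth word from the end of the line (NF-5)
  for user in $(awk '{ print $(NF-5) }' /tmp/temp.$$.log | sort | uniq);
  do
    grep "for $user from" /tmp/temp.$$.log > /tmp/user.$$.log
    # The first 16 characters of each line form the timestamp
    tstart=$(head -1 /tmp/user.$$.log | cut -c-16)
    tend=$(tail -1 /tmp/user.$$.log | cut -c-16)
    start=$(date -d "$tstart" +%s)
    end=$(date -d "$tend" +%s)
    # Term the user an intruder if the attempts span more than two minutes
    if [ $((end - start)) -gt 120 ];
    then
      let ucount++
      attempts=$(wc -l < /tmp/user.$$.log)
      host=$(host $ip | tail -1 | awk '{ print $NF }')
      printf "%-5s %-10s %-10s %-15s %-25s %s\n" \
        "$ucount" "$user" "$attempts" "$ip" "$host" "$tstart -> $tend"
    fi
  done
done
rm /tmp/failed.$$.log /tmp/temp.$$.log /tmp/user.$$.log 2> /dev/null

Since /var/log/auth.log is normally readable only by the root user, run the script as root.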

How it works…

In the intruder_detect.sh script, we use the auth.log file as input. We can either provide a logfile as a command-line argument to the script or, by default, it reads /var/log/auth.log. We need to log details about login attempts for valid user names only. When a login attempt for an invalid user occurs, a line similar to Failed password for invalid user bob from 203.83.248.32 port 7016 ssh2 is logged to auth.log. Hence, we need to exclude all lines in the logfile containing the word "invalid". The grep command with the invert option (-v) is used to remove all logs corresponding to invalid users. The next step is to find out the list of users for which login attempts occurred and failed. For a failed password, SSH logs lines similar to sshd[21197]: Failed password for bob1 from 203.83.248.32 port 50035 ssh2.

Hence, we should find all the lines containing the words "Failed password". Next, all the unique IP addresses are to be found in order to extract the log lines corresponding to each IP address. The list of IP addresses is extracted by using a regular expression for IP addresses with the egrep command. A for loop is used to iterate through the IP addresses, and the corresponding log lines are found using grep and written to a temporary file. The sixth word from the end of the log line is the user name (for example, bob1). The awk command is used to extract it: NF gives the field number of the last word, so NF-5 gives the field number of the sixth word from the end. We use the sort and uniq commands to produce a list of users without duplication.
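The field arithmetic can be checked on a single log line:

$ echo "sshd[21197]: Failed password for bob1 from 203.83.248.32 port 50035 ssh2" | awk '{ print $(NF-5) }'
bob1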

Now we should collect the failed login log lines containing the name of each user. A for loop is used to read the lines corresponding to each user, and these lines are written to a temporary file. The first 16 characters in each log line form the timestamp, which is extracted with the cut command. Once we have all the timestamps for failed login attempts for a user, we check the difference in time between the first attempt and the last attempt. The first log line corresponds to the first attempt and the last log line to the last attempt; we use head -1 to extract the first line and tail -1 to extract the last line. This gives us timestamps for the first (tstart) and last (tend) attempts in string format. Using the date command, we can convert a date in string representation to total seconds in UNIX Epoch time.

The variables start and end hold the times in seconds corresponding to the start and end timestamps. Now, we take the difference between them and check whether it exceeds two minutes (120 seconds); if it does, the user is termed an intruder and the corresponding entry, with details, is produced as a report line. The IP addresses are extracted from the log by using a regular expression for IP addresses with the egrep command. The number of attempts is the number of log lines for the user, which is found with the wc command. The hostname mapping is extracted from the output of the host command, run with the IP address as its argument. The time range is printed using the timestamps we extracted. Finally, the temporary files used by the script are removed.

The above script is aimed only at illustrating a model for scanning the log and producing a report from it. To keep it small and simple, it leaves out some complexity, so it may have a few bugs. You can improve the script with better logic.

Finding out active user hours on a system

Consider a web server with shared hosting. Many users log in to and log out of the server every day, and the user activity gets logged in the server's system log. This recipe is a practice task: use the system logs to find out how many hours each user has spent on the server, and rank the users according to total usage hours. A report should be generated with details such as rank, user, first login date, last login date, number of logins, and total usage hours. Let's see how we can approach this problem.

Getting ready

The last command is used to list details about the login sessions of users on a system. The log data is stored in the /var/log/wtmp file. By adding up the session durations for each user individually, we can find out the total usage hours.

How to do it…

Let's go through the active_users.sh script, which finds the active users and generates the report:

#!/bin/bash
#Filename: active_users.sh
#Description: Reporting tool to find out active users
log=/var/log/wtmp
if [[ -n $1 ]];
then
  log=$1
fi

printf "%-4s %-10s %-10s %-6s %-8s\n" "Rank" "User" "Start" "Logins" "Usage hours"
last -f "$log" | head -n -2 > /tmp/ulog.$$
cat /tmp/ulog.$$ | cut -d' ' -f1 | sort | uniq > /tmp/users.$$
(
while read user;
do
  grep "^$user " /tmp/ulog.$$ > /tmp/user.$$
  seconds=0
while read t
 do
   s=$(date -d "1970-01-01 $t UTC" +%s 2> /dev/null)
   [ -n "$s" ] && let seconds=seconds+s
 done< <(cat /tmp/user.$$ | awk '{ print $NF }' | tr -d ')(')
 firstlog=$(tail -n 1 /tmp/user.$$ | awk '{ print $5,$6 }')
 nlogins=$(cat /tmp/user.$$ | wc -l)
 hours=$(echo "scale=2; $seconds / 3600" | bc)
 printf "%-10s %-10s %-6s %-8s\n" $user "$firstlog" $nlogins $hours
done< /tmp/users.$$
) | sort -nrk 5 | awk '{ printf("%-4s %s\n", NR, $0) }'
rm /tmp/users.$$ /tmp/user.$$ /tmp/ulog.$$
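The script can be run as is, or with an alternative wtmp file as its argument (wtmp.1 here stands for a rotated copy of the log and is just an illustrative name):

$ ./active_users.sh
$ ./active_users.sh /var/log/wtmp.1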

How it works…

In the active_users.sh script, we can either provide the wtmp logfile as a command-line argument or it will use the default wtmp logfile. The last -f command is used to print the logfile contents. The first column in the output is the user name, which we extract using cut. Then the unique users are found using the sort and uniq commands. Now, for each user, the log lines corresponding to their login sessions are found using grep and written to a temporary file. The last column in each of these log lines is the duration of the session. Hence, in order to find the total usage hours for a user, the session durations are to be added up. The usage duration is in (HOUR:MIN) format, and it is converted into seconds using the date command.
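The conversion trick in the script can be checked on its own; interpreting the duration against the epoch yields the number of seconds it represents (3 hours and 35 minutes is 12900 seconds):

$ date -d "1970-01-01 03:35 UTC" +%s
12900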

In order to extract the session duration for each login, we use the awk command; tr -d is used to remove the parentheses. The list of duration strings is passed to the standard input of the while loop using the < <(COMMANDS) process substitution operator, which acts like a file input. Each duration string is converted into seconds using the date command and added to the variable seconds. The first login time for a user appears on the last line and is extracted from there. The number of logins is the number of log lines. In order to calculate each user's rank by total usage hours, the records are sorted in descending order with usage hours as the key: the -nr options tell sort to perform a numeric sort in reverse order, and -k 5 specifies the key column (usage hours; it is the fifth whitespace-separated field, since the first-login date occupies two fields). Finally, the output of sort is passed to awk, which prefixes a line number to each line, and that number becomes each user's rank.
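Process substitution deserves a quick illustration of its own; the output of a command is fed to a while loop as if it were a file:

$ while read line; do echo "Line: $line"; done < <(printf "a\nb\n")
Line: a
Line: b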

Summary

In this article, we covered recipes on logging techniques and their usage on a Linux system.


About the Author


Sarath Lakshman

Sarath Lakshman is a 23 year old who was bitten by the Linux bug during his teenage years. He is a software engineer working in the ZCloud engineering group at Zynga, India. He is a life hacker who loves to explore innovations. He is a GNU/Linux enthusiast and hacktivist of free and open source software. He spends most of his time hacking with computers and having fun with his great friends. Sarath is well known as the developer of SLYNUX (2005), a user-friendly GNU/Linux distribution for Linux newbies. The free and open source software projects he has contributed to include the PiTiVi video editor, the SLYNUX GNU/Linux distro, Swathantra Malayalam Computing, School-Admin, Istanbul, and the Pardus Project. He has authored many articles for the Linux For You magazine on various domains of FOSS technologies. He has also contributed to several open source projects through multiple Google Summer of Code projects. Currently, he is exploring his passion for scalable distributed systems in his spare time. Sarath can be reached via his website http://www.sarathlakshman.com.
