% | A literal % character.
sn | Unique sequence number per log line entry.
err_code | The ID of an error response served by Squid or a similar internal error identifier.
err_detail | Additional err_code dependent error information.
>a | Client's source IP address.
>A | Client's FQDN (Fully Qualified Domain Name).
>p | Client's source port.
<A | Server's IP address or peer name.
la | Local IP address of the Squid proxy server.
lp | Local port number on which Squid is listening.
<lp | Local port number of the last server or peer connection.
ts | Seconds since Unix epoch.
tu | Sub...
Time for action – customizing the access log with a new log format
Squid has a lot of information about every client request and reply. However, it writes only the requested information to the log file, and we can customize what is written by defining our own log formats.
Now, let's define a log format in which the time will appear in a human-readable format and use it with access_log
:
So, we have constructed a new log format that will log the information we are most interested in. Let's see a few log messages in the preceding format:
Now the time in the log messages is human-readable and we can therefore tell when a particular URL was accessed.
We should note that if we are using custom formats for access...
Selective logging of requests
Sometimes we may not want to log requests from certain clients. For example, a team may be working on a highly secret project and we don't want to leave any trace of their browsing patterns anywhere.
Logging of requests can be controlled using two directives, namely, log_access
and access_log
. These directive names are easy to confuse, but the order of the words in each name tells us what it does. The directive access_log
controls the format of the log messages and the location where they are written, whereas the directive log_access
controls whether a particular request is logged at all.
We have already learned about the log_access
directive in the Log Access section in Chapter 2, Configuring Squid. Now, we will learn about using the access_log
directive to log requests selectively.
Time for action – using access_log to control logging of requests
As we have seen in a previous section of this chapter, the syntax of the access_log
directive is as follows:
So, here we have the option to specify a list of ACLs, which we can use to control where the different requests will be logged, if at all. Let's consider a scenario where we don't want to log requests to Yahoo! servers, we want to log requests to Google and Facebook servers to separate files, and all other requests should go to the access log. This scenario can be realized with the following configuration:
If we look at the configuration carefully, we are denying...
When a client clicks a link to other.example.com
on the website example.com
, then the website example.com
is the referrer and the client is referred to the website other.example.com
. When a client is referred by a website, the HTTP client sends the HTTP header referer
with the request. Squid has the ability to log referer
HTTP headers, which can later be used for analyzing traffic patterns.
Note
"Referer" is actually a misspelling of the word "Referrer", but it has been officially specified that way in HTTP RFCs.
Time for action – enabling the referer log
By default, there is no referer log. We can enable the referer log using the access_log
directive in combination with a custom log format. To generate the referer log, first of all, we need to create a log format as shown:
This configuration defines a new log format called referer
, which contains a request timestamp, IP address of the client, the referer URL, and the request URL. Now, we need to use the access_log
directive with the log format we have just defined, as shown:
Now, let's look at a few lines from the referer log file:
The referer log is a bit easier to understand. The first column is the time elapsed since epoch...
Time for action – translating the referer logs to a human-readable format
We can translate a referer log to a human-readable format by using the command line utility awk
. We can convert the entire referer.log
file to a human-readable format by using the following command sequence:
The log messages from referer.log
, as shown, should look like the following messages after conversion:
The preceding command works fine for converting the entire log file, but it is not useful if we want to watch the live referer log with human-readable timestamps. To achieve this, we can use the following command:
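The same conversion can also be sketched in Python. This is only an illustration, not the awk command used in this section: the function name humanize_line and the sample line are made up, and the sketch assumes that each log line starts with an epoch timestamp followed by a space, as in the referer format described earlier.

```python
from datetime import datetime, timezone


def humanize_line(line: str) -> str:
    """Replace the leading epoch timestamp of a log line with a readable date."""
    text = line.rstrip("\n")
    stamp, sep, rest = text.partition(" ")
    try:
        # UTC keeps the output independent of the machine's time zone.
        when = datetime.fromtimestamp(float(stamp), tz=timezone.utc)
    except (ValueError, OverflowError, OSError):
        # Not a usable timestamp (blank or malformed line): leave it unchanged.
        return text
    return when.strftime("%Y-%m-%d %H:%M:%S") + sep + rest


# Hypothetical log line: timestamp, client IP, referer URL, request URL.
sample = "1262304000.250 192.0.2.10 http://example.com/ http://other.example.com/"
print(humanize_line(sample))
```

To follow a live log, the same function could be applied to each line as it arrives, for example by reading standard input while another program feeds it the growing file.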
All requests from clients generally contain the User-Agent
HTTP header, which is basically a formatted string describing the HTTP client being used for the current request. As Squid knows everything about the requests, it can log this HTTP header field to the log file defined by the useragent_log
directive in the Squid configuration file.
Time for action – enabling user agent logging
By default, the user agent log is disabled and we can enable it by using the following line in our configuration file:
Once we have the user agent log enabled, Squid will start logging the User-Agent
HTTP header field from the requests, depending on the availability of the field. Let's see a few lines from an example user agent log:
The format of this file is quite simple and only the last column, representing the user agent, is of interest here. The user agent log can be used to analyze the popular web browsers on a network.
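As a rough sketch of such an analysis, the following Python snippet tallies the user agents found in a list of log lines. It assumes that the user agent is the last column and is enclosed in double quotes; the helper names and the sample lines are hypothetical and should be adapted to the actual log file.

```python
from collections import Counter


def user_agent_of(line: str) -> str:
    """Return the quoted user agent at the end of a log line, or "-" if absent."""
    line = line.strip()
    start = line.find('"')
    if start == -1 or not line.endswith('"') or start == len(line) - 1:
        return "-"
    return line[start + 1:-1]


def count_user_agents(lines) -> Counter:
    """Tally how often each user agent string appears, skipping blank lines."""
    return Counter(user_agent_of(line) for line in lines if line.strip())


# Hypothetical log lines, made up for illustration.
sample_lines = [
    '192.0.2.10 [16/Jul/2010:08:46:40 +0000] "ExampleBrowser/1.0"',
    '192.0.2.11 [16/Jul/2010:08:46:41 +0000] "ExampleBrowser/1.0"',
    '192.0.2.12 [16/Jul/2010:08:46:42 +0000] "OtherClient/2.3 (Linux)"',
]
for agent, hits in count_user_agents(sample_lines).most_common():
    print(hits, agent)
```

Counting full user agent strings is the simplest approach. Grouping them by browser family would need extra parsing that is not shown here.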
We learned to enable logging of the User-Agent
HTTP header field from...
Emulating HTTP server-like logs
Squid has an optional feature that can generate log messages similar to those produced by most HTTP servers. We can use the access_log
directive to log messages with the log format common
.
Time for action – enabling HTTP server log emulation
By default, Squid generates a native log, which contains more information than the logs generated with HTTP log emulation on. To enable the emulation, we can use the following line in our configuration file:
This configuration will log messages in a web server-like format. Let's have a look at a few log messages in the HTTP server-like log format:
These log messages are similar to log messages generated by the famous open source web server Apache and many others.
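Because the layout follows the widely used common log format, such lines are easy to process with existing tools. The following Python sketch parses one line; the sample line is hypothetical, and the pattern assumes the usual order of host, ident, user, bracketed timestamp, quoted request, status, and size, while ignoring anything that follows the size.

```python
import re

# host ident user [timestamp] "request" status size, then optional extra fields.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line: str):
    """Return a dict of fields for a common-format line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" size means no body was sent; treat it as zero bytes.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


# Hypothetical log line, made up for illustration.
sample = ('192.0.2.10 - - [16/Jul/2010:08:46:40 +0000] '
          '"GET http://example.com/ HTTP/1.1" 200 1024 TCP_MISS:DIRECT')
print(parse_clf(sample))
```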
We learned to switch on the HTTP server-like log emulation of Squid access logs. Squid...
As time passes, the size of the log files increases rapidly and starts occupying more and more disk space. To overcome this problem of the accumulation of logs over time, we generally keep the logs for the previous one or two weeks. To remove old log messages and retain the recent ones, Squid has a built-in feature of log file rotation, which can move older log messages to separate files. Moreover, Squid stores the incremental copy of the storage index in a file swap.state
, which is also pruned down during log rotation.
To rotate logs, we run the squid
command with the -k rotate option (squid -k rotate).
This command will rotate logs depending on the value specified with the directive logfile_rotate
in the configuration file. The default value of logfile_rotate
is 10. This means that 10 older versions of all log files will be retained.
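To make the effect of logfile_rotate concrete, the following Python sketch models the renames that one rotation performs. It is a simplified model and not Squid's own code: it assumes that rotated copies carry numeric suffixes starting at .0 for the most recent copy, and the function name rotation_plan is hypothetical. It only computes names and does not touch any files.

```python
def rotation_plan(log_name: str, keep: int):
    """Return the (old_name, new_name) renames for one rotation, in order."""
    if keep <= 0:
        # Assumption: with nothing to keep, no numbered copies are made.
        return []
    renames = []
    # Shift numbered copies up by one, starting with the oldest, so that
    # no copy is overwritten before it has been moved.
    for index in range(keep - 1, 0, -1):
        renames.append((f"{log_name}.{index - 1}", f"{log_name}.{index}"))
    # Finally, the live log becomes copy number 0.
    renames.append((log_name, f"{log_name}.0"))
    return renames


for old_name, new_name in rotation_plan("access.log", 3):
    print(old_name, "->", new_name)
```

With keep set to 3, the copy numbered 2 is overwritten by the copy numbered 1, which is how the oldest messages eventually disappear.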
Have a go hero – rotate log files
Try to rotate log files on your proxy server and see how the log files are renamed.
Other log related features
We discussed important logging related directives in the previous sections. Squid has more directives related to logging, but they are less important and we should not have any problems in operating Squid normally, even if we are not aware of these features.
If we have disk caching enabled on our proxy server, Squid can log all of its disk caching related activities to a separate log file whose location is determined by the directive cache_store_log
. This log file contains information about the web objects being cached on the disk, stale objects being removed from the cache, and how long an object was in the cache. The information logged in this file is not particularly user-friendly. By default, logging of storage activity is disabled.
Consider the following configuration line:
Which log format will be used by Squid in accordance with the previous configuration?
common
squid
combined
squidmime...
In this chapter, we have learned to interpret several log files generated by Squid. We had a detailed look at the format codes that Squid uses to construct log messages and how we can construct custom log formats depending on the requirements.
Specifically, we looked at the cache log and the debug messages generated by Squid, and had a detailed overview of the access log and its format codes. We customized log messages using several log formats, selectively logged requests to various log files, and enabled the referer and user agent logs.
We also discussed rotating log files to prevent unnecessary wastage of disk space.
Now that we have learned about the various log files and log messages, we will go on to learn about using these messages to monitor our proxy server and analyze the performance of our cache, in the next chapter.