SpamAssassin: A practical guide to integration and configuration

Product type Book
Published in Sep 2004
Publisher Packt
ISBN-13 9781904811121
Pages 240 pages
Edition 1st Edition
Chapter 11. Network Tests

SpamAssassin on its own can detect a high proportion of spam. By using network tests, spam detection can be further improved. SpamAssassin includes support for Realtime BlockLists (RBLs) and Spam URI Realtime BlockLists (SURBLs). All these external services are easy to integrate into SpamAssassin.

The effectiveness of network tests varies from a 60% detection rate upwards. Used in conjunction with SpamAssassin's own tests, they raise spam detection rates considerably, typically to over 95%. However, network tests slow down spam detection: each SpamAssassin process takes longer to complete, so more processes run at the same time and memory usage on the email server increases.

This chapter describes the support SpamAssassin has for RBLs and SURBLs, and focuses on three external services:

  • Vipul's Razor

  • Pyzor

  • The Distributed Checksum Clearinghouse (DCC)

RBLs are blocklists of known sources of spam. By default, SpamAssassin uses a number of RBLs to check the source of the email.

A SURBL is a blocklist...

RBLs


A number of RBLs are enabled with the default configuration of SpamAssassin. These are defined in /usr/share/spamassassin/20_dnsbl_tests.cf. An example definition is shown here:

header RCVD_IN_NJABL eval:check_rbl('njabl', 'dnsbl.njabl.org.')
describe RCVD_IN_NJABL Received via a relay in dnsbl.njabl.org
tflags RCVD_IN_NJABL net

One set of definitions appears for each RBL configured. Rule definitions are explained in more detail in Chapter 12.
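The zone name passed to check_rbl in the example is the DNS zone that is queried. As a rough illustration of the usual DNS blocklist convention (not SpamAssassin's own code), the client reverses the octets of the relay's IP address and prepends them to the zone; a listed address resolves, an unlisted one does not. The function name `dnsbl_query_name` and the sample address 192.0.2.99 are assumptions made for this sketch, which only builds the name and performs no lookup:

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Return the DNS name a client would look up to test `ip` against `zone`."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    # DNS blocklists index addresses with the octets in reverse order.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.strip('.')}."


print(dnsbl_query_name("192.0.2.99", "dnsbl.njabl.org."))
# 99.2.0.192.dnsbl.njabl.org.
```

Because every lookup is a DNS query, a slow or unreachable blocklist delays the whole scan, which is why these rules are grouped as network tests.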

All the rules include a line that sets tflags to net. This groups the rules as network tests, and allows SpamAssassin to treat them as a group. There are two main reasons for this. The first is that network tests may take a long time to complete, especially at busy times. SpamAssassin uses a timeout for network tests, but it also applies this timeout in a progressive manner. If most of the network tests have completed, SpamAssassin will not wait for the last tests to complete. Specific details are given in the Mail::SpamAssassin::Conf man page...

SURBLs


Spam URI Realtime BlockLists are a relatively recent technique, and SpamAssassin 3.0 supports only a small number of them. SURBLs are configured much like RBLs. SpamAssassin 2.63 can use a different plug-in, described later. Details on SURBLs can be found at http://www.surbl.org.

The SURBLs are defined in /usr/share/spamassassin/25_uribl.cf. An example definition is shown below:

uridnsbl URIBL_SBL sbl.spamhaus.org. TXT
header URIBL_SBL eval:check_uridnsbl('URIBL_SBL')
describe URIBL_SBL Contains a URL listed in the SBL blocklist
tflags URIBL_SBL net

One set of definitions appears for each SURBL configured.

As with RBLs, the SURBL rules set the tflags to net, to enable timeouts to be used, and to enable the rules to be switched on and off together.
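Where an RBL is keyed on the address of the sending relay, a URI blocklist is keyed on the sites advertised in the message body. The following is a minimal sketch of that idea, not the plug-in's implementation: it pulls host names out of http and https links and builds one lookup name per domain. The function name `uri_query_names` and the zone used in the usage note are assumptions, and treating the last two labels of a host as its registered domain is a deliberate simplification that is wrong for suffixes such as co.uk.

```python
import re
from typing import List
from urllib.parse import urlsplit

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def uri_query_names(body: str, zone: str) -> List[str]:
    """Return sorted, de-duplicated lookup names for domains found in `body`."""
    names = set()
    for url in URL_PATTERN.findall(body):
        try:
            host = (urlsplit(url).hostname or "").lower()
        except ValueError:
            continue  # malformed URL, nothing to look up
        labels = [label for label in host.split(".") if label]
        if len(labels) < 2 or labels[-1].isdigit():
            continue  # skip bare host names and numeric IP addresses
        # Simplification: assume the last two labels form the registered domain.
        domain = ".".join(labels[-2:])
        names.add(f"{domain}.{zone.strip('.')}.")
    return sorted(names)
```

For example, calling `uri_query_names(body, "multi.surbl.org")` on a body that links to both www.example.com and shop.example.com yields the single name example.com.multi.surbl.org., so each advertised domain is queried only once per message.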

SURBLs are implemented as a SpamAssassin plug-in. Plug-ins allow SpamAssassin to be extended with new types of tests and rules without changing SpamAssassin itself. To be enabled, the plug-in must be loaded. On SpamAssassin version 3...

Vipul's Razor


Perl is required to install and use Vipul's Razor. This will already be installed as SpamAssassin uses it. A C compiler is also required, except on Debian Linux, for which a binary package is available.

To operate, Razor requires a constant internet connection. The Razor communication uses TCP port 2703, and Razor also uses TCP pings on port 7 to determine which servers are closest, so firewalls will have to be configured to enable these ports.
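Before installing Razor, it can be worth confirming that the firewall really does allow outgoing connections on TCP port 2703. The sketch below is one way to do that from the mail server; the function name `tcp_port_open` is made up for this example, and the host name in the usage comment is a placeholder rather than a real Razor server.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (placeholder host name, substitute a real server):
# tcp_port_open("razor-server.example.net", 2703)
```

A result of False does not say whether the firewall, the network, or the remote server is at fault, so it is a first check rather than a diagnosis.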

Installing Razor

There are no mainstream RPM packages available for Razor. However, Razor is available in Gentoo and Debian Linux. To install in Gentoo, use the emerge razor command, and to install in Debian, use apt-get install razor. On other Linux distributions and UNIX variants it can be installed from source.

Razor is available for download from the home page at http://razor.sourceforge.net/. Razor is not available via CPAN. Two packages are available, razor-agents and razor-agents-sdk. Both packages should be downloaded. The razor-agents...

Pyzor


Pyzor is written in Python, and so the Python language needs to be installed. This is included with most modern Linux distributions and is available for other operating systems including AIX, Solaris, and HP-UX. Pyzor source is packaged in a tar.bz2 file, using the bzip2 compression scheme. A bunzip2 program is required, and is installed on most Linux distributions. Binary bunzip2 utilities for other UNIX-like operating systems can be downloaded from the Internet.

Note

Pyzor uses UDP port 24441 for communicating with a server, so any firewall must be configured to allow outgoing traffic on that port.

Installing Pyzor

Pyzor is available in RPM format only for Mandrake Linux. The rpm -i command can be used to install the RPM once it is downloaded. Packages are also available for Gentoo Linux and Debian Linux. Use emerge pyzor or apt-get install pyzor respectively.

For all other distributions and operating systems, Pyzor should be installed from source. Pyzor can be downloaded from the Pyzor website...

DCC


Although the correct term is 'The Distributed Checksum Clearinghouse', it is referred to as DCC here to enhance readability. DCC is the most effective network service, but also the most complex.
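The name describes the idea: clients send checksums of the messages they receive to a server, and the server counts how often each checksum has been seen, so a high count suggests bulk mail. The toy model below illustrates that counting idea only. Everything in it is an assumption made for the example: the class `ToyClearinghouse`, the whitespace and case normalization, the use of SHA-256, and the threshold of 3. The real DCC uses its own fuzzy checksums and network protocol.

```python
import hashlib
import re
from collections import Counter


def body_checksum(body: str) -> str:
    """Hash a message body after collapsing whitespace and letter case."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ToyClearinghouse:
    """In-memory stand-in for a checksum server (hypothetical, not DCC's protocol)."""

    def __init__(self) -> None:
        self._counts = Counter()

    def report(self, body: str) -> int:
        """Record one sighting of `body` and return the total sightings so far."""
        digest = body_checksum(body)
        self._counts[digest] += 1
        return self._counts[digest]


def looks_bulk(sightings: int, threshold: int = 3) -> bool:
    """Treat a message as bulk once it has been seen `threshold` times."""
    return sightings >= threshold


server = ToyClearinghouse()
for copy in ("Cheap  pills NOW", "cheap pills now", "Cheap pills\nnow"):
    count = server.report(copy)
print(count, looks_bulk(count))
# 3 True
```

Bulk mail is not necessarily spam, since legitimate mailing lists also send identical messages to many recipients, so a count on its own is only one signal among several.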

DCC is written in C. To build from source (binary packages are rare) a C compiler is required. DCC uses UDP port 6277 to communicate with servers, so this should be enabled through any firewall in use.

Installing DCC

DCC is available in RPM format for Mandrake, but for no other RPM-based distribution. Use the rpm -i command to install it. DCC is available in Gentoo Linux and Debian Linux; use emerge net-mail/dcc under Gentoo, and apt-get install dcc-client in Debian. For other distributions and versions of UNIX, DCC should be installed from source.

The source for DCC can be downloaded from http://www.dcc-servers.net/dcc/.

The source is packaged as a tar file. Unpack this and then run the configure script. This script will automatically detect any required software libraries or report any that are missing...

Spamtraps


A spamtrap is an email address that has never been associated with a real person or role in a company. The spamtrap is placed on web pages in such a way that it can only be picked up by spammer web spiders. When email is received at the spamtrap address, it can only be spam, and so the email can be sent to the Razor network as definite spam.

Normally a spamtrap is hidden from view (by using a tiny font, by hiding the email address behind another element of the page, by using the same color for the text and the background, or by another technique). The spammer's web spider will nevertheless detect the email address and add it to its database of valid email addresses.

Spamtraps can also be added to postings on Usenet, as long as it is made clear that the email address should not be used for real replies.

Choosing a Spamtrap Address

A spamtrap address should be made of completely random characters. Using an address such as info@domain.com, contact@domain.com, or other popular generic addresses...
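The recommendation to use completely random characters is easy to follow with a short script. This is a minimal sketch; the function name `random_spamtrap`, the default length of 16, and the restriction to lowercase letters and digits are choices made for the example, not requirements from the text.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def random_spamtrap(domain: str, length: int = 16) -> str:
    """Return an address with a random local part of `length` characters."""
    if length < 1:
        raise ValueError("length must be at least 1")
    local_part = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{local_part}@{domain}"
```

Calling `random_spamtrap("domain.com")` returns a different address each time, so the result should be recorded when the spamtrap is created.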

Summary


Network tests allow a site to benefit from other sites reporting email relays and spam-advertised websites. SpamAssassin includes support for RBLs and SURBLs. The latter provide a promising new technique against spam, which works by detecting the URIs that are advertised in spam emails. Razor, Pyzor, and DCC are email comparison systems. DCC is considered the most effective. These tests can be used together, and most settings are configurable on a site-wide or per-user basis.
