SpamAssassin: A practical guide to integration and configuration

Product type Book
Published in Sep 2004
Publisher Packt
ISBN-13 9781904811121
Pages 240 pages
Edition 1st Edition

Chapter 12. Rules

Rules are the building blocks of SpamAssassin. Every test performed on a message is based upon a rule, with an associated score. Rules were covered briefly in Chapter 7.

User-definable rules are based on a Perl regular expression, also called a regex. Some knowledge of regexes is required to write new rules. There are many good sources of information on Perl regular expressions. The standard Perl distribution contains a quick-start document on regular expressions, a longer regular expressions tutorial, and a syntax definition. To access these, use the perldoc command:

$ perldoc perlrequick
$ perldoc perlretut
$ perldoc perlre

Other sources of regular expression material include most beginners' books on Perl. An Internet search for 'Perl regular expressions tutorial' will bring up many suitable pages. No prior knowledge of Perl regexes is assumed in this book.

There are several different types of rules in SpamAssassin:

Writing Rules


The least complex rules are the body and header rules. Meta rules are more complex and are described later in the chapter.

All rules must implement a Perl regex. A defined rule is run unless its score is set to 0. The default score for a rule is 1.0. Rules whose names begin with T_ are test rules, and SpamAssassin gives these a default score of 0.01. Rule names should be 22 characters or fewer. By convention, rule names are in uppercase.

A rule must also have a description. The describe configuration directive is used for this.
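
As a minimal sketch of these requirements, the following lines define a body rule, describe it, and give it an explicit score. The rule name LOCAL_DEMO_PHRASE, the phrase it matches, and the score of 0.5 are illustrative assumptions, not rules shipped with SpamAssassin:

```
# Hypothetical example rule: the name, pattern, and score are illustrative only.
# Matches the words "example phrase" in the message body, ignoring case.
body      LOCAL_DEMO_PHRASE  /\bexample phrase\b/i
describe  LOCAL_DEMO_PHRASE  Body contains the words "example phrase"
score     LOCAL_DEMO_PHRASE  0.5
```

If the score line were omitted, the rule would receive the default score of 1.0. Setting the score to 0 would stop the rule from being run.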

Rules should be placed in a file with the extension .cf in /etc/mail/spamassassin. Users can define their own rules only if allow_user_rules is set in /etc/mail/spamassassin/local.cf:

allow_user_rules 1

User-defined rules are placed in ~/.spamassassin/user_prefs. Rules can be developed using a user account. Once a rule is tested and scored, it can be moved to the site-wide configuration.
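
While a rule is under development in a user account, it can be given a name beginning with T_ so that it receives the low default test score. The following sketch shows a header rule as it might appear in ~/.spamassassin/user_prefs. The rule name T_LOCAL_DEMO_SUBJ and the Subject pattern are hypothetical:

```
# Hypothetical test rule: matches "example offer" in the Subject header.
# The T_ prefix marks it as a test rule, so no score line is given here.
header    T_LOCAL_DEMO_SUBJ  Subject =~ /\bexample offer\b/i
describe  T_LOCAL_DEMO_SUBJ  Subject contains the words "example offer"
```

Once the rule has been tested, it could be renamed without the T_ prefix, given a score, and moved to a .cf file in /etc/mail/spamassassin.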

Rules can be written to search for single words...

Using Other Rulesets


Many other rulesets exist and are published on the Internet. These are often themed, for example, detecting invalid HTML, common URLs used in spam, or catching sequences of characters often used in spam emails. Custom rulesets are often updated frequently in response to changes in spam being sent.

An Internet search for 'spamassassin rulesets' will return many pages linking to rulesets. The SpamAssassin wiki (a collaborative information website) links to many custom rulesets at http://wiki.apache.org/spamassassin/CustomRulesets.

Custom rulesets may have specific installation instructions that should be read and followed. In general, installation involves copying a rule file and a score file into /etc/mail/spamassassin/ and restarting spamd.

The more rules SpamAssassin uses, the more resources are needed to process each email. System performance may degrade if too many custom rulesets are used. Performance issues are covered in Chapter 14.

Summary


This chapter discussed rules, the building blocks of SpamAssassin. SpamAssassin allows the user to define rules to respond to the spam a site is currently receiving. There are a variety of rule types for processing different parts of the email.

User-defined rules are based on Perl regular expressions. Rule scoring is an important part of SpamAssassin filtering techniques.

Using a corpus and calculating the effectiveness of rules can assist in re-scoring rules to improve filtering. Several custom rulesets can be added to a site, and these should be frequently updated.


Rule Type    Description
body         A rule that searches...