SpamAssassin: A practical guide to integration and configuration

Product type Book
Published in Sep 2004
Publisher Packt
ISBN-13 9781904811121
Pages 240 pages
Edition 1st Edition

Chapter 13. Improving Filtering

SpamAssassin has a high spam detection rate, but despite this, some spam emails always escape detection. Conversely, legitimate emails are sometimes marked as spam.

This chapter looks at whitelists and blacklists—techniques for spam filtering that mark known good and bad senders. We then discuss the situation where emails have been wrongly classified, and how to resolve this by altering scoring on rules. Finally, we discuss filtering out certain foreign languages and character sets as a method of reducing spam.

Whitelists and Blacklists


SpamAssassin works very well at detecting spam, but there is always a risk of false positives or false negatives. By using a list of email addresses that are known spam producers (a blacklist), email from spammers who consistently use the same email addresses or domains can be filtered out. With a list of email addresses that are legitimate email senders (a whitelist), emails from regular or important correspondents are guaranteed to be filtered as ham. This prevents the delay or non-delivery of important emails that may otherwise be marked as spam.

Blacklists that list individual email addresses have limited use, because spammers normally use different or random email addresses for each spam run. However, some spammers use the same domain for multiple runs. Because SpamAssassin allows wildcards in its blacklist entries, entire domains can be blacklisted, which is more useful for filtering out spam.
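
As a rough illustration of manual entries, the fragment below uses the whitelist_from and blacklist_from directives with file-glob style wildcards. The addresses and domains are placeholders chosen for this sketch, not values from the book, and the fragment assumes it is placed in a site-wide local.cf or a per-user user_prefs file.

```
# local.cf (site-wide) or user_prefs (per user)
# Placeholder addresses; substitute real correspondents and domains.

# Favor mail from one known correspondent.
whitelist_from  alice@example.com

# Favor mail from every address at a trusted domain.
whitelist_from  *@example.org

# Penalize every address at a domain used for repeated spam runs.
blacklist_from  *@example.net
```

Because sender addresses are easy to forge, broad whitelist wildcards are probably best kept to domains that are genuinely trusted.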

SpamAssassin uses a manual blacklist and whitelist, and also manages an automatic whitelist...

The Auto-Whitelist


SpamAssassin manages an automatic whitelist (AWL). It actually functions as both an auto-blacklist and an auto-whitelist. Generally, an auto-blacklist is ineffective as spammers rarely use the same email address for any period of time. However, SpamAssassin tracks both the IP address of email sources and the email addresses used, adding to its effectiveness.

The auto-whitelist keeps a record of the SpamAssassin scores for emails from senders. Senders that only send ham emails receive a weighting towards ham by the auto-whitelist. If they later send an email marked as spam, then the SpamAssassin score for the new email will be adjusted downwards, due to their past behavior as a source of only ham emails. The converse is true of those who normally send spam.

The AWL works by adjusting the score of the email being processed towards the average of all previous emails received from that sender. The amount or strength of this adjustment can be altered using the auto_whitelist_factor...
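
The adjustment towards the sender's average can be sketched in a few lines of Python. This is a simplified model, not SpamAssassin's own code: the function name awl_adjusted_score is hypothetical, and the formula (move the score towards the mean by a fraction given by the factor, with 0.5 as the default) follows the description of auto_whitelist_factor in the SpamAssassin documentation.

```python
def awl_adjusted_score(score: float, sender_mean: float, factor: float = 0.5) -> float:
    """Move `score` towards `sender_mean` by the fraction `factor`.

    Hypothetical helper for illustration only.
    factor = 0 leaves the score unchanged; factor = 1 uses the mean outright.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return score + (sender_mean - score) * factor


# A sender whose earlier mail averaged 1.0 sends a message that scores 8.0.
print(awl_adjusted_score(8.0, 1.0))        # 4.5: halfway between 8.0 and 1.0
print(awl_adjusted_score(8.0, 1.0, 0.0))   # 8.0: the history is ignored
```

With the default factor, an occasional spam-like message from a reliable sender is pulled halfway back towards that sender's history, which may be enough to keep it below the spam threshold.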

Resolving Incorrect Classifications


The consequences of an email being wrongly classified can range from a minor inconvenience to a major catastrophe. If a spam email is marked as ham, then the recipient will only spend a few seconds removing it from their inbox. If an unimportant ham email is marked as spam, it may be no great problem. However, marking an important email as spam could be embarrassing at the least, and could well result in serious consequences, such as financial loss.

Consequently, it is worthwhile making a backup of emails marked as spam before purging them. Emails compress well, so it may be possible to keep this backup on local disk storage rather than tape or other removable media. When it becomes apparent that an email has gone missing, the archive of spam can be searched and the email retrieved.
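
One possible way to keep such a backup is to write a compressed copy of the spam folder before it is purged. The sketch below assumes the spam is stored in a single mbox-style file; the function name archive_spam and the paths in the usage note are hypothetical, and only the Python standard library is used.

```python
import gzip
import shutil
from pathlib import Path


def archive_spam(mbox_path: str, archive_dir: str) -> Path:
    """Write a gzip-compressed copy of `mbox_path` into `archive_dir`.

    Hypothetical helper for illustration; returns the path of the archive.
    The original file is left untouched so it can be purged separately.
    """
    source = Path(mbox_path)
    target_dir = Path(archive_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / (source.name + ".gz")
    # Stream the file through gzip so large folders are not read into memory.
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

A call such as archive_spam("/var/mail/spam", "/var/backups/spam") would leave a spam.gz file that can later be searched with tools such as zgrep when a missing email needs to be recovered.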

Due to the potential cost of an incorrect classification, users should be encouraged to review any spam they have received on a regular basis. Any false positives should be brought...

Character Sets and Languages


SpamAssassin can detect certain languages and character sets. Both language and character set information are added by email clients when emails are composed and sent, so that the receiving email client can display the message correctly. There are many languages and character sets in use. If received messages are expected or known to use only some of them, then the others can be filtered out.

Disallowing Languages

SpamAssassin detects languages by using email headers. There is a large list of languages that SpamAssassin can detect; these are listed in the documentation for Mail::SpamAssassin::Conf. Use the man or perldoc commands to view the documentation:

$ perldoc Mail::SpamAssassin::Conf
$ man Mail::SpamAssassin::Conf

Many man and perldoc implementations use the / key to search for text. Once the page is displayed, enter /ok_languages to locate the correct part in the documentation. Press the space bar to scroll forward through the documentation.
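
For orientation, the ok_languages directive found by that search takes a space-separated list of two-letter language codes. The fragment below is a minimal sketch that assumes a site expecting mail only in English and French; the choice of languages is an example, not a recommendation from the book.

```
# local.cf or user_prefs
# Treat only English and French as expected languages.
# Mail detected as another language receives additional spam points.
ok_languages en fr
```

Depending on the installed version, language detection may need to be enabled separately before this setting has any effect, so it is worth checking the documentation that ships with the installation.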

Once the languages...

Summary


SpamAssassin allows the user and system administrator to improve detection of spam using a number of techniques. Whitelists help to prevent false positives. SpamAssassin's auto-whitelist will prevent an occasional spam-like email from an otherwise spam-free correspondent from being filtered as spam.

The spam threshold of whitelists can be altered to reduce false-positive or false-negative emails, and individual rule scores can be altered to prevent incorrect classifications. Character sets and languages can be used to filter out spam from other countries or regions.
