SpamAssassin: A practical guide to integration and configuration

Product type Book
Published in Sep 2004
Publisher Packt
ISBN-13 9781904811121
Pages 240 pages
Edition 1st Edition

Table of Contents (24) Chapters

SpamAssassin
Credits
About the Author
About the Reviewers
1. Introduction
1. Introducing Spam
2. Spam and Anti-Spam Techniques
3. Open Relays
4. Protecting Email Addresses
5. Detecting Spam
6. Installing SpamAssassin
7. Configuration Files
8. Using SpamAssassin
9. Bayesian Filtering
10. Look and Feel
11. Network Tests
12. Rules
13. Improving Filtering
14. Performance
15. Housekeeping and Reporting
16. Building an Anti-Spam Gateway
17. Email Clients
18. Choosing Other Spam Tools
Glossary

Chapter 15. Housekeeping and Reporting

Once SpamAssassin is installed and configured, it operates well with little or no intervention. Even so, a busy system administrator will be keen to automate every aspect of system operations and make life easier for users. This chapter describes some further filters and some scripts that can be run on a regular schedule.

Separating Levels of Spam


Spam does not need to be saved on the server, except as a corpus for training the Bayesian database and for score regeneration. Generally, spam emails are stored so that users can reclaim any false positives. If auto-learning is used, you can also use these stored spam emails to ensure that false positives have not been learned as spam. This involves checking the folder of spam on a daily or weekly basis.

One technique to lower the number of spam emails to be examined is to divide them into two folders: one for high-scoring spam emails and another for comparatively low-scoring spam emails. False positives are unlikely to be in the high-scoring category, so the user need not examine emails in this folder.

This filtering can be effected using a Procmail recipe. The X-Spam-Level header contains a number of asterisks to indicate the score of the email. Emails that score between one and two get one asterisk, while emails that score between 12...
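
The asterisk-counting idea can be illustrated outside Procmail as well. The following is a minimal Python sketch, not the Procmail recipe the text refers to. It assumes that SpamAssassin has added both an X-Spam-Flag header and an X-Spam-Level header; the threshold of 10 asterisks and the folder names are hypothetical values chosen only for illustration.

```python
from email import message_from_string

# Hypothetical cut-off: flagged messages with at least this many asterisks
# are treated as high-scoring. Choose a value that suits your own site.
HIGH_SCORE_STARS = 10


def spam_stars(raw_message: str) -> int:
    """Return the number of asterisks in X-Spam-Level (0 if the header is absent)."""
    msg = message_from_string(raw_message)
    return (msg.get("X-Spam-Level") or "").count("*")


def choose_folder(raw_message: str, threshold: int = HIGH_SCORE_STARS) -> str:
    """Pick a destination folder; the folder names are illustrative only."""
    msg = message_from_string(raw_message)
    flagged = (msg.get("X-Spam-Flag") or "").strip().upper() == "YES"
    if not flagged:
        return "inbox"
    return "spam-high" if spam_stars(raw_message) >= threshold else "spam-low"
```

With this split, only the folder of low-scoring spam needs to be checked for false positives.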

Detecting When SpamAssassin Fails


SpamAssassin is very reliable in most circumstances. When SpamAssassin is used as a daemon, email clients call spamc. If the spamd daemon is not running, spamc will not tag email, and spam will be delivered to users' mailboxes. If the email solution relies on SpamAssassin, we should regularly confirm that spamd is running, because one common reason for a service outage is that the daemon has stopped. Daemons can be tested by connecting to the port that they listen on. This involves writing a test client or using an existing client in test mode, and the approach can be complex. A simpler solution is to check that the daemon appears among the processes running on the system.
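
Both checks described above can be sketched in a few lines of Python. This is an illustrative sketch, not the script used in the book. It assumes that spamd listens on its default port, 783, and that the process appears under the name spamd in the output of `ps -e -o comm=`; on some platforms the process may be listed under the interpreter's name instead, so verify this on your own system.

```python
import socket
import subprocess

SPAMD_PORT = 783  # spamd's default port; change it if spamd runs on another port


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def process_listed(ps_output: str, name: str = "spamd") -> bool:
    """Return True if any line of `ps -e -o comm=` style output names the process."""
    return any(
        line.strip().split("/")[-1] == name for line in ps_output.splitlines()
    )


def spamd_is_running() -> bool:
    """Check the process table for spamd (assumes a POSIX `ps` is available)."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="], capture_output=True, text=True, check=False
    )
    return process_listed(result.stdout)
```

A script like this could be run regularly from cron and send an alert when `spamd_is_running()` or `port_is_open("localhost", SPAMD_PORT)` returns False.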

Large companies may use products like IBM's Tivoli or HP's OpenView for systems management, and these can be extended to watch the appropriate processes and send an alert in one of many ways. For smaller companies, the cost of such products is prohibitive. One inexpensive...

Spam and Ham Reports


There may be a need to supply statistics on email processing. This might be necessary to justify the time invested in email administration. Alternatively, the aim may be to chart trends in general email use and in the proportions of ham and spam.

One basic statistic is the number of emails processed. By subtotaling both ham and spam, and then creating per-user statistics, a good representation of the dynamics of the email in a corporation can be built up.
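
As an illustration of per-user subtotals, the following Python sketch counts ham and spam per user from spamd log lines. The line format shown in the comments is an assumption about how spamd reports each result; check it against your own logs and adjust the pattern if necessary. The function name is hypothetical.

```python
import re
from collections import Counter

# Assumed shape of the relevant log lines (verify against your own logs):
#   ... identified spam (15.2/5.0) for alice:1001 in 2.3 seconds, 3456 bytes.
#   ... clean message (0.1/5.0) for bob:1002 in 1.2 seconds, 2345 bytes.
RESULT_RE = re.compile(r"(identified spam|clean message) \(\S+\) for ([^:\s]+):")


def count_by_user(lines):
    """Return {user: Counter({'spam': n, 'ham': m})} for the given log lines."""
    totals = {}
    for line in lines:
        match = RESULT_RE.search(line)
        if not match:
            continue  # not a scan-result line
        kind = "spam" if match.group(1) == "identified spam" else "ham"
        totals.setdefault(match.group(2), Counter())[kind] += 1
    return totals
```

Summing the per-user counters gives the site-wide totals of ham and spam.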

Another useful report is the length of time taken by SpamAssassin to process email. Apart from giving an immediate statistic on the delay that spam processing adds to email delivery, this report is useful for long-term planning. If email processing is taking longer each month, the load on the system is increasing, and additional resources may be required to improve the quality of service.
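
A monthly processing-time report could be sketched as follows. This Python example assumes traditional syslog lines that begin with a three-letter month abbreviation and that contain the phrase "in N seconds" for each scanned message; both are assumptions about the log format, and the function name is hypothetical.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed line shape: "Sep 12 10:15:01 host spamd[123]: ... in 2.3 seconds, ..."
TIME_RE = re.compile(r"^(\w{3})\s.*\bin (\d+(?:\.\d+)?) seconds")


def mean_seconds_by_month(lines):
    """Return {month_abbreviation: mean processing time in seconds}."""
    samples = defaultdict(list)
    for line in lines:
        match = TIME_RE.search(line)
        if match:
            samples[match.group(1)].append(float(match.group(2)))
    return {month: mean(values) for month, values in samples.items()}
```

Comparing the averages month by month shows whether processing time is rising.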

Spam Counter

A spam counter can be used to count the number of spam emails and the number of ham emails...

Summary


Spam email is usually archived for the purposes of creating a corpus for training a filter and identifying false positives. The user is required to manually go through spam messages to perform this sorting. One way of reducing this effort on the part of the user is to use automatic scripts and cron jobs to process and filter spam emails.

Separating spam into levels based on scores aids users by presenting them with a smaller folder of spam to check for false positives. Spam reports can be generated by using similar scripts that parse the spam statistics that result from commands such as ps -ef. This is done by calling simple scripts from Procmail, or by running reports on system logs. These scripts and their results can be modified as required. For example, you can have site-wide and hourly reports. A little effort will relieve the system administrator and users of the considerable work of sifting through large numbers of spam emails.
