Testing anti-spam products is a challenging task. Collecting a large variety of spam and forwarding it to multiple accounts
is a simple way to have the same test data for all the products tested, but it makes the test much less effective. Forwarded
messages arrive from the forwarder's address and server rather than the original ones, and most of the products treat the
sender and the sender’s IP address as major clues as to whether the message is spam. Also, some
products update their detection algorithms in real time, so identifying older spam is less challenging than filtering a live stream.
It’s also important to have real mail coming in, both personal messages and mailing lists, which many of the products have
a hard time distinguishing from spam.
Therefore, to test my six anti-spam solutions, I used four separate e-mail accounts on a Microsoft Exchange server, each receiving
a mix of real live mail -- personal messages, e-mail newsletters, messages from PR people regarding new products, and lots
of spam -- via SMTP. This enabled me to test four products simultaneously. Although each account received different e-mails,
the overall numbers of messages were similar on all four accounts, as were percentages of spam out of the total.
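To make the comparability check concrete, here is a minimal sketch of how the spam percentage per account could be computed and compared. The function names, the five-point tolerance, and the message counts are all hypothetical placeholders for illustration; they are not figures from this test.

```python
def spam_share(spam_count: int, total_count: int) -> float:
    """Return spam as a percentage of all messages received by an account."""
    if total_count == 0:
        return 0.0
    return 100.0 * spam_count / total_count


def comparable(shares: list[float], tolerance: float = 5.0) -> bool:
    """True if the highest and lowest spam percentages differ by at most `tolerance` points.

    The tolerance is an assumed threshold, not one taken from the article.
    """
    return max(shares) - min(shares) <= tolerance


# Placeholder counts (spam, total) for four accounts; NOT real test data.
accounts = {
    "account_1": (600, 1000),
    "account_2": (610, 1020),
    "account_3": (590, 980),
    "account_4": (605, 1010),
}

shares = [spam_share(spam, total) for spam, total in accounts.values()]
accounts_are_comparable = comparable(shares)
```

If the accounts turn out not to be comparable, differences in catch rates between products could reflect the mail mix rather than the filters.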
The mix of messages was a difficult one for the anti-spam filters. For example, I receive a lot of press releases by e-mail.
These messages resemble marketing spam in many ways, which makes it hard for the filters to tell the two apart,
both because of the wording and because they are often distributed by bulk e-mailers. Likewise, newsletters,
both technical ones such as those offered by InfoWorld and opt-in marketing information, can trigger the filters.
Because personal e-mail addresses on America Online, MSN, Juno,
Yahoo, and other large providers often contain a group of characters followed by numbers, which is also true of typical spammer
e-mail addresses, spam filters often block mail from these sources. These filters may also block messages from friends or
family who send pictures of the kids or who use cute HTML e-mail backgrounds.
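The address problem is easy to see with a toy version of such a rule. The sketch below is an assumed heuristic written for illustration only, not the rule used by any of the products tested: it flags any address whose local part is a run of letters followed by two or more digits. The pattern, the function name, and the example.com addresses are all hypothetical.

```python
import re

# Hypothetical rule: the part before "@" is letters followed by 2+ digits.
LETTERS_THEN_DIGITS = re.compile(r"^[a-z]+[0-9]{2,}$", re.IGNORECASE)


def looks_autogenerated(address: str) -> bool:
    """Return True if the local part matches the letters-then-digits pattern."""
    local_part, _, _domain = address.partition("@")
    return bool(LETTERS_THEN_DIGITS.match(local_part))


# Made-up sample addresses: a plausible personal address, a spammer-style
# address, and a personal address that does not fit the pattern.
samples = [
    "jsmith1974@example.com",
    "xkqz88213@example.net",
    "jane.doe@example.org",
]

flags = {address: looks_autogenerated(address) for address in samples}
```

Under this rule the personal-looking address and the spammer-style address are both flagged, which is the false-positive problem described above: the pattern alone cannot separate them.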