Friday, March 13, 2009

In praise of SpamSieve

I get a ton of spam, and it was starting to overwhelm the filter built into the OS X mail client. I've recently started running my own mail server, so I started tweaking the settings to try to cut down on the processed breakfast meat, and discovered that Microsoft Outlook has a serious bug (what a surprise). One of the ways an SMTP server can cut down on spam is to require clients to identify themselves with a fully qualified domain name (FQDN) in the HELO/EHLO greeting when they connect. This will prevent many botnet machines from connecting because they tend not to be configured to send an FQDN. Unfortunately, it works a little too well. As far as I have been able to determine, it is not possible to configure Microsoft Outlook on Windows to send an FQDN. I have clients on this server who use Outlook, so I had to disable the FQDN requirement.
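For the curious, here is roughly what an FQDN check on the greeting amounts to. This is only a sketch in Python, not the code any real mail server runs: the function names and the reply strings are made up for illustration, and it only looks at the shape of the name, without doing any DNS lookup.

```python
import re

# One DNS label: 1-63 letters, digits, or hyphens, with no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def looks_like_fqdn(helo_name: str) -> bool:
    """Return True if helo_name has the shape of a fully qualified domain name.

    Hypothetical helper: checks syntax only, never resolves the name.
    """
    # Tolerate a single trailing root dot ("mail.example.com.").
    name = helo_name[:-1] if helo_name.endswith(".") else helo_name
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    # A bare machine name such as "WORKSTATION7" has only one label.
    if len(labels) < 2:
        return False
    # An all-numeric last label means a bare IP address, not a domain name.
    if labels[-1].isdigit():
        return False
    return all(_LABEL.match(label) for label in labels)


def reply_to_greeting(helo_name: str) -> str:
    """Illustrative policy: accept FQDNs, refuse everything else."""
    if looks_like_fqdn(helo_name):
        return "250 OK"
    return "504 need fully qualified hostname"


print(reply_to_greeting("mail.example.com"))
print(reply_to_greeting("WORKSTATION7"))
```

A client that announces itself with a bare machine name, which is what Outlook on Windows appears to do, lands in the second branch along with the botnet machines. That is exactly the problem.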

I could have installed a content-based spam filter on the server, but the problem is that spam is personal. One man's spam is another man's hot deal of the week from his favorite on-line vendor, and the training process can get really annoying if it's not integrated into the client. I was about to throw up my hands in despair and set up a second mail server for Microsoft users, when I decided instead to try SpamSieve. I was a little skeptical that it would work much better than Apple's built-in filter, but there's a 30-day free trial so I didn't have much to lose.
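To make the "spam is personal" point concrete, here is a toy content-based filter in Python. It is a generic naive Bayes sketch, not SpamSieve's or Apple's actual algorithm (I don't know what either does internally), and the class name and sample messages are invented for the example. The same message gets a different verdict depending on who did the training.

```python
import math
from collections import Counter


class TinyFilter:
    """Toy naive Bayes spam filter, trained separately for each user."""

    def __init__(self):
        self.words = {"spam": Counter(), "good": Counter()}
        self.messages = {"spam": 0, "good": 0}

    def train(self, text: str, label: str) -> None:
        # Count every word of the message under the label the user chose.
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def is_spam(self, text: str) -> bool:
        vocab = set(self.words["spam"]) | set(self.words["good"])
        total_messages = sum(self.messages.values())
        scores = {}
        for label in ("spam", "good"):
            total_words = sum(self.words[label].values())
            # Log prior with add-one smoothing.
            score = math.log((self.messages[label] + 1) / (total_messages + 2))
            for word in text.lower().split():
                # Log likelihood with add-one smoothing so unseen words are not fatal.
                count = self.words[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            scores[label] = score
        return scores["spam"] > scores["good"]


# Two users train on the same newsletter with opposite labels.
alice = TinyFilter()
alice.train("weekly deal from your favorite store", "good")
alice.train("cheap pills click now", "spam")

bob = TinyFilter()
bob.train("weekly deal from your favorite store", "spam")
bob.train("meeting notes attached", "good")

message = "weekly deal store"
print("Alice:", alice.is_spam(message))
print("Bob:", bob.is_spam(message))
```

A single server-wide filter would have to pick one of those two answers for everybody, which is why the training really wants to live in the client.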

I've become a believer. I've only been using (which is to say training) it for three days, but the situation is already dramatically better than it was before. I've had one false positive during the training process, and the false negatives have rapidly dwindled to 1-2 a day. My spam problem is (I almost dare not say this for fear of tempting fate) effectively solved.

So I recommend SpamSieve. It's well worth the $30 it costs.

I do have one complaint: the instructions for the initial training process are a little unclear, and in one place downright misleading. The instructions say to delete the messages in your current spam folder before setting up SpamSieve. DON'T DO THIS! The training process requires about 600 representative spam messages. The instructions suggest fishing these out of your trash, but the problem is that if your spam is in the trash it's almost certainly mixed in with a lot of good messages and now you have to manually tease them apart. So if you're going to be using SpamSieve, keep a collection of spam in a separate folder for training before you begin. You'll be glad you did.


Unknown said...

I suggest Gmail. You can read it with your preferred mail client, it's free, the server CPU will thank you, and it has a gorgeous spam filter.
Can't ask more.

Ron said...

Gmail is a fine product, but I'm a do-it-yourself kind of person. I'm also fairly paranoid about security, and Gmail is an awfully high-profile target.

But the real reason I'm doing this is that I have a plan to offer email as part of a suite of integrated services, so I don't want to rely on a third party if I don't have to.

C-Command Software said...

There are two reasons that the SpamSieve instructions say to delete the contents of the Junk mailbox:

1. When you turn off Mail's junk filter, it hides the Junk mailbox, so the messages will not be visible for you to train SpamSieve with them. (You also won't be able to delete them, so they'll just sit on your hard drive.)

2. If you drag the spam messages from the Junk mailbox to a mailbox other than the trash, Mail takes that to mean that the messages are not spam. This will mess up the training for the built-in junk filter (if you ever decide to stop using SpamSieve), and it will add the messages' addresses to the "Previous Recipients" list, which will mess up address auto-completion, among other things.

So I still recommend deleting the spam messages, but you'll probably want to empty your trash first so that they don't get mixed in with the trashed good messages.