C-Command Software Forum

How long does it take SpamSieve to learn spam?

I installed SpamSieve 2.9.20b4 in Apple Mail 8.2 (2100) on my MacBook Pro running Apple’s OS X Yosemite 10.10.4 public beta 4 on a trial basis several days ago. SpamSieve has generated 572 blocklist rules and 387 whitelist rules so far, but just today I have had another 263 spam messages that I had to train as spam. Many of today’s spam messages have subjects identical to previous spam and/or arrived from the same domain with different auto-generated user names. I looked at the blocklist, and nearly every entry is an exact match rather than a contains match. Since spammers send out thousands of spam emails using the same domain but auto-generated user names, most of the 263 spam messages today did not match anything in the blocklist even though they were from the same domain. Is there some setting I missed that would get SpamSieve to block a domain once several spam emails from it have been received and trained? It would take a tremendous amount of my time to inspect the blocklist and add domains by hand. So:

Can I get SpamSieve to automatically learn to block domains, rather than always creating rules that only match a single auto-generated user name?

Thanks, Jim

There is no setting to automatically make blocklist rules for domains because SpamSieve’s Bayesian classifier already learns from domains automatically. So such rules should not be necessary, and they could actually be very dangerous. If lots of spam messages are getting through, the answer is not to block by domain but to figure out why the normal filtering mechanisms are not working. To do that, please see the “Why is SpamSieve not catching my spam?” page.
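To illustrate the general idea of a Bayesian classifier picking up on sender domains without any explicit domain rule, here is a minimal Python sketch. It is not SpamSieve’s actual algorithm or tokenizer; the `ToyBayes` class, the `tokens` helper, the `domain:` token prefix, and the example addresses are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokens(sender, subject):
    """Turn a message into tokens: sender-domain suffixes plus subject words."""
    domain = sender.rsplit("@", 1)[-1].lower()
    parts = domain.split(".")
    # a.b.example -> "domain:a.b.example", "domain:b.example"
    domain_tokens = ["domain:" + ".".join(parts[i:]) for i in range(len(parts) - 1)]
    words = re.findall(r"[a-z0-9']+", subject.lower())
    return domain_tokens + words


class ToyBayes:
    """Hypothetical naive Bayes filter with add-one smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "good": Counter()}
        self.messages = {"spam": 0, "good": 0}

    def train(self, label, sender, subject):
        # Count each token once per message.
        self.counts[label].update(set(tokens(sender, subject)))
        self.messages[label] += 1

    def spam_probability(self, sender, subject):
        # Start from the smoothed prior odds, then add per-token evidence.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["good"] + 1))
        for tok in set(tokens(sender, subject)):
            p_spam = (self.counts["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_good = (self.counts["good"][tok] + 1) / (self.messages["good"] + 2)
            log_odds += math.log(p_spam / p_good)
        return 1 / (1 + math.exp(-log_odds))


clf = ToyBayes()
clf.train("spam", "x1@mail.spamco.example", "cheap pills now")
clf.train("spam", "q9@mail.spamco.example", "cheap watches")
clf.train("spam", "zz@mail.spamco.example", "win big")
clf.train("good", "alice@friend.example", "lunch tomorrow")
clf.train("good", "bob@friend.example", "meeting notes")

# A never-seen user name at a domain that was trained as spam still scores high.
print(clf.spam_probability("new77@mail.spamco.example", "hello"))
print(clf.spam_probability("carol@friend.example", "hello"))
```

In this toy model the domain is just one more piece of evidence, so a good message from a mostly spammy domain can still be outweighed by its other words. A hard domain rule cannot do that, which is one way such rules become dangerous.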

Spam still isn’t being filtered
I am testing SpamSieve because I am getting an incredible amount of spam daily and changing my email address would be complicated to accomplish. I thought finding a way to filter out the spam would make my mailboxes manageable. I know that SpamSieve has had many recommendations from important computer people; the latest I saw was from Ric Ford of MacInTouch fame.

For the last 6 days, I have had SpamSieve 2.9.19 and now 2.9.20 installed on my MacBook Pro running OS X Yosemite 10.10.4 public beta 4 and Mail 8.2 (2100) with the SpamSieve Mail plugin 1.8.2. I have checked, double checked and today triple checked that everything is installed and configured properly following your directions.

So far today, I have had over 200 spam emails, none filtered out, which made it into either the Spam mailbox on my Gmail account or the Bulk mailbox on my Yahoo account. There is no filtering being done at Google or Yahoo. I have a Smart Mailbox which displays spam from both accounts together so it is easier for me to delete the messages and, now, to train SpamSieve with them. The 200+ spam emails I mentioned have all been displayed in the Smart Mailbox; I selected all of them and then chose to train SpamSieve to blocklist them.

The latest statistics are:

Filtered Mail
34 Good Messages
20 Spam Messages (37%)
3 Spam Messages Per Day

SpamSieve Accuracy
0 False Positives
0 False Negatives
100.0% Correct

Corpus
676 Good Messages
936 Spam Messages (58%)
77,259 Total Words

Rules
1,165 Blocklist Rules
395 Whitelist Rules

Showing Statistics Since
5/30/15, 11:47 AM
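
As a quick sanity check on the statistics above, the percentages can be reproduced from the raw counts. This is only an arithmetic sketch in Python; the variable names and the `percent` helper are made up for illustration and have nothing to do with how SpamSieve computes its statistics internally.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts copied from the statistics listed above.
filtered_good, filtered_spam = 34, 20
corpus_good, corpus_spam = 676, 936

filtered_total = filtered_good + filtered_spam  # messages Mail actually handed to SpamSieve
corpus_total = corpus_good + corpus_spam        # messages SpamSieve has been trained with

print(filtered_total, round(percent(filtered_spam, filtered_total)))  # 54 37
print(corpus_total, round(percent(corpus_spam, corpus_total)))        # 1612 58
```

Note the gap between the two totals: only 54 messages were filtered, while the corpus holds 1,612 trained messages.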

What I notice is that the vast majority of items in the Blocklist are defined as Header Is Equal To Text to Match. Of the 1,165 Blocklist entries, 14 use Matches Regex (1.2%), 1 uses Starts With (0.09%), 6 use Contains (0.5%), 2 use Ends With (0.17%) and 1,142 use Is Equal To (98%). Here is a typical list of From (address) entries which are defined using Is Equal To:

53xygal8a7b6@2vtli.chardiffer.com
53xygal8a7b6@89re4.instration.com
53xygal8a7b6@j3mne.easterest.com
53xygal8a7b6@oprge.journame.com
63ygdt098yhj@2dqjy.ssl-certificate526592409.com
63ygdt098yhj@9j9dt.ssl-certificate922824775.com.com
63ygdtr6t@62x10.ssl-certificate282501549.com
63ygdtr6t@e9ygg.ssl-certificate373901334.com
63ygdtr6t@qf3u1.ssl-certificate282501549.com
6yrtu78@wkg1t.ssl-certificate588076551.com
6yrtu78@xbk11.ssl-certificate626012757.com
76tf67u4ew@jrscy.ssl-certificate142919911.com
76tf67u@x3sm5.ssl-certificate318566462.com
76tfghu76t@vm5lq.ssl-certificate831139508.com

Note the obviously faked email names and addresses, although I do see some similarities among them. None of the 1,165 rules has had any hits after being defined, so none of the auto-defined rules has filtered out any spam. My Spam and Bulk mailboxes have accumulated hundreds of emails a day without being filtered.
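
To make the similarities in the list above concrete, here is a small Python sketch that splits each address into its user name and host and counts the distinct values. The `split_address` helper is hypothetical, and this only analyzes the sample list; it is not how SpamSieve evaluates blocklist rules.

```python
from collections import Counter

# The From addresses listed above, copied verbatim.
addresses = [
    "53xygal8a7b6@2vtli.chardiffer.com",
    "53xygal8a7b6@89re4.instration.com",
    "53xygal8a7b6@j3mne.easterest.com",
    "53xygal8a7b6@oprge.journame.com",
    "63ygdt098yhj@2dqjy.ssl-certificate526592409.com",
    "63ygdt098yhj@9j9dt.ssl-certificate922824775.com.com",
    "63ygdtr6t@62x10.ssl-certificate282501549.com",
    "63ygdtr6t@e9ygg.ssl-certificate373901334.com",
    "63ygdtr6t@qf3u1.ssl-certificate282501549.com",
    "6yrtu78@wkg1t.ssl-certificate588076551.com",
    "6yrtu78@xbk11.ssl-certificate626012757.com",
    "76tf67u4ew@jrscy.ssl-certificate142919911.com",
    "76tf67u@x3sm5.ssl-certificate318566462.com",
    "76tfghu76t@vm5lq.ssl-certificate831139508.com",
]


def split_address(address):
    """Split 'user@host' into (user, host)."""
    user, _, host = address.partition("@")
    return user, host


users = Counter(split_address(a)[0] for a in addresses)
hosts = Counter(split_address(a)[1] for a in addresses)

print(len(set(addresses)), "distinct full addresses")
print(len(users), "distinct user names")
print(len(hosts), "distinct hosts")
print(users.most_common(1))
```

In this sample the user names repeat while every host is different, so each full address is unique. That would explain why an Is Equal To rule made from one of these addresses never gets a hit on the next message.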

Just during the time I have been writing this posting, another 13 spams have arrived. The From addresses are similar but not identical. But all the links in the Bodies have the same 2 domain names. I very much want SpamSieve to work for me but I cannot see any positive results so far. So again:

  1. How long does it take SpamSieve to figure out how to define rules filtering out spam?
  2. What else could I have done wrong?
  3. Does SpamSieve look at links in the Body of emails when trying to define Blocklist entries?
  4. Is there anything else I could do to speed up the process of defining items in the Blocklist which will filter out the hundreds of spam emails I get daily?
  5. Why does SpamSieve leave some emails behind rather than moving them to the Spam mailbox on my Mac when it is training SpamSieve? The emails get marked as read but are not moved.

I would appreciate any guidance anyone has for me. Thanks,

Jim

I don’t understand what you mean here. If the spam is going to the server spam mailboxes, why do you say that no filtering is being done at Google or Yahoo?

Secondly, with the normal setup, SpamSieve filters new messages that arrive in your inbox. It does not go through and filter out spam that was already filtered out into a server spam mailbox.

You don’t need to train SpamSieve with spam that was already caught by another filter. Just train it with the mistakes that got through SpamSieve.

This is showing that Mail only sent 54 incoming messages to SpamSieve for analysis, and that SpamSieve classified them all correctly. If you think there should have been more than 54 incoming messages, then there may be something wrong with the rules in Apple Mail. If you send in the files via e-mail, I can check them for you.

It will usually start catching most of your spam on the first day.

It looks like either your rules are not set up properly in Mail or that you are expecting SpamSieve to filter messages that are arriving already read or in a mailbox other than the inbox.

No, but it does look in the body when training the Bayesian classifier.

The blocklist is not the problem. The problem is that SpamSieve is not even looking at most of the messages that you are expecting it to filter.

When training a message as spam, SpamSieve only moves it if it’s not already in a Spam mailbox.

I have let SpamSieve do its thing for about 2 weeks now and I am getting a lot less spam in my Inbox. If I understand you, both Google and Yahoo filter spam even though I have no special rules set there. Then those messages will not get filtered or sent to the SpamSieve Spam folder on my Mac since they did not make it to my Inbox. I have been selecting them and training them as Spam. Is that the appropriate thing to do? Only an occasional spam ever gets to my Inbox which means SpamSieve is filtering them well. I just wish there was some way I didn’t have to check and delete spam which arrives in Google’s or Yahoo’s own Spam or Bulk mailboxes. I guess I will just have to deal with them manually.

Please ignore my previous posting about my special Smart Spam Mailbox. It only served to confuse the issue and really wasn’t relevant to my experiences.

Now for a possible issue/bug. I have noticed that often, maybe 1/3 of the time, when I wake my MacBook Pro from sleep, Mail hangs on the next email check. The dreaded spinning beachball displays and Mail shows as “not responding” in the Force Quit window. Given all the problems Mail has and since I was using a beta version, I just Force Quit and relaunched Mail, then sent Feedback to Apple. For some reason, I once decided to check the SpamSieve application while the beachball was spinning, and when I clicked back to Mail, the spinning beachball stopped and Mail was functioning properly. Now every time I get the spinning beachball, I just click the SpamSieve application and return to Mail and everything seems OK.

So my question is: is this a Mail issue I should report to Apple again, a SpamSieve issue I should report to you, or an interaction between the two? Next time it happens, I will immediately check the logs, save them and post them here. If there is anything else you would need to check this out, let me know.

I want to thank you for working with me. It is a fairly unique situation these days when a user can dialog with the programmer. That’s why I like GraphicConverter so much. Thorsten Lemke is always answering questions on his forum, offering new betas quickly and just helping people use GraphicConverter. Thanks again for your help.

That’s correct.

No, you should only train as spam messages that got through SpamSieve.

This page explains some options for consolidating all your spam in one place.

Was Mail running before you put your Mac to sleep?

I don’t recall seeing any other reports of that, although there do seem to be lots of issues with Mail in Yosemite hanging. It’s probably a Mail or OS issue, but please send the logs and some sample reports to me via e-mail as well. We may be able to find a way to work around the problem.

I like it as well; it’s a great way to learn from our customers. Thorsten has developed and supported a quality product for the long haul. He’s an inspiration.

Thanks for sending in your logs. The sample from Mail shows that Mail is waiting to hear back from the mail server. It is not doing anything with SpamSieve. This is consistent with the Console log, which says that Mail is getting authentication errors talking to the server, and the fact that you said switching to SpamSieve had no effect.

So I think you are running into a bug in Mail (or perhaps the beta OS). It might help to go to the Accounts preferences in Mail and uncheck “Automatically detect and maintain account settings” for all your accounts. I have seen that cause hangs before.