C-Command Software Forum

Should I retrain SpamSieve and reset the Corpus?

Been using it flawlessly for the past several years…thanks SpamSieve. But now I’m starting to get some junk mail that is not being filtered correctly.

Some of those mails come from exactly the same address and even though I’ve Trained them as Spam numerous times, they still get through.

I’ve got several email accounts, and some of those same spam emails get through on some of the accounts, while on the other accounts they are correctly filtered as spam by SpamSieve.

I have even gone to the Blocklist and added those server addresses, but they still get through.

All these emails are blatant spam; there is no way anyone could think they might be legit.

Otherwise SpamSieve is doing a great job of catching all other spam…it’s just these pesky spams from two or three different places that get through all the time.

The following is a copy of my statistics:

Filtered Mail
69,060 Good Messages
55,092 Spam Messages (44%)
31 Spam Messages Per Day

SpamSieve Accuracy
67 False Positives
633 False Negatives (90%)
99.4% Correct

Corpus
3,727 Good Messages
7,252 Spam Messages (66%)
628,065 Total Words

Rules
3,625 Blocklist Rules
11,530 Whitelist Rules

Showing Statistics Since
25/8/2008 4:39 PM

Thanks for any help.

JC

That’s usually a sign that the messages aren’t getting past SpamSieve; rather, something is set up such that it isn’t even being asked to examine those messages. I suggest checking the setup and/or seeing what the log says about those messages.

Upon searching the log and poking around, I’ve noticed that the same spam mails are being correctly identified as such by SpamSieve in one email account, while on another account the exact same spam mails are getting through (these spams are targeting these two mail accounts). So maybe the problem is with that particular email account…I’ll look into that as a possible cause.

Do you mean that for the other account the log says “Predicted: Good” for those messages?

Well, unfortunately I can’t tell for sure, but it seems to be that way. What happens is that I’m training as spam all those that get through, and that seems to alter the “Predicted” parameter in the log. What I need to do then is check the log before I perform any action on the incoming mails.

This is an extract of the log for one of those spam mails. In the first block you can see that the spam was correctly identified as such when it came into one of my accounts (the one that correctly catches spam):

=====================================================================
Predicted: Spam (96)
Subject: TodayScience: Shocking news for obese-people
From: IErSvrh630@3dbulb.com
Identifier: GszwybvHeyPVRN1/zbEqYQ==
Reason: P(spam)=1.000[1.000], bias=0.513, H:x-mailerid(1.000), XM:JavaMailer(1.000), x-mailerid:3(1.000), chlorogenic(1.000), -coffee(1.000), green-coffee(1.000), R:^mundo-r^com(1.000), 12-17(1.000), absolutelly(1.000), F:Science(1.000), PT:Science(1.000), ^a-style-fontsize10pxfontfamilyVerdanaArialHelvet(1.000), S:Shocking(0.999), 1669(0.999), ^a-class-bmefooter(0.999)
Date: 2013-06-14 08:20:49 -0500

…and on this second block of the same log, you can see I trained the spam mail that came into my other account and went through (the one that allows this spam to get through):

=====================================================================
Trained: Spam (Manual)
Subject: TodayScience: Shocking news for obese-people
Identifier: AYUUwfAmWBJzKFbO6RCvfw==
Actions: added rule <From (address) Is Equal to “AdaRARXt998@appleads.in”> to SpamSieve blocklist, added rule <From (name) Is Equal to “TodayScience: News”> to SpamSieve blocklist, added to Spam corpus (7257)
Date: 2013-06-14 08:23:45 -0500

No, SpamSieve never goes back to alter the log. If it says “Predicted: Spam” that means that’s what SpamSieve originally thought (for that message in that account; note that the identifiers are different). If there’s no “Predicted: Good” for that message in the other account, that means your mail program was not set up to ask SpamSieve to examine that other message.
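The check described above can also be done mechanically. The following is a minimal sketch, not a SpamSieve feature: it assumes the log has been copied out as plain text in the format shown in this thread, and the names `parse_entries` and `never_examined` are hypothetical. It lists every “Trained: Spam (Manual)” entry whose Identifier never appears in a “Predicted:” entry, i.e. messages that SpamSieve was apparently never asked to examine. The sample is the two log blocks above with the Reason and Actions lines left out.

```python
import re

# Abbreviated copy of the two log blocks quoted earlier in this thread.
SAMPLE_LOG = """\
=====================================================================
Predicted: Spam (96)
Subject: TodayScience: Shocking news for obese-people
From: IErSvrh630@3dbulb.com
Identifier: GszwybvHeyPVRN1/zbEqYQ==
Date: 2013-06-14 08:20:49 -0500

=====================================================================
Trained: Spam (Manual)
Subject: TodayScience: Shocking news for obese-people
Identifier: AYUUwfAmWBJzKFbO6RCvfw==
Date: 2013-06-14 08:23:45 -0500
"""

ENTRY_START = re.compile(r"^(Predicted|Trained): (.*)$")


def parse_entries(log_text):
    """Return one dict per log entry with its kind, result, subject and identifier."""
    entries = []
    current = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = ENTRY_START.match(line)
        if match:
            # A "Predicted:" or "Trained:" line starts a new entry.
            current = {
                "kind": match.group(1),
                "result": match.group(2),
                "subject": None,
                "identifier": None,
            }
            entries.append(current)
        elif current is not None:
            if line.startswith("Subject: "):
                current["subject"] = line[len("Subject: "):]
            elif line.startswith("Identifier: "):
                current["identifier"] = line[len("Identifier: "):]
    return entries


def never_examined(log_text):
    """Manually trained spam whose identifier has no matching 'Predicted:' entry."""
    entries = parse_entries(log_text)
    predicted_ids = {e["identifier"] for e in entries if e["kind"] == "Predicted"}
    return [
        e
        for e in entries
        if e["kind"] == "Trained"
        and e["result"].startswith("Spam (Manual)")
        and e["identifier"] not in predicted_ids
    ]


for entry in never_examined(SAMPLE_LOG):
    print(entry["identifier"], entry["subject"])
```

Matching is done on the Identifier rather than the Subject because, as noted above, the same spam arriving in two accounts gets two different identifiers. One caveat: if only part of the log is pasted in, a message whose “Predicted:” entry lies outside that part will be listed as well.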

You are right…below is a copy of the log from the last time I downloaded mail. The third one from the top (“Subject: TodayScience…”) came to one of my email accounts and was correctly flagged as spam. My other email account, the one in which spam is going through, received the exact same message and another spam one (both I have repeatedly Trained as Spam) and neither one shows up in the log. Now, I just went back to mail and trained both messages as spam and they now show as “Trained: Spam (Manual)”…please read the second log at the end.

I guess this means that SpamSieve is not checking this account at all…it must be some setting in the Mail app. What should I do next?

First log just after downloading my mail and not touching anything:

=====================================================================
Predicted: Good (1)
Subject: [Yahoo freedive-list] RE: [NorCalSkinDivers] Bleak Ab reccomendations
From: pescaman@sbcglobal.net
Identifier: a7pI7zADMqGxNx1Uof2KMA==
Reason: (
"pescaman@sbcglobal.net"
) matched rule <From (address) Is Equal to "pescaman@sbcglobal.net"> in SpamSieve whitelist
Date: 2013-06-14 14:07:30 -0500

Predicted: Good (1)
Subject: Reply to thread ‘Should I retrain SpamSieve and reset the Corpus?’
From: forums@c-command.com
Identifier: IUnk8aLTlJ+ldiovJMjB5g==
Reason: (
"forums@c-command.com"
) matched rule <From (address) Is Equal to "forums@c-command.com"> in SpamSieve whitelist
Date: 2013-06-14 14:07:30 -0500

Predicted: Spam (99)
Subject: TodayScience: Shocking news for obese-people
From: mhiMF63@aecoservices.com
Identifier: r2n/lzm3AtEURbhj4TiiPA==
Reason: (
“TodayScience: News”
) matched rule <From (name) Is Equal to “TodayScience: News”> in SpamSieve blocklist
Date: 2013-06-14 14:07:31 -0500

Predicted: Good (1)
Subject: Aviso Especial
From: tsunami@dhn.mil.pe
Identifier: TGUR31GKbC+gHMbw7RYqrQ==
Reason: (
“tsunami@dhn.mil.pe”
) matched rule <From (address) Is Equal to “tsunami@dhn.mil.pe”> in SpamSieve whitelist
Date: 2013-06-14 14:07:31 -0500

Predicted: Good (1)
Subject: Mako Shark Jumps Into Fishermen’s Boat
From: BoatingWorldMagazine@eloyaltymail.com
Identifier: R9PRjL/vfi/ITf8uQzViKw==
Reason: (
"BoatingWorldMagazine@eloyaltymail.com"
) matched rule <From (address) Is Equal to "BoatingWorldMagazine@eloyaltymail.com"> in SpamSieve whitelist
Date: 2013-06-14 14:07:31 -0500

Trained: Good (Auto)
Subject: Mako Shark Jumps Into Fishermen’s Boat
Identifier: R9PRjL/vfi/ITf8uQzViKw==
Actions: added to Good corpus (3749)
Date: 2013-06-14 14:07:31 -0500

Predicted: Good (1)
Subject: inserting custom google map
From: jmakowski208@gmail.com
Identifier: XdSBhCthUm8E5H40IPuyuA==
Reason: (
“<fmpexperts-ironclad.net.au>”
) matched rule <List-ID Is Equal to “<fmpexperts-ironclad.net.au>”> in SpamSieve whitelist
Date: 2013-06-14 14:07:31 -0500

Second log after training both messages as spam:

=====================================================================
Trained: Spam (Manual)
Subject: Nuevas Conferencias en Peru
Identifier: 6DXF2W5KBuCobizKTXl5ZA==
Actions: added rule <From (name) Is Equal to “Ing. Pedro López Castro”> to SpamSieve blocklist, added to Spam corpus (7258)
Date: 2013-06-14 14:21:04 -0500

Trained: Spam (Manual)
Subject: TodayScience: Shocking news for obese-people
Identifier: VR6IAhOnKU0GbVqKCvHqhw==
Actions: added rule <From (address) Is Equal to "YdFosxn699@allentime.com"> to SpamSieve blocklist, added to Spam corpus (7259)
Date: 2013-06-14 14:21:11 -0500

Agreed. Please make sure that your [“SpamSieve” rule in Mail’s preferences](http://c-command.com/spamsieve/manual-ah/setting-up-apple-mail) is set to “Every Message” and that it’s at the top of the list.

Well, all seems to be working properly now.

  • Moved the SpamSieve rule in Mail to the top (it was in the middle of the list).
  • Changed that same rule from “Message Type > Mail” to “Every Message”.
  • Eliminated a rule to detect spam for that specific account (I added this one recently when the spam problem started, the two above had been like that since day one).

Even though I had already checked all settings for something unusual or something “new”, it never occurred to me to fiddle with these, because this is the way SpamSieve had been set up and working from day one.

This morning one of those pesky spams came in and it was immediately flagged as such by SpamSieve. Will keep monitoring just in case.

I really appreciate all the time and effort you put into helping me.

Thanks again.
JC