Specific spammer is not being caught

I have a spammer who has created a style that seems to pass the Bayesian classifier. I flag it as spam repeatedly, and I’m guessing that he changes something just enough to make it look different, often the email address, name, or subject. Is there something I can do to make it catch these? (I just noticed that the server junk filter calls it spam.)

Below is a copy and paste of the log entry. (1 of 4 for this specific message)

Summary: SpamSieve’s Bayesian classifier predicted this message to be Good based on a statistical analysis of its content.

Score: 4 (0 is least spammy; 100 is most spammy)
Words: ^a-style-padding15px40px(0.001), ^a-style-padding30px20px(0.001), CT:vault(0.002), H:X-Process-Key(0.002), V:Charleston(0.002), V:picnic(0.002), ^a-style-fontsize18pxfontweightboldcolorfffffftex(0.002), ramen(0.002), V:Mount(0.005), V:blanket(0.005), V:broth(0.005), tonkotsu(0.005), care!(0.071), U:events(0.074), A:Meat(0.110)
Accuracy: Correct (if this message is not Good, you should train it as Spam)
Help: Correct All the Mistakes
Superseded Prediction: SpamSieve classified this message again later.
Date Logged: Today at 10:22:37 AM

Subject:	(HP) Membership Update: Thanks for stopping by Costco recently
From:	Perk C0STC0 SpeciaI <perkcstcspeciai@freshtodays.com>
Date Sent:	Today at 8:55:47 AM
Date Received:	Today at 8:56:06 AM
To:	cheryl@bakerstreetinn.com
Size:	16 KB (6 KB compressed)
Identifier:	YguJ3c3QLOLXT/bxMFsSWg==
Server Filter:	The server junk filter classified this message as spam.
Origin:	Unknown

Contacts:	297
Excluded Contacts:	1
Good Messages:	14,418
Spam Messages:	14,603
Bias:	0.000
P(spam):	0.000[0.000]
Tokenizing Time:	0.011s

Processing Time:	0.282s
SpamSieve:	3.3 at /Applications/SpamSieve.app
Device:	macOS 15.7.4 (24G517) on Cheryl’s MacBook Pro (MacBookPro15,1)
User:	cherylmhockaday (Cheryl M Hockaday)
Language:	English

If you look at the Subject you can see that it begins with “(HP).” Because my server (Host Papa) has such a lousy spam filter, I have set it to place that prefix in the subject line of any message that it calls spam and not move those messages out of the Inbox, so I can check for false positives.

There are a bunch of words in this message with very low spam probabilities:

Words: ^a-style-padding15px40px(0.001), ^a-style-padding30px20px(0.001), CT:vault(0.002), H:X-Process-Key(0.002), V:Charleston(0.002), V:picnic(0.002), ^a-style-fontsize18pxfontweightboldcolorfffffftex(0.002), ramen(0.002), V:Mount(0.005), V:blanket(0.005), V:broth(0.005), tonkotsu(0.005), care!(0.071), U:events(0.074), A:Meat(0.110)

This could indicate that there are other spam messages containing these words that were incorrectly trained as good. I recommend looking for them as described here.
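To illustrate why a handful of very low word probabilities can pull the score down this far, here is a minimal sketch using a generic naive Bayes combination. This is an assumption for illustration only, not SpamSieve's actual formula; the function name `combine` is hypothetical. The probabilities are the ones from the Words line quoted above.

```python
import math

# Per-word spam probabilities copied from the Words line quoted above.
word_probs = [0.001, 0.001, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002,
              0.005, 0.005, 0.005, 0.005, 0.071, 0.074, 0.110]

def combine(probs):
    """Generic naive Bayes combination (NOT SpamSieve's real formula).

    Computes prod(p) / (prod(p) + prod(1 - p)) in log space so that
    many tiny probabilities do not underflow to zero.
    """
    log_spam = sum(math.log(p) for p in probs)
    log_good = sum(math.log(1.0 - p) for p in probs)
    # Algebraically equal to prod(p) / (prod(p) + prod(1 - p)).
    return 1.0 / (1.0 + math.exp(log_good - log_spam))

if __name__ == "__main__":
    # Every word leans strongly toward "good", so the result is very close to 0.
    print(combine(word_probs))
```

Under this simplified model, even one or two spam messages mistakenly trained as good can give words like these a near-zero spam probability, which is why finding and retraining those messages matters.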

Thank you.

As I look into it, I find that the third one in the list is in 2 false negative spam messages and also in a good message. Should I train those two as spam based on that with a good message in the mix?

I’m sorry, but I don’t know what “list” refers to here.

I was referring to the list of words. And I was mistaken. I meant the second word, not the third. In the snippet above it is “^a-style-padding30px20px”.

Are you searching the Good Messages section of the Corpus window? Any spam messages that are listed as good should be trained as spam. The good ones should not be.

No, I was searching in the Log. I’ll look in the corpus. Thank you.

It’s probably worth searching the log, too, as there may be messages there that are not in the corpus.

Looking at the “words” in the log for a message, I see that many of the words have prefixes like “F:”, “CT:”, and “V:”.

Would you explain what they mean? Or maybe I’m going too deep into the woods for a user :rofl:

The prefixes have to do with where in the message the words were found. F means the From header, CT means the Content-Type header, and V means body text that may be invisible.
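If you want to sort through a long Words line by prefix, a small script can help. This is only a sketch: it assumes the line is formatted exactly as in the log entry above (items separated by a comma and a space, each ending in a probability in parentheses), and the names `parse_words` and `PREFIX_MEANINGS` are made up for illustration. Only the three prefixes explained above are labeled; others, such as H:, U:, and A:, are passed through unexplained.

```python
import re

# Meanings taken from the explanation above; other prefixes are not covered here.
PREFIX_MEANINGS = {
    "F": "From header",
    "CT": "Content-Type header",
    "V": "body text that may be invisible",
}

# Optional one- or two-letter uppercase prefix, then the word, then "(probability)".
TOKEN_RE = re.compile(r"^(?:([A-Z]{1,2}):)?(.+)\((\d\.\d+)\)$")

def parse_words(line):
    """Split a Words line into (prefix, word, probability) tuples.

    The prefix is None when the word has no prefix.
    """
    tokens = []
    for item in line.split(", "):
        match = TOKEN_RE.match(item.strip())
        if match:
            prefix, word, prob = match.groups()
            tokens.append((prefix, word, float(prob)))
    return tokens

if __name__ == "__main__":
    sample = "CT:vault(0.002), V:picnic(0.002), ramen(0.002), H:X-Process-Key(0.002)"
    for prefix, word, prob in parse_words(sample):
        where = PREFIX_MEANINGS.get(prefix, "not explained above" if prefix else "no prefix")
        print(f"{word}: {prob} ({where})")
```

Paste the text after “Words:” from a log entry in place of `sample` to see, for example, which low-probability words came from possibly invisible body text.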