whitelist and corpus questions

humanengr · February 27, 2007, 11:48pm

Am I correct to infer that SpamSieve scores each message that matches a whitelist rule as a “1”?

Does SpamSieve re-balance the corpus using new messages that match the whitelist? Or does it only update the corpus with messages that do not match the whitelist but which it considers “good”?

I started using SpamSieve a week ago, training it with 1000 messages (65% spam). The statistics are now:

Filtered Mail
154 Good Messages
5,487 Spam Messages (97%)
764 Spam Messages Per Day

SpamSieve Accuracy
5 False Positives
16 False Negatives (76%)
99.6% Correct

Corpus
401 Good Messages
936 Spam Messages (70%)
56,388 Total Words

Rules
2,727 Blocklist Rules
866 Whitelist Rules

Showing Statistics Since
2/20/07 4:41 PM

Is 70% still within the balanced range or is this an indication of a problem?

Michael_Tsai · February 28, 2007, 6:55am

Yes, a message that matches a whitelist rule is scored as a “1”.

Yes, it can add messages that match the whitelist to the corpus, depending on the current corpus composition and on whether it thinks the messages are interesting.

70% is fine. If you simply correct any mistakes that it makes, SpamSieve will automatically keep the corpus relatively balanced.
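
For anyone checking their own numbers, the percentages in the statistics above can be reproduced from the raw counts. This is only a sketch in Python and assumes each percentage is a simple ratio of the counts shown; that is inferred from the figures in this thread, not from how SpamSieve is documented to compute them. The variable names are made up for illustration.

```python
def pct(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total

# Counts copied from the statistics posted above.
good_filtered, spam_filtered = 154, 5487
false_pos, false_neg = 5, 16
good_corpus, spam_corpus = 401, 936

filtered_total = good_filtered + spam_filtered
errors = false_pos + false_neg

# Assumed formulas (simple ratios); they match the displayed figures.
spam_share_filtered = pct(spam_filtered, filtered_total)   # shown as 97%
false_neg_share = pct(false_neg, errors)                   # shown as 76%
accuracy = pct(filtered_total - errors, filtered_total)    # shown as 99.6%
spam_share_corpus = pct(spam_corpus, good_corpus + spam_corpus)  # shown as 70%

print(round(spam_share_filtered), round(false_neg_share),
      round(accuracy, 1), round(spam_share_corpus))
```

Under that assumption, the 70% in the Corpus section is the spam share of the corpus (936 of 1,337 messages), which is the figure being asked about.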