Should I Be Satisfied with 98.8% Accuracy?

dbrick · September 6, 2016, 9:37pm

I’m delighted with SpamSieve. And yet I still wonder if I could adjust it and do better. Here’s the Statistics window:

I continue to train SS when emails are mis-categorized. The “Spam Catching Strategy” slider is set to 4/5, but I worry that if I move it more to the right (more aggressive filtering) I’ll get more false positives, which doesn’t seem useful.

OTOH, maybe this is all just my emotional insecurity. After all, SS is doing a great job: I am not twitched or disturbed by spam, just faintly amused. Perhaps I should leave well enough alone.

What do other users think? Michael, have you an opinion?

Michael_Tsai · September 7, 2016, 6:26am

To me, that accuracy seems a bit low. Also, you are getting more false positives than false negatives, which is also not typical. Maybe that’s because you moved the slider. If you send in your log file I can look into why that’s happening.

Michael_Tsai · September 7, 2016, 8:41am

Thanks for sending the log file. It looks like you’ve only trained 8 messages as spam in the last month. So I think the recent accuracy is much higher than the average since April that’s shown in your screenshot. You actually trained more (13) messages as good in that time, so I would recommend moving the slider back to the middle position to make SpamSieve less aggressive.

I also see some changes that I can make to improve the accuracy in the next version of SpamSieve.

dbrick · September 7, 2016, 8:49am

Thanks, I appreciate your comments. I’ll move the slider back to center, and reset the Statistics date so I can see how well SpamSieve is doing currently.

dbrick · December 22, 2016, 9:31pm

Hello again Michael,

After our conversation in September, I moved the slider back to center and have been faithful in training when SS miscategorized incoming mail. Accuracy immediately improved to 99.3%, but has since declined to 88.7%, lower than before.

May I send you the log file again, and ask your advice?

Thanks,

-David

Michael_Tsai · December 23, 2016, 7:15am

Of course. Please see this page.

dbrick · December 23, 2016, 7:45am

Oops. Seeing your answer, I realize I mistyped “88.7%” when I meant “98.7%”.
I apologize. Still lower than before our previous conversation, but not nearly so dramatic.

I will send the log file and a copy of the Statistics window.

Michael_Tsai · December 23, 2016, 11:36am

Thanks. It looks like one of the problem messages was either sent from a previously good address or the address had previously sent you spam but you did not train it as spam. Other than that, I don’t see anything obviously wrong. We’re talking about a small number of messages there, so I don’t want to speculate too much. Please let me know how it goes over the next few weeks.

dbrick · December 23, 2016, 12:05pm

Thanks again. I’m glad to hear that nothing appears to be awry.
I’ll be patient, and see how things progress.

dbrick · March 19, 2017, 10:21am

Here’s an update, Michael. I started anew in late December, with a newly-constituted corpus of fewer than 200 messages. Since then, I’ve continued to train SS when it errs.

The Statistics now report 99.1% accuracy. There are occasional false negatives (one every week or two, say), but I haven’t seen a false positive in several months.

This is very satisfying performance!

Michael_Tsai · March 19, 2017, 10:45am

I’m glad to hear that. Thanks for the follow-up.

dbrick · May 7, 2017, 9:56am

In the six weeks since I last reported, SS accuracy has increased to 99.4%.