Comments on Learning Spam With SpamAssassin And All Of Your ISPConfig Clients [ISPConfig 3]

Learning Spam With SpamAssassin And All Of Your ISPConfig Clients [ISPConfig 3]

This is a quick way of learning spam from all of your ISPConfig clients by running a quick and simple command. Please note that this is for ISPConfig 3, not 2.

16 Comment(s)

Comments

By:

What happens after running the sa-learn command from the command line? Does SpamAssassin continue monitoring the folders into the future? Or does the command have to be kicked off periodically to continue learning? If the answer is the latter, this would likely be included in a cron job, correct?

By: Adan0s

You need to put it into a cron job so that the spam/ham is processed automatically.

By:

How should we set up this cron job?

By: wjk940

The cron job already exists, at least if you followed the steps in the Perfect Server HowtoForge series, at /usr/sbin/amavisd-new-cronjob.

By: admins

It's only suitable for small mail servers.
On a large mail server the wildcards expand to so many files that the command fails as too long.

admins

By:

Simply remove the final * from all commands. If you use Maildir, it is sufficient to give the directory name, and sa-learn will examine all messages in the directory.
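A minimal sketch of that advice, assuming the /var/vmail Maildir layout used elsewhere in this thread: find hands each matching cur directory to sa-learn one at a time, so the shell never has to expand a huge wildcard. The names learn_dirs and SA_LEARN are made up for this example and are not part of SpamAssassin.

```shell
#!/bin/bash
# Sketch only: learn_dirs and SA_LEARN are illustrative names.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

# learn_dirs MODE BASE NAME
#   MODE is "spam" or "ham", BASE is the mail root, NAME is the folder to match.
#   Every NAME/cur directory below BASE is passed to sa-learn on its own.
learn_dirs() {
    local mode="$1" base="$2" name="$3"
    find "$base" -type d -path "*/$name/cur" -exec "$SA_LEARN" "--$mode" {} \;
}

# Example (commented out so that sourcing this file does nothing):
# learn_dirs spam /var/vmail .Junk
```

Because sa-learn receives a directory rather than a list of files, the number of messages in the folder no longer affects the length of the command line.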

By:

Thanks for this short guide. I have translated it into German; you can see it here:

http://www.howtoforge.de/uncategorized/ispconfig-3-clients-lernen-spam-mit-spamassassin/

Best thanks, PlaNet Fox

By: vincenzo Ingrosso

Hi,

the paths in /bin/sa_learn are missing the Maildir level, so change the script to:

#!/bin/bash
/usr/bin/sa-learn --spam /var/vmail/*/*/*/.Junk/*/*
/usr/bin/sa-learn --ham /var/vmail/*/*/*/cur/*

Thank you for your work!

By:

Is it bad to run this script if a lot of the emails in the spam folder are already marked as spam by SpamAssassin (***SPAM*** in the subject)? Or doesn't it matter?

By: Anonymous

For the learned bayes tokens to actually be used while amavis calls spamassassin, the following line has to go in /etc/spamassassin/local.cf:

bayes_path /var/lib/amavis/.spamassassin/bayes

By default, the learned tokens go to ~/.spamassassin/ of the user under which sa-learn is run, where they will never be read (since virtual mailboxes are used). Instead, all tokens have to go under the home directory of the amavis user, which is /var/lib/amavis.

To verify that the correct directory is used by spamassassin, execute:

spamassassin -D -t < /usr/share/doc/spamassassin/examples/sample-spam.txt 2>&1 | egrep '(bayes:|whitelist:|AWL)'

You should see these lines:
[...]
dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks
dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen
[...]
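As a rough sketch, the local.cf change can be made repeatable with a small helper that only appends the line when no active bayes_path setting exists yet. The function name ensure_bayes_path is an assumption for this example; the file and path are the ones quoted in this comment.

```shell
# Sketch only: ensure_bayes_path is an illustrative helper name.
# ensure_bayes_path FILE PATH
#   Appends "bayes_path PATH" to FILE unless an uncommented bayes_path
#   line is already present. Commented-out lines are ignored.
ensure_bayes_path() {
    local file="$1" path="$2"
    if grep -Eq '^[[:space:]]*bayes_path[[:space:]]' "$file"; then
        echo "bayes_path already set in $file"
    else
        printf 'bayes_path %s\n' "$path" >> "$file"
        echo "added bayes_path to $file"
    fi
}

# Example (run as root; commented out so that sourcing this file changes nothing):
# ensure_bayes_path /etc/spamassassin/local.cf /var/lib/amavis/.spamassassin/bayes
```

The helper does not replace an existing bayes_path that points somewhere else; in that case the file still has to be edited by hand.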

By: justStartn

based on the suggestions by others in this thread,

this seems to be working for me:

To set the common (i.e. server-wide) training tokens folder:

vi /etc/spamassassin/local.cf

bayes_path /var/lib/amavis/.spamassassin/bayes

To verify that the correct directory is used by spamassassin, execute:

spamassassin -D -t < /usr/share/doc/spamassassin/examples/sample-spam.txt 2>&1 | egrep '(bayes:|whitelist:|AWL)'

You should see these lines:

dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks

dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen

Show the Bayes database status:

sa-learn --dump magic
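The dump is easier to read when only the message counts are picked out. The sketch below assumes the usual layout of the dump, where the lines ending in "nspam" and "nham" carry the count in the third column; bayes_counts is a made-up helper name. With default settings SpamAssassin only starts applying Bayes scores once at least 200 spam and 200 ham messages have been learned, so these two numbers are the ones to watch.

```shell
# Sketch only: bayes_counts is an illustrative helper name.
# Reads `sa-learn --dump magic` output on stdin and prints the number of
# learned spam and ham messages (assumes the count is in the third column).
bayes_counts() {
    awk '$NF == "nspam" { s = $3 }
         $NF == "nham"  { h = $3 }
         END { printf "nspam=%d nham=%d\n", s, h }'
}

# Example:
# sa-learn --dump magic | bayes_counts
```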

training:

/usr/bin/sa-learn --spam /var/vmail/*/*/*/.Junk/*/*

/usr/bin/sa-learn --ham /var/vmail/*/*/*/cur/*

It is better to train ham before spam: we might have picked up some spam in our ham, and the later spam run will then unlearn those messages as ham and relearn them as spam:

/usr/bin/find /var/vmail/*/*/Maildir -maxdepth 1 -not -ipath '*/*Junk*' -not -ipath '*/*Trash*' -not -ipath '*/Maildir' -not -ipath '*/*spam*' -type d -exec /usr/bin/sa-learn --ham {} \;

/usr/bin/find /var/vmail/*/*/Maildir/ -type d \( -iname "*Junk*" -o -iname "*spam*" \) -exec /usr/bin/sa-learn --spam {} \;

Crontab: ham runs once a day, as it takes over an hour, and spam runs every 2 hours:

# this is how we train spamassassin spam vs ham

22 2 * * * /usr/bin/find /var/vmail/*/*/Maildir -maxdepth 1 -not -ipath '*/*Junk*' -not -ipath '*/*Trash*' -not -ipath '*/Maildir' -not -ipath '*/*spam*' -type d -exec /usr/bin/sa-learn --ham {} \; > /dev/null 2>&1

02 */2 * * * /usr/bin/find /var/vmail/*/*/Maildir/ -type d \( -iname "*Junk*" -o -iname "*spam*" \) -exec /usr/bin/sa-learn --spam {} \; > /dev/null 2>&1

59 5 * * * /usr/bin/sa-learn --dump magic
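When silencing cron jobs like these, the order of the redirections matters. With "2>&1 > /dev/null", stderr is attached to the original stdout before stdout is sent to /dev/null, so error messages still reach cron and get mailed. With "> /dev/null 2>&1", both streams are discarded. A small demonstration, using a made-up function noisy as a stand-in for sa-learn:

```shell
# noisy is a stand-in for any command that writes to both streams.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

# stderr is pointed at the old stdout before stdout is redirected: errors survive.
leaky=$( noisy 2>&1 > /dev/null )

# stdout goes to /dev/null first, then stderr follows it: nothing survives.
quiet=$( noisy > /dev/null 2>&1 )

echo "leaky: '$leaky'"
echo "quiet: '$quiet'"
```

Which form is wanted depends on whether cron should mail the errors; only the second form keeps the job completely silent.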

By: wjk940

amavisd-new brings a cron job (/etc/cron.d/amavisd-new), which runs /usr/sbin/amavisd-new-cronjob every 3 hours as user 'amavis'. Changing the sa-sync and sa-clean actions to invoke sa-learn (with "--spam /var/vmail/*/*/Maildir/.Junk/*/*", "--ham /var/vmail/*/*/Maildir/cur/*") is simple. However, /var/vmail is owned by vmail:vmail with mode 700.

What is the best way to integrate:

1) run sa-learn as vmail instead of amavis?

2) run sa-learn as root instead of amavis?

3) figure out how to make /var/vmail vmail:amavis 750 for all adds/changes done via ISPConfig?

By: wjk940

After digging into sa-learn in an amavisd-new, spamassassin, postfix, dovecot system, I decided to move the discussion to the ISPConfig 3 -> Installation/Configuration community forum [1].

[1] https://www.howtoforge.com/community/threads/sa-learn-how-to-resolve-permission-issue.73056/

By: Bernard

My solution for the ham learning process:

find /var/vmail/*/* -type d -not -path "*.Spam*" -not -path "*.Junk*" -not -path "*.Trash*" -not -path "*new*" -not -path "*tmp*" -not -path "*.Sent*" -not -path "*.Archive*" -not -path "*Maildir/cur*" -not -path "*dovecot*" -not -path "*sieve*" -not -path "*quotausage*" -not -path "*courier*" -exec /usr/bin/sa-learn --ham {} \;
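Before letting a filter like this loose on sa-learn, it can help to do a dry run that only prints the directories it selects. The sketch below uses the same exclusions as the command above but ends in -print instead of -exec; list_ham_dirs is a made-up name. Note that patterns such as "*new*" and "*tmp*" match anywhere in the path, so a domain or user whose name contains "new" or "tmp" would be skipped entirely, which a dry run makes visible.

```shell
# Sketch only: list_ham_dirs is an illustrative helper name.
# list_ham_dirs START...
#   Prints the directories the ham filter would hand to sa-learn.
#   Nothing is learned.
list_ham_dirs() {
    find "$@" -type d \
        -not -path "*.Spam*" -not -path "*.Junk*" -not -path "*.Trash*" \
        -not -path "*new*" -not -path "*tmp*" \
        -not -path "*.Sent*" -not -path "*.Archive*" \
        -not -path "*Maildir/cur*" \
        -not -path "*dovecot*" -not -path "*sieve*" \
        -not -path "*quotausage*" -not -path "*courier*" \
        -print
}

# Example:
# list_ham_dirs /var/vmail/*/*
```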

By: Flash

For CentOS 7 the paths are as follows:

SpamAssassin local.cf: /etc/mail/spamassassin/local.cf

Bayes DB location: /var/spool/amavisd/.spamassassin/bayes

So edit the file /etc/mail/spamassassin/local.cf and add "bayes_path /var/spool/amavisd/.spamassassin/bayes".

By: Rado Hrabcak

Hi there,

I'm trying to figure out if there is a way for ISPConfig 3 + SpamAssassin to isolate Bayesian filters at the domain level. What do I mean by that?

I have a customer who is pretty good at sorting false negatives (spam in the Inbox) and false positives (legitimate mail in the Spam folder), but say I have 10 other customers who are not doing a good job of sorting those false positives/negatives.

Now, suppose I train the Bayesian filter on the genuine spam in the Spam folder of the one good customer. I assume that if I then run sa-learn --ham over the inboxes of the other 10 "bad" customers, it will simply outweigh what was learned from the good one.

Is there a way to isolate this per customer (domain) or even per user?

Thanks!