HowtoForge Forums | HowtoForge - Linux Howtos and Tutorials

sa-learn seems not to work (http://www.howtoforge.com/forums/showthread.php?t=52421)

viniciusmassuchetto 23rd April 2011 16:06

sa-learn seems not to work
 
I'm using ISPConfig 3 on Debian Lenny.

I have a lot of messages collected in a "global Junk" mail folder. Things seem to go well when I run:

Code:

sa-learn --spam --dir .Junk/cur/
But even after learning tons of messages that are supposed to be spam, the next day the same kind of messages reach our servers again without being marked, as if nothing had been learned.

Also, the configuration in /etc/amavis/conf.d/50-user differs from the settings in the ISPConfig panel. For example, I can see from the amavis logs that "$sa_tag_level_deflt" and "$spam_quarantine_to" in that file are completely ignored, and that ISPConfig uses the values in its "spamfilter_policy" database table instead.

Not sure if things are related, but like that I can't figure out where ISPConfig tells amavis + sa to get the learned rules from.

Many Thanks

viniciusmassuchetto 24th April 2011 01:14

Maybe I was doing it wrong. I was actually creating the bayes database in the /root/.spamassassin/ folder. As I'm with amavis integrated, the right folder seems to be /var/lib/amavis/.spamassassin.

So I pointed sa-learn at this folder with the --dbpath option, and it seemed to grow the database, as the bayes_toks file increased by almost 2 MB.
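For anyone following along, the --dbpath call described above might look roughly like the sketch below. Both paths are assumptions (the mailbox is hypothetical), and the value is written in "bayes_path form", i.e. the directory plus the "bayes" file prefix. The command is skipped when sa-learn or the folder is missing.

```shell
#!/bin/sh
# Assumed locations; adjust both before use.
BAYES_PATH="/var/lib/amavis/.spamassassin/bayes"          # directory plus "bayes" file prefix
SPAM_DIR="/var/vmail/example.com/user/Maildir/.Junk/cur"  # hypothetical Junk folder

if command -v sa-learn >/dev/null 2>&1 && [ -d "$SPAM_DIR" ]; then
    # Learn every message in the folder as spam into the given database.
    sa-learn --dbpath "$BAYES_PATH" --spam "$SPAM_DIR"
    learned=yes
else
    echo "sa-learn or $SPAM_DIR not found; nothing learned" >&2
    learned=no
fi

# Compare size and modification time of the database files before and after.
ls -l "${BAYES_PATH}_toks" "${BAYES_PATH}_seen" 2>/dev/null || true
```

If this is run as root, it is worth checking who owns the bayes_* files afterwards, since amavis needs to be able to write to them.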

After this, I went to the ISPConfig panel and added some spamfilter rules to the black/whitelist. The bayes_toks file in the amavis folder then went back to the size it had before I ran sa-learn, as I could see from the modification time.

After all... what's the right way of learning spam with ISPConfig?

cbj4074 20th August 2012 20:18

I have the exact same question:

How is one supposed to train SpamAssassin, manually, using the "sa-learn" executable when using Dovecot + Amavis + SpamAssassin + ISPConfig?

The original poster's attempts to flag spam were failing because he was executing the "sa-learn" executable as the "root" user, so the Bayesian tokens were not being added to the effective user's (amavis's) database. (The tokens were being added to the "root" user's database.)

Quote:

Not sure if things are related, but like that I can't figure out where ISPConfig tells amavis + sa to get the learned rules from.
Quote:

As I'm with amavis integrated, the right folder seems to be /var/lib/amavis/.spamassassin
I have found this to be the case as well. How? By discovering that SpamAssassin's "bayes_path" directive is not defined anywhere on the system in question, and the relevant source code indicates that the default value is ~/.spamassassin/bayes, which, for the "amavis" user (whose home directory is /var/lib/amavis), should translate to /var/lib/amavis/.spamassassin in the normal course of events.
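A rough sketch of how one might repeat that check: search the usual configuration directories for an explicit bayes_path, and otherwise print the default derived from the amavis user's home directory. The directory list and the helper name home_of are my own assumptions, and the lookup only covers local accounts in /etc/passwd.

```shell
#!/bin/sh
# Look up an account's home directory in /etc/passwd
# (assumes a local account, not LDAP/NIS). Hypothetical helper.
home_of() {
    awk -F: -v user="$1" '$1 == user { print $6 }' /etc/passwd
}

# Search the usual config locations for an explicit bayes_path.
# The directory list is an assumption; add your own if they differ.
if grep -rns '^[[:space:]]*bayes_path' /etc/spamassassin /etc/mail/spamassassin; then
    echo "bayes_path is set explicitly (see the matches above)"
else
    amavis_home=$(home_of amavis)
    echo "no bayes_path found; default would be ${amavis_home:-<amavis home>}/.spamassassin/bayes"
fi
```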

I tried the following:

Code:

# su amavis -c 'sa-learn --spam "/var/vmail/example.com/sa-training/Maildir/.INBOX.Spam"'

archive-iterator: no access to /var/vmail/example.com/user/Maildir/cur: 13 at /usr/share/perl5/Mail/SpamAssassin/ArchiveIterator.pm line 539.
archive-iterator: no access to /var/vmail/example.com/user/Maildir/cur: 13 at /usr/share/perl5/Mail/SpamAssassin/ArchiveIterator.pm line 771.
archive-iterator: unable to open /var/vmail/example.com/user/Maildir/cur: 13

This does not work because the permissions on each user's "mail directory" (e.g., Maildir) are 700, with vmail:vmail ownership. Adding the "amavis" user to the "vmail" group will not solve the problem, due to the 700 permissions.
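To illustrate why group membership makes no difference here, the following sketch recreates the 700 mode on a scratch directory (nothing under /var/vmail is touched) and prints the group bits:

```shell
#!/bin/sh
# Scratch directory standing in for a mailbox.
demo=$(mktemp -d)
mkdir "$demo/Maildir"
chmod 700 "$demo/Maildir"

# In the ls mode string, characters 2-4 are the owner bits,
# 5-7 the group bits and 8-10 the bits for everyone else.
perms=$(ls -ld "$demo/Maildir" | cut -c1-10)
group_bits=$(printf '%s' "$perms" | cut -c5-7)
echo "$perms (group: $group_bits)"

if [ "$group_bits" = "---" ]; then
    echo "group members get no access, so joining the vmail group cannot help"
fi

rm -rf "$demo"
```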

Am I missing something obvious?

Thank you.

UPDATE:

Indeed I was missing something "obvious".

The solution is to include the --username switch to the 'sa-learn' executable, e.g.:

Code:

# sa-learn --username=amavis --spam /var/vmail/example.com/trainer/Maildir/.Spam/cur
This enables one to execute the command as "root" or "vmail", which provides for the necessary permissions, while at the same time adding the tokens to the "amavis" user's database.

cbj4074 20th August 2012 20:41

Also, I am curious to know if ISPConfig configures Amavis to maintain a separate Bayes token database for each virtual mail user (e.g., within a database).

Or, does ISPConfig configure Amavis to use a single Bayes database, e.g., that in /var/lib/amavis/.spamassassin?

till 21st August 2012 09:37

A single bayes database is used as far as I know. There is no special configuration in ISPConfig for this, so the defaults of the Linux distribution on which you installed the system are used.

cbj4074 21st August 2012 22:00

Thanks, Till!

Do you happen to know whether or not it is necessary to restart Amavis for changes to the Bayes database to be effective?

I realize that SpamAssassin is accessed on-demand when used with Amavis, but it's not clear whether the Bayes values are loaded once when Amavis is started, or whether a look-up is performed against whatever data exists in the Bayes database with Amavis's every request to SpamAssassin.

Thanks again.

cbj4074 27th August 2012 16:47

Quote:

Originally Posted by cbj4074 (Post 283973)
UPDATE:

Indeed I was missing something "obvious".

The solution is to include the --username switch to the 'sa-learn' executable, e.g.:

Code:

# sa-learn --username=amavis --spam /var/vmail/example.com/trainer/Maildir/.Spam/cur
This enables one to execute the command as "root" or "vmail", which provides for the necessary permissions, while at the same time adding the tokens to the "amavis" user's database.

Actually, this was not the solution; I was mistaken.

Users on the SpamAssassin mailing list pointed out that the --username switch is intended for use with virtual user configurations, e.g., those tied to a SQL database of some kind. (It's worth noting that using an invalid --username doesn't throw a warning or error, and seems to use the current username instead.)

The solution was to "hard-code" the SpamAssassin Bayes database location in the configuration file (typically /etc/spamassassin/local.cf on Debian/Ubuntu systems):

Code:

bayes_path /var/lib/amavis/.spamassassin/bayes
With this directive in place, the sa-learn command will always use the specified database (unless the --username argument is provided [and is valid]).

To ensure that the correct database is being used:

Code:

# spamassassin -D -t < /usr/share/doc/spamassassin/examples/sample-spam.txt 2>&1 | egrep '(bayes:|whitelist:|AWL)'

[...]
dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks
dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen
[...]

The directive is having the intended effect; even though the command is executed as "root", the Amavis user's database file is used.

Now, when training SpamAssassin, the sa-learn executable can be called as the "root" user, which allows for access to the mailboxes in /var/vmail while at the same time populating the correct Bayes database (Amavis's).
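Put together, a training run as "root" might look something like the sketch below. It is a dry run by default and only prints the commands. The ham folder is a hypothetical path, and the chown step rests on my assumption that files touched by root can end up root-owned, which would stop Amavis from updating them.

```shell
#!/bin/sh
# Paths are examples; substitute your own mailboxes.
BAYES_DIR="/var/lib/amavis/.spamassassin"
SPAM_DIR="/var/vmail/example.com/trainer/Maildir/.Spam/cur"
HAM_DIR="/var/vmail/example.com/trainer/Maildir/cur"   # hypothetical ham folder

# Dry run by default: print each command instead of executing it.
# Call the script with DRY_RUN=0 (as root) to execute for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run sa-learn --spam "$SPAM_DIR"
run sa-learn --ham "$HAM_DIR"
# Assumption: hand ownership back to amavis in case root now owns the files.
run chown -R amavis:amavis "$BAYES_DIR"
# Show the nspam/nham counters of the database that was used.
run sa-learn --dump magic
```

Bayes needs ham as well as spam, so it is worth feeding it a folder of known-good mail too.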

mattltm 15th October 2013 13:43

Is there a way to check that bayes is actually working for incoming email?

I have over 55000 example emails in my database but still get the same type of spam through.

cbj4074 21st April 2014 17:04

Yes, there is.

My recommendation is to read through the relevant bits of the thread at http://www.gossamer-threads.com/list...t=spamassassin .

The thread is long, but very worthwhile when it comes to understanding how SpamAssassin, Bayes, and AMaViS function together.

Note in particular the link cited in the first post of the above thread; that also contains a wealth of relevant and useful information where this problem is concerned.
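As a quick first check before digging into those threads, it may help to look at the counters in the database itself. By default, SpamAssassin does not apply Bayes scores until it has learned at least 200 spam and 200 ham messages (bayes_min_spam_num and bayes_min_ham_num). The sketch below parses output in the layout printed by "sa-learn --dump magic". The sample numbers are made up, and magic_count is a hypothetical helper name.

```shell
#!/bin/sh
# Print the counter named $1 (nspam, nham, ntokens) from
# "sa-learn --dump magic" output supplied on stdin.
magic_count() {
    awk -v key="$1" '$NF == key { print $3 }'
}

# Sample output with made-up numbers, for illustration only.
sample='0.000          0          3          0  non-token data: bayes db version
0.000          0      55000          0  non-token data: nspam
0.000          0         12          0  non-token data: nham'

nspam=$(printf '%s\n' "$sample" | magic_count nspam)
nham=$(printf '%s\n' "$sample" | magic_count nham)
echo "nspam=$nspam nham=$nham"

# Default thresholds are 200 for each class.
if [ "$nspam" -ge 200 ] && [ "$nham" -ge 200 ]; then
    echo "enough of both classes for Bayes to score mail"
else
    echo "Bayes stays inactive until both counters reach 200"
fi
```

On a live system the input would come from something like "sa-learn --dump magic | magic_count nham", run so that it reads the Amavis database.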

If those resources don't lead you to the answer, let me know...

mattltm 22nd April 2014 21:31

I've taken a look through and it seems that the X-SPAM header is not getting added to any of my incoming emails.

How do I add them?


All times are GMT +2.