General problem with ISPConfig mail statistics and quota

Discussion in 'General' started by bolero, Mar 15, 2010.

  1. bolero

    bolero New Member

    Every once in a while when a user reaches the quota limit I see CPU usage skyrocket on these servers and lots of postdrop/sendmail processes trying to do deliveries. I don't know exactly what happens, but it seems, that the attempted delivery (that failed because of over-quota) triggers a statistics message to admispconfig and this needs a temporary file owned by this user written to the postfix queue or so. As this is on /var it fails. And somehow this seems to trigger a whole chain of these messages, basically it creates a loop. Thus you get more and more messages which are undeliverable, but cannot get rejected as they are created on the server.
    So, this seems to be a genuine problem of how thus functionality is implemented.
    First question: is there a way to stop this without stopping the mail statistics?
    Second question: if not, is there a way to stop the statistics only for that user?
    Third question: if I have to switch it off globally, where is this option?

    Here are some examples.

    some top output:
    533 web25p1 20 0 2428 768 764 S 0.0 0.1 0:00.00 sh
    538 web25p1 19 0 6908 1408 1404 S 0.0 0.1 0:00.00 sendmail
    539 web25p1 16 0 6880 1488 1416 S 0.0 0.1 0:00.50 postdrop
    625 web25p1 20 0 2428 848 844 S 0.0 0.1 0:00.00 sh
    628 web25p1 22 0 6908 1408 1404 S 0.0 0.1 0:00.01 sendmail
    635 root 14 -5 0 0 0 S 0.0 0.0 0:00.00 kmpathd/0
    636 root 14 -5 0 0 0 S 0.0 0.0 0:00.00 kmpath_handlerd
    645 web25p1 15 0 6880 1484 1416 S 0.0 0.1 0:00.26 postdrop
    661 root 10 -5 0 0 0 S 0.0 0.0 0:49.19 kjournald
    662 web25p1 20 0 2428 768 764 S 0.0 0.1 0:00.00 sh
    665 web25p1 20 0 6908 1408 1404 S 0.0 0.1 0:00.00 sendmail
    666 web25p1 15 0 6880 1484 1416 S 0.0 0.1 0:00.48 postdrop
    719 web25p1 21 0 2428 772 768 S 0.0 0.1 0:00.00 sh
    722 web25p1 20 0 6908 1408 1404 S 0.0 0.1 0:00.01 sendmail
    723 web25p1 15 0 6880 1484 1416 S 0.0 0.1 0:00.29 postdrop
    782 web25p1 23 0 2428 768 764 S 0.0 0.1 0:00.00 sh
    787 web25p1 22 0 6908 1408 1404 S 0.0 0.1 0:00.00 sendmail
    788 web25p1 15 0 6880 1488 1416 S 0.0 0.1 0:00.30 postdrop
    999 root 15 0 1740 564 484 S 0.0 0.1 3:13.36 syslogd
    1002 root 15 0 1692 384 340 S 0.0 0.0 0:17.34 klogd
    1004 web25p1 20 0 2428 768 764 S 0.0 0.1 0:00.00 sh
    1012 web25p1 20 0 6908 1408 1404 S 0.0 0.1 0:00.00 sendmail
    1013 web25p1 15 0 6880 1488 1416 S 0.0 0.1 0:00.74 postdrop
    1015 root 15 0 2184 444 392 S 0.0 0.0 1:17.83 mcstransd
    1038 root 12 -5 0 0 0 S 0.0 0.0 0:00.00 rpciod/0
    1045 root 15 0 5536 264 236 S 0.0 0.0 0:00.02 rpc.idmapd
    1058 dbus 20 0 13016 536 532 S 0.0 0.1 0:00.00 dbus-daemon
    1079 root 15 0 7092 704 608 S 0.0 0.1 0:00.13 sshd
    1090 web25p1 19 0 2428 844 840 S 0.0 0.1 0:00.00 sh
    1093 web25p1 19 0 6908 1408 1404 S 0.0 0.1 0:00.00 sendmail
    1094 web25p1 15 0 6880 1492 1416 S 0.0 0.1 0:00.67 postdrop

    this happens after upping the quota for web26p4 (shot from above is already after doing so, that's why it's not listed):

    d11:/var/log grep web26p4 maillog|tail
    Mar 15 12:05:26 d11 postfix/qmgr[20745]: 55E9CEC07D: from=<[email protected]>, size=372, nrcpt=1 (queue active)
    Mar 15 12:05:26 d11 postfix/pickup[31021]: 89467EC07E: uid=10306 from=<web26p4>
    Mar 15 12:05:26 d11 postfix/qmgr[20745]: 6EC96EC077: from=<[email protected]>, size=372, nrcpt=1 (queue active)
    Mar 15 12:05:26 d11 postfix/pickup[31021]: 9FDCFEC079: uid=10306 from=<web26p4>
    Mar 15 12:05:26 d11 postfix/qmgr[20745]: 89467EC07E: from=<[email protected]>, size=372, nrcpt=1 (queue active)
    Mar 15 12:05:26 d11 postfix/pickup[31021]: B86A5EC07F: uid=10306 from=<web26p4>
    Mar 15 12:05:26 d11 postfix/qmgr[20745]: 9FDCFEC079: from=<[email protected]>, size=372, nrcpt=1 (queue active)
    Mar 15 12:05:26 d11 postfix/pickup[31021]: D8ED82002C: uid=10306 from=<web26p4>
    Mar 15 12:05:26 d11 postfix/qmgr[20745]: B86A5EC07F: from=<[email protected]>, size=372, nrcpt=1 (queue active)
    Mar 15 12:05:26 d11 postfix/qmgr[20745]: D8ED82002C: from=<[email protected]>, size=373, nrcpt=1 (queue active)

    this finally reveals that this is a message sent to admispconfig:

    d11:/var/log grep D8ED82002C maillog
    Mar 15 12:05:26 d11 postfix/pickup[31021]: D8ED82002C: uid=10306 from=<web26p4>
    Mar 15 12:05:26 d11 postfix/cleanup[26877]: D8ED82002C: message-id=<[email protected]>
    Mar 15 12:05:26 d11 postfix/qmgr[20745]: D8ED82002C: from=<[email protected]>, size=373, nrcpt=1 (queue active)
    Mar 15 12:05:27 d11 postfix/local[30475]: D8ED82002C: to=<[email protected]>, relay=local, delay=428386, delays=428385/0.22/0/0.03, dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail -f-)
    Mar 15 12:05:27 d11 postfix/qmgr[20745]: D8ED82002C: removed

    Related question: is there another way to find out what these messages hanging in sendmail/postdrop actually are than tracking them in the logs? mailq doesn't show them.
     
  2. till

    till Super Moderator Staff Member ISPConfig Developer

    I've never seen this on any other server, so I think it is not a general problem of the statistics implementation. Admispconfig is no real mailbox, it is a statistics script that receives a message for every incoming email. This message contains just the size of the original email in bytes. Take a look at the procmail recipes in the users home directory to see how this works in detail.
     
  3. bolero

    bolero New Member

    I fear you didn't notice this because you didn't have users that stayed over quota for months. I noticed that of the three users that are creating this now two have been over quota since last year. And the monitoring noticed it two days ago when the load average went to 50 for about ten minutes. I had killed all the postdrop processes then. At the moment I have again
    ps ax|grep postdrop|wc -l
    44
    And there is not a problem yet. Just that I have totally unneccessary ressource usage for that and 44 postdrops running again and again. The number of postdrop processes will increase over time and eventually have CPU usage skyrocket and have it bite into swap. I have seen this earlier at least twice on another VM. Both have ispconfig installed. I didn't investigate it on the first one further but now I'm sure it is a direct result of the admispconfig mails and I consider this a bug. These mails should get disabled if an account is over quota, for instance by a daily cronjob.
    Please check this. It should be easily reproducable. Just put one user over quota and send him a mail. This should get you a bounce and create one postdrop and one sendmail process for that tiny mail. And if you send another mail you get another "duo". It then just depends on how much mail the user gets how fast this increases ... If this is a wildcard account with a lot of spam coming in and not monitored by the user it can increase rapidly.
     
    Last edited: Mar 18, 2010
  4. bolero

    bolero New Member

    You didn't answer my questions. How can I stop this? Only by editing the .procmailrc or the mailsize.rc file? Isn't there an option to stop this from the GUI? If I edit the recipe it can get recreated if the user changes his mail settings.
     
  5. till

    till Super Moderator Staff Member ISPConfig Developer

    I dont think so. We are talking here about several ten thousand of ISPConfig 2 installations and you are the only one that reported such an issue. On my own servers I have some users that are over quota for years now and I never had this problem.

    I just tested it on my server and I can not reproduce it.
     
  6. till

    till Super Moderator Staff Member ISPConfig Developer

    You can edit either the procmailrc or the mailsize script. If you want to edit it for all new and updated users, then change the template too in /root/ispconfig/isp/conf/
     
  7. bolero

    bolero New Member

    You cannot reproduce what? You do not get a postdrop and sendmail process for the admispconfig mail if the user is over quota?
     
  8. till

    till Super Moderator Staff Member ISPConfig Developer

    I can not reproduce high resource usage, any kind of recurring emails or any negative effect on the system when a account is over quota.
     
  9. bolero

    bolero New Member

    Ok, but that was not my question. Do you get a postdrop and sendmail process for the admispconfig mail if the user is over quota? If not, I'd be very surprised. If yes, does it keep running infinitely if not killed?
    The "negative effect" comes only when you have enough of these processes on the system, that should be clear.
     
  10. till

    till Super Moderator Staff Member ISPConfig Developer

    Postfix is delivering every email by using postdrop process, it does not matter if the account is over quota or not. So yes, you get such a process for every email delivered on a ispconfig system.

    There are no hanging processes, they get closed immediately as they should. Thats what I told you with no negative effect.
     
  11. till

    till Super Moderator Staff Member ISPConfig Developer

    What you might want to check if there is accidently a quota set for the admispconfig user or group on your system. Use:

    repquota -avug | grep ispconfig
     
  12. bolero

    bolero New Member

    Ok, thanks for that clear answer. What is the OS and postfix version? I'm experiencing this on CentOS 5.4 with stock postfix.
    I don't see that it should get closed immediately as you write.
    Let's walk that thru:
    - user is over quota
    - mail arrives
    - procmail runs the recipe and creates a mail to admispconfig account
    - that mail has to be written to the system by maildrop with owner permissions of that user account
    - however, that user account is over quota
    - the mail cannot be written with "warning: mail_queue_enter: create file maildrop/589934.4099: Disk quota exceeded"
    - but it can also not be bounced as it didn't even enter the queue
    - result: postdrop and sendmail stay open trying to write the file again and again
    - procmail bounces the original mail

    That makes sense and that is what happens here. If that doesn't happen on your systems my question is: where does the mail go? It cannot just fade into thin air. Does admispconfig get it despite the fact that postdrop should not be able to write it? Does it somehow get bounced, to where? Does your postfix have a safeguard against this?
     
  13. bolero

    bolero New Member

    No quota set. The problem is the quota of that account. The mail to admispconfig should get written to the mail queue, but with the over-quota user as the owner I suppose. And this can't be done as this user is over quota. If you cannot see this there must either be a postfix config switch that tells it to stop this kind of hanging process after a while or some kind of automatic safeguard in your postfix. Or it's not trying to write that file under that user's owner.
     
  14. till

    till Super Moderator Staff Member ISPConfig Developer

    The files in the mailqueue are owned by the user postfix and not the owner of the mailbox. You should check if the /var/ partition is full if postfix on your system is not able to store data in its queue anymore. You should also check the mail log for errors.
     
  15. bolero

    bolero New Member

    Obviously not if invoked from a user account. Procmail invokes formail under the user's credentials. This then seems to invoke sendmail and that postdrop and in that chain the user credentials are kept. So, obviously this file then wants to get written by a process owned by that user or with file ownership by that user. I don't see another explanation why it couldn't write. It probably wants to get written in /var/spool/postfix/maildrop.
    I would be happy if you came up with a better explanation. I don't see one at the moment. So, this is a direct and deducable result of the admispconfig mail if the user is over quota.

    No such problem. I repeat: this is a normally working system with no problems. The behavior happens with an account that is over quota. Only. And I can see it on other systems when the user is over quota as well.

    Here is some more information from the log. This is what the incoming message creates (this is one log entry):
    Mar 18 12:59:20 d11 postfix/local[13972]: 407A720059: to=<[email protected]>, orig_to=<[email protected]>, relay=local, delay=391, delays=0.31/0/0/390, dsn=5.2.0, status=bounced (can't create user output file. Command output: procmail: Couldn't create "/var/mail/web26p4" postdrop: warning: mail_queue_enter: create file maildrop/723106.13986: Disk quota exceeded [14000] warn: auto-whitelist: open of auto-whitelist file failed: locker: safe_lock: cannot create tmp lockfile /var/www/web26/user/web26p4/.spamassassin/auto-whitelist.lock.example.com.14000 for /var/www/web26/user/web26p4/.spamassassin/auto-whitelist.lock: Disk quota exceeded procmail: Couldn't create or rename temp file "/var/www/web26/user/web26p4/Maildir/tmp/1268913170.13981_2.example.com" postdrop: warning: mail_queue_enter: create file maildrop/723863.13986: Disk quota exceeded postdrop: warning: mail_queue_enter: create file maildrop/724487.13986: Disk quota exceeded postdrop: warning: mail_queue_enter: create file

    Look at the last "lines". There it repeats the error because it tries and tries and tries. The log entry is just cut there because the buffer is full. 13986 is the postdrop process for the message to admispconfig and it's still there:
    ps waux|grep postdrop
    web26p4 4099 0.0 0.1 6880 1772 ? S 12:22 0:00 /usr/sbin/postdrop -r
    web26p4 13986 0.0 0.1 6880 1776 ? S 12:52 0:00 /usr/sbin/postdrop -r
    root 21992 0.0 0.0 3932 680 pts/0 R+ 13:17 0:00 grep postdrop

    And since that file arrived (half an hour ago) it tried 235 times to write:
    grep "13986" maillog|wc -l
    235
    The only thing you get for these messages is this:
    postfix/postdrop[13986]: warning: mail_queue_enter: create file maildrop/378273.13986: Disk quota exceeded

    That makes it hard to find out what's actually happening as not even a user is attached to this log line and that's why I didn't investigate it deeper the first time I saw it.
     
    Last edited: Mar 18, 2010
  16. till

    till Super Moderator Staff Member ISPConfig Developer

    Maybe on centos. But thats not the case on debian, the postfix queue folders are only writable by the user postfix, so everything stored in the queue is owned by postfix an no any other user.

    Please post the content of the postfix main.cf file.
     
  17. bolero

    bolero New Member

    I don't think that is true. Only the postfix user or the postdrop group can write to it. postdrop *is* definitely running as that user and not as postfix. So, it's probably running as group postdrop (I can't figure out how to show group with ps, but it's safe to assume, otherwise postdrop couldn't write to maildrop.
    Yes, here it is: setgid_group = postdrop

    This is the process tree. As you can see, it starts with user web26p4 and that doesn't get changed. The postdrop is still running as that user. You can also see this in the ps output of a postdrop that I posted earlier.

    sh,4093,web26p4 -c\040wc\040-c\040|\040formail\040-A"X-Loop:\[email protected]"\040\\012\040\040\040\040\040
    └─sendmail,4098 -oi [email protected]omain
    └─postdrop,4099 -r

    I don't think that this can be controlled by main.cf. Or which setting(s) are you specifically looking for? This is a standard config just with some anti-spam tweaks in the various _restrictions.
     
  18. bolero

    bolero New Member

    Try this with a user that is over quota. (I gave him a shell first, of course.)

    su - web26p4
    -sh-3.2$ id
    uid=10306(web26p4) gid=10026(web26) groups=100(users),10026(web26) context=root:system_r:unconfined_t:SystemLow-SystemHigh
    -sh-3.2$ sendmail admispconfig
    -sh: sendmail: command not found
    -sh-3.2$ /usr/sbin/sendmail admispconfig
    postdrop: warning: mail_queue_enter: create file maildrop/941276.21247: Disk quota exceeded
    -sh-3.2$ /usr/sbin/sendmail root
    postdrop: warning: mail_queue_enter: create file maildrop/655744.21474: Disk quota exceeded
    postdrop: warning: mail_queue_enter: create file maildrop/657852.21474: Disk quota exceeded
    postdrop: warning: mail_queue_enter: create file maildrop/659769.21474: Disk quota exceeded

    bang! It keps going until I kill it.

    What do you get on a debian system if you do that?
     
  19. bolero

    bolero New Member

    So, what does your Ubuntu system say if you try that?
     

Share This Page