Postfix Dies Frequently.


chimaster
14th November 2007, 21:56
Hello everyone, I've been trying to work out why my smtpd keeps dying almost every 15 minutes! I either have to restart it manually or wait 20 minutes or so for it to come back to life.

I'm really not getting much in my logs to diagnose this, so any advice is most welcome. All I can really find in my mail.err is the following; other than that, everything looks fine. I'm running SUSE 10.0 and followed Falko's excellent Perfect Setup guide with ISPConfig.


--mail.err--

Nov 14 15:15:09 charlotte amavis[13035]: (13035-10) TROUBLE in process_request: Error writing a SMTP response to the socket: Broken pipe at (eval 53) line 813, <GEN75> line 397.
Nov 14 16:15:08 charlotte amavis[18258]: (18258-05) TROUBLE in process_request: Error writing a SMTP response to the socket: Broken pipe at (eval 53) line 813, <GEN34> line 804.
Nov 14 17:15:03 charlotte amavis[22348]: (22348-10) TROUBLE in process_request: Error writing a SMTP response to the socket: Broken pipe at (eval 53) line 813, <GEN73> line 54.

----
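
One detail in the excerpt above that is easy to miss: the three amavis errors are almost exactly one hour apart. A minimal Python sketch, assuming the standard syslog timestamp layout and the year 2007 taken from the post date (syslog lines carry no year), computes the gaps. The helper names are made up for illustration, and the message text is shortened because only the timestamp is parsed.

```python
from datetime import datetime

# Timestamps copied from the mail.err excerpt; message text shortened.
LOG_LINES = [
    "Nov 14 15:15:09 charlotte amavis[13035]: (13035-10) TROUBLE in process_request",
    "Nov 14 16:15:08 charlotte amavis[18258]: (18258-05) TROUBLE in process_request",
    "Nov 14 17:15:03 charlotte amavis[22348]: (22348-10) TROUBLE in process_request",
]


def parse_syslog_time(line, year=2007):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is an assumption."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def gaps_in_seconds(lines):
    """Return the time difference between each pair of consecutive log lines."""
    times = [parse_syslog_time(line) for line in lines]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

For these three lines the gaps are 3599 and 3595 seconds, so whatever triggers the broken pipe may be on an hourly schedule (a cron job or a periodic client, for instance). That is only a guess from three data points.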


Thanks in advance.

falko
15th November 2007, 17:03
Are there no other errors in any of the logs?

chimaster
15th November 2007, 22:20
Hey Falko,

It's a funny thing, I'm not seeing any other errors. I just resolved this warning from mail.err:
WARN: all primary virus scanners failed, considering backups

Which was just a case of uncommenting

### http://www.clamav.net/ - backs up clamd or Mail::ClamAV
['ClamAV-clamscan', 'clamscan',
"--stdout --disable-summary -r --tempdir=$TEMPBASE {}", [0], [1],
qr/^.*?: (?!Infected Archive)(.*) FOUND$/ ],

within /etc/amavisd.conf

There is nothing that shows up as an error as such. The postfix daemon seems to time out doing something, die, and then restart 5 minutes later. Previously this happened only about 4 or 5 times a day; now it's gone up to almost every 20 to 30 minutes.

The only things that have changed (as far as I know):

1. I doubled the RAM in my system (now 2GB).
2. I added some RBLs based on the HowtoForge Postfix spam-fighting doc. (However, I was having issues prior to this, and it was part of my troubleshooting process.)
3. I've added queue trees to my MikroTik firewall, which may be an issue, but:
a. There are no queued bytes on the SMTP_Queue (or any queue).
b. It's passing traffic fine.
c. There are only ever 5 to 25 concurrent connections to my server's SMTP port.
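
Since RBLs were one of the recent changes, it may be worth timing the DNS lookups they cause. Each reject_rbl_client check is a DNS query for the client's IP address with its octets reversed, placed under the list's zone, so a slow or unreachable resolver can hold up an SMTP session. The Python sketch below is a hypothetical helper (not part of Postfix) that builds that query name for an IPv4 address and times the lookup. The zone list is the set of uncommented entries from the main.cf posted later in this thread.

```python
import socket
import time

# Uncommented reject_rbl_client zones from the posted main.cf.
ACTIVE_ZONES = [
    "dul.dnsbl.sorbs.net",
    "sbl-xbl.spamhaus.org",
    "bl.spamcop.net",
]


def dnsbl_query_name(ipv4, zone):
    """Build the DNSBL query name: reversed octets followed by the zone."""
    return ".".join(reversed(ipv4.split("."))) + "." + zone


def time_lookup(name):
    """Resolve a name and return (resolved, seconds taken)."""
    start = time.monotonic()
    try:
        socket.gethostbyname(name)
        resolved = True
    except OSError:  # not listed, or the resolver failed
        resolved = False
    return resolved, time.monotonic() - start


def report(ipv4="127.0.0.2"):
    """Print how long each active zone takes to answer for one address."""
    for zone in ACTIVE_ZONES:
        resolved, seconds = time_lookup(dnsbl_query_name(ipv4, zone))
        print(f"{zone}: resolved={resolved} in {seconds:.2f}s")
```

Calling report() on the mail server itself shows whether any one list takes seconds rather than milliseconds to answer. The default 127.0.0.2 is the conventional test address that most lists answer for; whether slow lookups are actually behind the hangs here is not established.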

As a final note, the amavis socket error seems to occur only once or twice a day.

Thanks in advance.

chimaster
15th November 2007, 23:00
We'll see how we go, but I just got sending over authenticated connections working again by commenting out

#unknown_relay_recipient_reject_code = 554
#unknown_sender_reject_code = 554

in /etc/postfix/main.cf

I don't think this was related to the issues I was having previously with postfix dying; however, it would explain my monitoring system freaking out. :-)

Any comments on the following main.cf are appreciated, bearing in mind that I followed the Perfect Setup for SuSE 10 and then added the RBL stuff from the postfix article. All my users send out with SMTP_AUTH set up via ISPConfig. I also provide an "open" server to my wireless network for casual users.

readme_directory = /usr/share/doc/packages/postfix/README_FILES
inet_protocols = all
biff = no
mail_spool_directory = /var/mail
canonical_maps = hash:/etc/postfix/canonical
#virtual_maps = hash:/etc/postfix/virtual
relocated_maps = hash:/etc/postfix/relocated
transport_maps = hash:/etc/postfix/transport
sender_canonical_maps = hash:/etc/postfix/sender_canonical
masquerade_exceptions = root
masquerade_classes = envelope_sender, header_sender, header_recipient
myhostname = charlotte.$mydomain
program_directory = /usr/lib/postfix
inet_interfaces = all
masquerade_domains =
#mydestination = $myhostname, localhost.$mydomain
defer_transports =
disable_dns_lookups = no
#relayhost = 127.0.0.7
mailbox_command =
mailbox_transport =
strict_8bitmime = no
disable_mime_output_conversion = no
smtpd_sender_restrictions = hash:/etc/postfix/access
smtpd_client_restrictions =
smtpd_helo_required = yes
disable_vrfy_command = yes
smtpd_helo_restrictions =
strict_rfc821_envelopes = yes
invalid_hostname_reject_code = 554
multi_recipient_bounce_reject_code = 554
non_fqdn_reject_code = 554
relay_domains_reject_code = 554
unknown_address_reject_code = 554
unknown_client_reject_code = 554
unknown_hostname_reject_code = 554
unknown_local_recipient_reject_code = 554
#unknown_relay_recipient_reject_code = 554
#unknown_sender_reject_code = 554
unknown_virtual_alias_reject_code = 554
unknown_virtual_mailbox_reject_code = 554
unverified_recipient_reject_code = 554
unverified_sender_reject_code = 554
smtpd_recipient_restrictions =
reject_invalid_hostname,
reject_unknown_recipient_domain,
reject_unauth_pipelining,
permit_mynetworks,
permit_sasl_authenticated,
reject_unauth_destination,
# reject_rbl_client multi.uribl.com,
# reject_rbl_client dsn.rfc-ignorant.org,
reject_rbl_client dul.dnsbl.sorbs.net,
reject_rbl_client sbl-xbl.spamhaus.org,
reject_rbl_client bl.spamcop.net,
# reject_rbl_client dnsbl.sorbs.net,
# reject_rbl_client cbl.abuseat.org,
# reject_rbl_client ix.dnsbl.manitu.net,
# reject_rbl_client combined.rbl.msrbl.net,
# reject_rbl_client rabl.nuclearelephant.com,
permit
#smtpd_recipient_restrictions = permit_sasl_authenticated,permit_mynetworks,reject_unauth_destination
smtp_sasl_auth_enable = yes
smtpd_sasl_auth_enable = yes
smtpd_use_tls = yes
smtp_use_tls = yes
alias_maps = hash:/etc/aliases
mailbox_size_limit = 0
message_size_limit = 10240000
smtp_sasl_security_options =
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
mydomain = queenstownhq.co.nz
smtpd_sasl_local_domain =
smtpd_sasl_security_options = noanonymous
broken_sasl_auth_clients = yes
smtpd_tls_auth_only = no
smtp_tls_note_starttls_offer = yes
smtpd_tls_key_file = /etc/postfix/ssl/smtpd.key
smtpd_tls_cert_file = /etc/postfix/ssl/smtpd.crt
smtpd_tls_CAfile = /etc/postfix/ssl/cacert.pem
smtpd_tls_loglevel = 1
smtpd_tls_received_header = yes
smtpd_tls_session_cache_timeout = 3600s
tls_random_surce = dev:/dev/urandom

virtual_maps = hash:/etc/postfix/virtusertable

mydestination = /etc/postfix/local-host-names
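
A mistyped parameter name in main.cf is easy to overlook by eye in a listing this long. Here is a rough Python sketch that flags active parameter names that are close to, but not exactly, a name from a reference list. The reference list is a tiny hand-picked assumption for illustration; a real check would need the full parameter list for the installed Postfix version (for example from postconf output). The function names are hypothetical.

```python
import difflib

# Hand-picked subset of real Postfix parameter names (assumption: incomplete).
KNOWN_PARAMETERS = {
    "biff", "myhostname", "mydomain", "mydestination",
    "smtpd_use_tls", "smtp_use_tls",
    "smtpd_tls_key_file", "smtpd_tls_cert_file", "smtpd_tls_CAfile",
    "tls_random_source",
}


def active_parameters(main_cf_text):
    """Return the names of uncommented 'name = value' lines, in file order."""
    names = []
    for line in main_cf_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank line or comment
        if line[0].isspace():
            continue  # continuation of the previous parameter
        name, sep, _value = line.partition("=")
        if sep:
            names.append(name.strip())
    return names


def suspicious_names(main_cf_text, known=KNOWN_PARAMETERS, cutoff=0.9):
    """Map each unknown name to the known name it closely resembles."""
    suspects = {}
    for name in active_parameters(main_cf_text):
        if name in known:
            continue
        close = difflib.get_close_matches(name, sorted(known), n=1, cutoff=cutoff)
        if close:
            suspects[name] = close[0]
    return suspects
```

Fed the main.cf above, this should flag tls_random_surce as a probable misspelling of tls_random_source. Whether that has anything to do with the hangs is a separate question.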


Thanks!!

falko
16th November 2007, 18:45
Your main.cf looks ok, as far as I can see.

chimaster
16th November 2007, 23:21
Hey again Falko,

I think there are some issues with mail coming through due to 554 responses from the unknown_* reject codes, i.e. clients with internal IP addresses trying to send emails are being blocked. I'm going to read more about the unknown_* settings and check everything on my side.

My postfix is still not responding on occasion. But nothing in the logs. I'll read a bit more and post back when I find a result. In the meantime it's running. :-) and that's a good thing.

chimaster
22nd November 2007, 03:23
I thought I had it cracked, but problems persist.

If I restart postfix it will handle mail for a short while and then simply stop processing.

My mail log will show postfix doing the business for a bit and then simply stop processing (i.e. daemon has died) and only show pop3d logins. Then if I restart postfix it will start processing emails again for a short while and die.

My customers are now on the phone with me quite frequently about missing emails, not receiving mail for a period of time, or simply being unable to send until I do a restart, after which they can send for 5 minutes...

I'm really not sure how to progress with this one.... Any ideas? Let me know what is needed to troubleshoot it further.

---Update----

A quick update:- The Thick Plottens.

I'm getting lots of timeout errors when trying to connect from this server to the real world. From mailq:
C2506D9192 1185 Thu Nov 22 15:21:31 spike@example.com
(connect to gsmtp163.google.com[64.233.163.27]: Connection timed out)
bob@example.com

Also, if I telnet to port 25 on the DNS name or IP address of this server, I get the same issue. However, if I telnet from my PC or another server (running a similar system, Debian Etch instead of SUSE 10), I have no trouble at all. If I telnet into my server on the same network, I also get timeouts and delays.

It appears to be something with port 25 connectivity. I'm at a loss as to what.
Also, if I telnet to 127.0.0.1 25 I have timeouts and don't get the welcome banner.

If I restart postfix, I get an immediate response, but the same issues as above apply; this only lasts for a short time.
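
The missing banner on 127.0.0.1 can be checked repeatedly with a small script rather than by hand with telnet. This is a minimal Python sketch using only the standard library: it connects, waits up to a timeout for the server's greeting, and reports how long that took. The function name and the 10-second default are arbitrary choices for illustration.

```python
import socket
import time


def probe_smtp_banner(host="127.0.0.1", port=25, timeout=10.0):
    """Return (banner, seconds). banner is None if no greeting arrived in time."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = sock.recv(512)  # the greeting is a single short line
    except OSError:  # refused, unreachable, or timed out
        return None, time.monotonic() - start
    banner = data.decode("ascii", errors="replace").strip()
    return (banner or None), time.monotonic() - start
```

Calling probe_smtp_banner() from a loop and logging the result would give a timeline of when the greeting stops arriving, which can then be lined up against the mail log. A healthy server answers with a line starting with 220.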

Looking forward to (ANY) suggestions.

:-)

bschultz
22nd November 2007, 07:21
I realize that this doesn't solve the problem...but it will buy you some time to track down what's causing the problem.

You could set up a cron job to restart Postfix every five minutes (or whatever time frame you want...I use 30 minutes). It only takes a few seconds to restart Postfix, and it should keep mail flowing until you can track down the cause.


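#!/bin/sh
# Stop at the first failing command, so amavis is left alone if the Postfix restart fails.
set -e

# Restart Postfix first, then amavis.
/etc/init.d/postfix restart
/etc/init.d/amavis restart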


I'm doing the same thing trying to figure out why I'm having problems with Amavis.

Brian

chimaster
22nd November 2007, 08:28
That was the short-term solution, although now it's all gone pear-shaped.

At this stage I'd need to restart postfix every 2 minutes, and mail is now not going out at all due to timeouts. I'm still pretty stuck on what it is; I'm trying to capture everything going to port 25 internally to see what is going on!

Arrgghhhhh. :eek:

bschultz
20th December 2007, 06:21
Now this is happening to one of my new servers...any new ideas to fix it...anyone?

Thanks!

falko
21st December 2007, 13:29
Do you use SUSE on them as well?

VirtualAni
17th July 2008, 17:28
The same thing is happening on a Debian Etch machine I am working on with a friend. We followed the instructions in this how-to: "Virtual Users And Domains With Postfix, Courier, MySQL And SquirrelMail (Ubuntu 8.04 LTS)"

This error disappears when I comment out the line

content_filter = amavis:[127.0.0.1]:10024

in the main.cf

If I un-comment it, the same error comes back, so I'm guessing I missed something...

falko
18th July 2008, 15:23
For Debian Etch you should use this one: http://www.howtoforge.com/virtual_users_and_domains_with_postfix_debian_etch

VirtualAni
18th July 2008, 15:28
Well, we have the server up and running. I'd say only a few component versions differ; other than that, there are basically not many differences.

The server is up and running perfectly well and can send and receive mail. It's just a simple blog for a friend. The mail works; only when Amavis is enabled in main.cf via content_filter does it produce this error. Other than that, if the line is commented out and Amavis disabled, everything works perfectly well.

falko
19th July 2008, 23:10
What's the output of netstat -tap with the content_filter enabled?
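
Along the same lines as the netstat question: with content_filter = amavis:[127.0.0.1]:10024 in main.cf, Postfix hands every message to whatever listens on 127.0.0.1 port 10024, so the first thing to confirm is that something is actually listening there. The Python sketch below only tests whether a TCP connection can be opened. Port 10025 is included on the assumption that this setup follows the common amavis arrangement where filtered mail is re-injected into Postfix on that port; the thread itself only mentions 10024. The function names are hypothetical.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def check_filter_ports(ports=(10024, 10025), host="127.0.0.1"):
    """Check each port and return a {port: reachable} mapping."""
    return {port: port_accepts_connections(host, port) for port in ports}
```

If check_filter_ports() reports False for 10024 while content_filter is enabled, amavis is not listening where Postfix expects it, and queued mail would be deferred until it is.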