We have 2 servers (web and mail) running ISP Config 3.0.3.
Everything ran fine until last Thursday at 6 PM, when Apache2 and MySQL crashed on our web server.
Since then, the server has been crashing at random intervals (anywhere from 15 minutes to 3-4 hours apart). We found this in kern.log:
Feb 25 21:09:16 ns3 kernel: [ 1037.526123] REISERFS error (device sda8): vs-2100 add_save_link: search_by_key ([-1 2362445 0x1001 DIRECT]) returned 1
Feb 25 21:09:16 ns3 kernel: [ 1037.526223] REISERFS (device sda8): Remounting filesystem read-only
Feb 25 21:09:16 ns3 kernel: [ 1037.526230] REISERFS warning (device sda8): clm-6006 reiserfs_dirty_inode: writing inode 2362445 on readonly FS
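To see how often these messages occur and on which device, something like the following shell sketch could be used. The function name reiserfs_summary is made up for illustration, and it only assumes the log line format shown above:

```shell
# Summarise REISERFS error/warning lines per device from a kernel log.
# reiserfs_summary is a hypothetical helper name, not part of any package.
# Usage: reiserfs_summary /var/log/kern.log
reiserfs_summary() {
    # Keep only lines tagged as REISERFS error or warning for some device,
    # reduce each line to "<device> <severity>", then count the duplicates.
    grep -E 'REISERFS (error|warning) \(device [^)]+\)' "$1" |
        sed -E 's/.*REISERFS (error|warning) \(device ([^)]+)\).*/\2 \1/' |
        sort | uniq -c | awk '{print $2, $3, $1}'
}
```

Each output line has the form "device severity count". Lines such as the "Remounting filesystem read-only" notice carry no error/warning tag and are not counted.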
Both servers have the same hardware, which is fairly new (bought in 2010 but in use only since 2011):
CPU : 1 x 2.26 GHz (Intel Xeon E5507)
RAM : 4GB
Data partition : /dev/sda8 /var 800GB used 20%
Log partition : /dev/sda9 /var/log 10GB used 75%
Swap : 2GB
OS : Debian Squeeze (apt upgraded)
Web server programs :
bind (primary DNS)
ISP Config as master (mail server uses ISP Config as slave)
The web server hosts approximately 200 domains and websites
Mail server programs :
bind (secondary DNS)
At first I thought it was a swapping problem (our provider seems to have set up too small a swap partition), so I tried to reduce RAM usage: I set the Apache MPM prefork MaxClients to 40 (instead of the default, which I think is 150) and the fcgid FcgidMaxProcesses to 40 (instead of the previous 100), removed some unneeded Apache modules, and disabled fail2ban and PostgreSQL, which we don't really use for now.
Unfortunately, this didn't solve anything: the server crashed again about an hour later.
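As a rough sanity check on limits like MaxClients, a back-of-the-envelope memory budget can be computed. This is only a sketch: the function name max_workers is hypothetical, and the reserved-memory and per-process figures in the example are placeholders, not measurements from this server.

```shell
# Estimate how many worker processes fit in RAM.
# max_workers is a hypothetical helper; all arguments are whole megabytes:
#   $1 = total RAM
#   $2 = RAM reserved for everything else (MySQL, bind, OS cache, ...)
#   $3 = assumed resident size of one worker process
max_workers() {
    total_mb=$1
    reserved_mb=$2
    per_proc_mb=$3
    # Shell arithmetic is integer-only, so the result is rounded down.
    echo $(( (total_mb - reserved_mb) / per_proc_mb ))
}

# Example with assumed numbers: 4096 MB total, 1536 MB reserved,
# 50 MB per worker.
# max_workers 4096 1536 50    # prints 51
```

The per-process figure would have to be measured on the real server (for example from the RSS column of ps) before the result means anything.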
I then tried to repair the filesystem with:
reiserfsck --fix-fixable /dev/sda8
and then rebooted, but that didn't solve it either.
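Since /dev/sda8 holds /var, it is worth confirming whether the device was still mounted when the repair ran, because filesystem repair tools generally expect the filesystem not to be mounted read-write. A minimal sketch of such a check follows; is_mounted is a hypothetical helper name, and the optional second argument exists only so the function can be tried against a sample file instead of /proc/mounts:

```shell
# Return success (0) if the given device appears as a mount source.
# is_mounted is a hypothetical helper, not a standard command.
# Usage: is_mounted /dev/sda8 [mounts-file]
# Caveat: a device mounted under another name (e.g. a /dev/disk/by-uuid
# path) will not be matched by this simple string comparison.
is_mounted() {
    dev="$1"
    mounts="${2:-/proc/mounts}"
    # The first field of each mounts line is the device; exit 0 on a match.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# Example:
# if is_mounted /dev/sda8; then
#     echo "/dev/sda8 is mounted; unmount it (e.g. from a rescue system) first"
# fi
```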
We also tried to repair the MySQL databases (as we first thought it was a MySQL issue):
mysqlcheck -A -r -p
That did repair many tables and made MySQL slightly faster, but of course it didn't fix the crashes.
Finally, we opened a ticket with our provider asking them to fix the partition (our provider is theoretically responsible for hardware issues).
On top of this, each time the web server (master) crashes, it writes hundreds of MB of binary data to /var/log/ispconfig/ispconfig.log, and on the mail server amavis also crashes (and messages get stuck in the Postfix queue).
We commented out the ISP Config server.sh cron task, which "fixed" the huge logging issue, but the amavis crash still occurs. I suppose it's related to the crash of the master MySQL database.
Has anyone run into this issue? Does anyone have an idea how we can fix it?
We would really appreciate some help, because this is really a critical issue for us. Thank you for your help.