Whole server went down

Discussion in 'General' started by wxman, Nov 17, 2009.

  1. wxman

    wxman New Member

    I only have a minute because I'm trying to get my server back up!

    It went down about an hour ago and the only clue I see is in the apache log:
    Code:
    DBI connect('database=dbispconfig;host=localhost:3306','ispconfig',...) failed: Too many connections at /usr/local/ispconfig/server/scripts/vlogger line 255
    DBI Error:  at /usr/local/ispconfig/server/scripts/vlogger line 255.
    piped log program ' /usr/local/ispconfig/server/scripts/vlogger -s access.log -t "%Y%m%d-access.log" -d "/etc/vlogger-dbi.conf" /var/log/ispconfig/httpd' failed unexpectedly
    
    This shows up over and over.
     
  2. till

    till Super Moderator Howtoforge Staff HowtoForge Supporter ISPConfig Developer

    Increase the max_connections and max_user_connections settings to e.g. 500 in your MySQL my.cnf and restart MySQL.
     
  3. wxman

    wxman New Member

    I gave it a try, but I can't tell if it helped. I've been stumped all day today. The server keeps locking up after being up for an hour or so. I can't log in even to get to the logs, and then all I can do is reboot. Stopping apache and mysql does nothing. While it was in that state I was able to run top at a command line, and it showed the load averages all over 100. They usually hang around 1.
    I get the feeling it has something to do with either apache or mysql, but I can't find anything clear in the logs.
     
  4. till

    till Super Moderator Howtoforge Staff HowtoForge Supporter ISPConfig Developer

    Which processes cause the load?
     
  5. wxman

    wxman New Member

    That's half my problem. I'm obviously not as good at running a server as I hoped I was. I don't know where to find that out.

    I've got top open right now, and the load is back to around 1. I do know when it last went crazy.
     
    Last edited: Nov 17, 2009
  6. wxman

    wxman New Member

    It just went down again. While I couldn't get to anything top looked like this:
    Code:
    top - 18:49:56 up  3:15,  2 users,  load average: 69.95, 71.26, 48.81
    Tasks: 260 total,   1 running, 259 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.8%us,  1.7%sy,  0.0%ni, 38.0%id, 59.1%wa,  0.0%hi,  0.0%si,  0.3%st
    Mem:   1575132k total,  1562748k used,    12384k free,     8300k buffers
    Swap:  7815580k total,  4844568k used,  2971012k free,    57876k cached
    PID to renice:
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    20918 www-data  20   0  270m  43m 3456 S    0  2.9   0:00.66 apache2
    20824 www-data  20   0  270m  32m 3444 S    0  2.1   0:00.60 apache2
     6392 mysql     20   0  446m  30m 2924 S    0  2.0   1:17.98 mysqld
    20801 www-data  20   0  270m  30m 3424 S    0  2.0   0:00.52 apache2
    20749 www-data  20   0  270m  28m 3424 D    0  1.9   0:00.74 apache2
    20730 www-data  20   0  270m  26m 3444 S    0  1.7   0:00.76 apache2
    20675 www-data  20   0  269m  25m 3416 S    0  1.7   0:00.92 apache2
    20710 www-data  20   0  270m  25m 3536 D    0  1.7   0:00.80 apache2
    20707 www-data  20   0  270m  25m 3476 S    0  1.6   0:00.56 apache2
    20690 www-data  20   0  270m  24m 3428 S    0  1.6   0:00.84 apache2
    20684 www-data  20   0  270m  22m 3480 D    0  1.4   0:00.68 apache2
    20641 www-data  20   0  270m  20m 3476 S    0  1.4   0:00.82 apache2
    20653 www-data  20   0  270m  20m 3480 D    0  1.3   0:00.92 apache2
    20539 www-data  20   0  270m  19m 4084 D    0  1.3   0:01.30 apache2
    20622 www-data  20   0  269m  18m 3460 S    0  1.2   0:01.06 apache2
    20624 www-data  20   0  269m  18m 3464 S    0  1.2   0:00.98 apache2
    20553 www-data  20   0  279m  18m 3544 D    0  1.2   0:01.16 apache2
    20623 www-data  20   0  270m  16m 3424 S    0  1.1   0:00.94 apache2
    20676 www-data  20   0  269m  16m 3416 S    0  1.1   0:00.98 apache2
    20302 www-data  20   0  270m  16m 3512 D    0  1.1   0:01.30 apache2
    20556 www-data  20   0  269m  15m 3812 S    0  1.0   0:00.94 apache2
    19881 www-data  20   0  270m  15m 4292 D    0  1.0   0:03.30 apache2
    21225 www-data  20   0  237m  15m 3360 S    0  1.0   0:00.16 apache2
    20682 www-data  20   0  270m  14m 3428 S    0  1.0   0:00.92 apache2
    20651 www-data  20   0  270m  14m 3428 S    0  1.0   0:00.98 apache2
    20225 www-data  20   0  268m  14m 3536 D    1  0.9   0:01.52 apache2
    20552 www-data  20   0  274m  14m 3524 S    0  0.9   0:00.92 apache2
    20373 www-data  20   0  279m  14m 3500 S    0  0.9   0:01.32 apache2
    19903 www-data  20   0  272m  14m 3920 D    0  0.9   0:03.46 apache2
    
    It took less than 5 minutes to go from running normally to overload. The traffic in and out stayed the same until it overloaded. After it did, the traffic died completely. There were a lot more apache2 processes running, but I didn't include them here.
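
One rough way to approach the "which processes cause the load?" question from a saved snapshot is to tally process states per command. The Python sketch below does that for rows in the column layout shown above. The function name `summarize_top` and the shortened sample are illustrative assumptions. Processes in state D (uninterruptible sleep, usually waiting on I/O) count toward the Linux load average even when they use almost no CPU.

```python
from collections import Counter

def summarize_top(lines):
    """Count (command, state) pairs from top process rows.

    Assumes the 12-column layout shown above:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip headers and anything that is not a process row.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        state, command = fields[7], fields[11]
        counts[(command, state)] += 1
    return counts

# A few rows copied from the snapshot above, for illustration.
sample = """\
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20918 www-data  20   0  270m  43m 3456 S    0  2.9   0:00.66 apache2
 6392 mysql     20   0  446m  30m 2924 S    0  2.0   1:17.98 mysqld
20749 www-data  20   0  270m  28m 3424 D    0  1.9   0:00.74 apache2
20710 www-data  20   0  270m  25m 3536 D    0  1.7   0:00.80 apache2
""".splitlines()

summary = summarize_top(sample)
for (command, state), n in sorted(summary.items()):
    print(f"{command:10s} state={state} count={n}")
```

A large number of D-state rows together with a high %wa figure in the Cpu(s) line would point toward I/O wait (for example swapping) rather than CPU-bound work, though a single snapshot is not conclusive.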
     
    Last edited: Nov 18, 2009
  7. edge

    edge Active Member Moderator HowtoForge Supporter

    Wow..... load average: 69.95

    You could install munin (a howto is in the Howtos section).
    This might give you an indication on what is causing the high load.

    Too little physical memory and a lot of traffic could also cause high load.
     
  8. wxman

    wxman New Member

    At its worst the average was around 170!
    I've looked into munin, and I'll look again. I've got 2GB of RAM which I thought should be plenty. I forgot to say that I'm already running Ganglia, which looks like it does as much as Munin.
     
    Last edited: Nov 18, 2009
  9. till

    till Super Moderator Howtoforge Staff HowtoForge Supporter ISPConfig Developer

    Please check the apache access log. Do you see a specific URL that causes this high number of simultaneous connections to your server?
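
A minimal Python sketch of that kind of check: count requests per client IP and per URL and look at the top entries. It assumes the log lines are in the standard Common/Combined Log Format (client IP first); a log that prefixes each line with the virtual host name would need the pattern adjusted. The function name `top_talkers` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the start of a Common/Combined Log Format line:
# client IP, identity, user, [timestamp], "METHOD path protocol"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*"')

def top_talkers(lines, n=5):
    """Return the n most frequent client IPs and request paths."""
    ips, urls = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        ip, _method, path = match.groups()
        ips[ip] += 1
        urls[path] += 1
    return ips.most_common(n), urls.most_common(n)

# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [17/Nov/2009:18:40:01 +0000] "GET /contest HTTP/1.1" 200 512',
    '192.0.2.2 - - [17/Nov/2009:18:40:02 +0000] "GET /contest HTTP/1.1" 200 512',
    '192.0.2.1 - - [17/Nov/2009:18:40:03 +0000] "GET /index.php HTTP/1.1" 200 1024',
]

top_ips, top_urls = top_talkers(sample)
print(top_ips)
print(top_urls)
```

To run it against a real log, pass an open file object instead of the sample list, for example `top_talkers(open(path))` with `path` pointing at the access log in question.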
     
  10. wxman

    wxman New Member

    No, there isn't. My first thought was a DOS attack, so I checked the log. I don't see any one IP or URL that jumps out. Also the traffic seems steady right up to when it locks up. You can see it on ganglia. The graph goes from the bottom to over the top in 5 minutes. The only thing I saw that shocked me was watching top. You can see the number of apache processes almost double in an instant.
    There was an increase in the total traffic for the day. On the 16th the total hits were 70890 and on the 17th they went to 113208.
     
    Last edited: Nov 18, 2009
  11. till

    till Super Moderator Howtoforge Staff HowtoForge Supporter ISPConfig Developer

    To me it looks like an attack on your server.

    Do you have any errors in the global apache error log or in the individual apache error logs that might be related to this?

    You should consider setting the max number of apache instances to a lower value (see the apache2.conf file, httpd.conf, or one of the includes).
     
  12. wxman

    wxman New Member

    Hi Till

    You might not have seen what I added to the last post.
    There was an increase in the total traffic for the day. On the 16th the total hits were 70890 and on the 17th they went to 113208.
    This is in apache2.conf:
    Code:
    <IfModule mpm_prefork_module>
        StartServers          5
        MinSpareServers       5
        MaxSpareServers      10
        MaxClients          200
        MaxRequestsPerChild   0
    </IfModule>
    
    I'm also using ISPConfig 3.
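
A back-of-the-envelope check on the MaxClients value above, written as a Python sketch. With the prefork MPM each client is served by its own apache2 process, so worst-case memory grows roughly linearly with MaxClients. The 1575132k total is taken from the top header earlier in the thread; the 20 MB per process (loosely read off the RES column, which also includes shared pages and so overestimates) and the 512 MB reserved for MySQL and the OS are assumptions, not measurements. The function names are illustrative.

```python
def worst_case_apache_mb(max_clients, per_process_mb):
    """Upper-bound estimate of RAM used if every prefork child is busy."""
    return max_clients * per_process_mb

def max_clients_that_fit(total_ram_mb, reserved_mb, per_process_mb):
    """Largest MaxClients whose worst case still fits in physical RAM."""
    return int((total_ram_mb - reserved_mb) // per_process_mb)

TOTAL_RAM_MB = 1575132 / 1024   # "Mem: 1575132k total" from top
PER_PROCESS_MB = 20             # assumption: typical RES of an apache2 child
RESERVED_MB = 512               # assumption: headroom for MySQL and the OS

worst_case = worst_case_apache_mb(200, PER_PROCESS_MB)   # MaxClients 200
fits = max_clients_that_fit(TOTAL_RAM_MB, RESERVED_MB, PER_PROCESS_MB)
print(worst_case)
print(fits)
```

Under these assumptions, 200 busy children would want several times the physical memory of the machine, so a burst of traffic could push it into swap. The real per-process figure should be measured on the server before picking a value.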
     
  13. wxman

    wxman New Member

    Till or anyone

    I'm in a real bind now. One of my web sites, the same one I was talking about here, is running a contest. She gets an increase in traffic, but nothing beyond what I would expect. Then, as the traffic increases, the server goes crazy. The CPU and memory both go off the scale, and top even shows load averages into the 100s!
    I have tried to fine-tune my.cnf and apache2.conf, but it isn't helping. I have 1.5GB of RAM on this machine running Ubuntu with the server on a xen virtual machine. The processor is a core duo Pentium and the operating system is installed as 64-bit. Any help would be appreciated.
     
    Last edited: Nov 20, 2009
  14. damir

    damir New Member

    What is the amount of traffic we are talking about here?

    Have you started to monitor the system to see where the bottleneck is?

    How many sites do you host, and is it all on one server or a multiserver install?
     
  15. till

    till Super Moderator Howtoforge Staff HowtoForge Supporter ISPConfig Developer

    In addition to damir's questions: which PHP type did you enable for this website, and do you have a PHP binary cache like eaccelerator or xcache installed to speed up PHP?
     
  16. wxman

    wxman New Member

    I thought I was running Zend Optimizer as an accelerator, but it's not listed when I checked phpinfo. I'm using mod-php as the PHP type.

    I haven't got the traffic stats for the moment of the crash. It happened just after midnight, and that's when I compile the webalizer stats.

    I'm currently on a single server, with only three active sites. I'm using Ubuntu, and xen, with 1.5GB RAM. I'm going to double the RAM this week in hopes that might help. We have quite a few more sites to add.

    As for bottlenecks, I watched top as it locked up (see page one of this thread). All I could see is a lot of apache threads running at the same time. The site that caused this is using Drupal 6 too.
     
  17. till

    till Super Moderator Howtoforge Staff HowtoForge Supporter ISPConfig Developer

    Zend optimizer is not a real accelerator, it just optimizes a bit. I recommend that you install either eaccelerator or xcache alongside zend optimizer.
     
  18. wxman

    wxman New Member

    Hi Till

    Have you heard if there is one that works better than the other with ISPConfig and/or Drupal?
    This is such a pain not being able to see in any log where this might be starting. I'm also going to check around the xen world to see if maybe it's something there I need to tweak.
     
  19. sajo

    sajo New Member

    Did you find a solution? I have had the same problem for two weeks now and don't know what to do. I already added RAM, so I now have 3GB. The only difference is that I am running ISPConfig 2.
    But the top output is exactly the same.
     
  20. till

    till Super Moderator Howtoforge Staff HowtoForge Supporter ISPConfig Developer

    ISPConfig 3 uses a completely different setup and logging process, so your problem cannot be related to this, as the software that caused it is not even installed on an ISPConfig 2 setup. Please make a new thread in the ISPConfig 2 forum and post an exact description of the problem incl. log and ps output.
     
