apache suddenly jumps to maxclients and stays there
Good morning, I've been having some trouble with a webserver for a while. It's Ubuntu 10.04 with apache2, php5.3 and mysql4.
I've been hammering away at it and got it mostly fine but every so often, it'll suddenly jump up to the maxclients available for apache and stay there until I restart apache.
Usually it sits at about 100 clients, but whatever maxclients is (be it 300, 600 or 1000), occasionally it'll hit that and stay there, giving lots of this in the error logs:
[warn] child process 1005 still did not exit, sending a SIGTERM
This is a vm, sitting all on its own on a powerful host so I've given it stupid amounts of ram while we get to the bottom of this, that's why it's mostly fine.
I'm pretty sure something in the code is causing this, but it's a massively complicated thing written by an outside source and not one of our guys, so poking about in it isn't straightforward.
Does anyone have any advice on how I might be able to narrow down the source of this issue or any changes I could make to apache to mitigate the effects?
Is it possible to see what an apache process is actually doing?
/edit: these issues are usually accompanied by a spike in mysql connections. The processlist shows lots of local connections with the command "Sleep". I've set wait_timeout to 60 seconds, which seems to help a bit.
Also, graphs show spikes in apache connections with status "sending reply", but can I find out what they are replying to?
Do your access logs in the time window when this happens indicate anything?
Maybe you might also take a look at the server-status page of apache when this happens.
What LogLevel do you have set? Switching it to debug for a short period of time might reveal more, in case it isn't set to that already.
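If you want to sample the status page from a script rather than a browser, mod_status also serves a machine-readable view when you append ?auto to the URL. Below is a minimal Python sketch, not a finished tool. It assumes mod_status is enabled and reachable at http://localhost/server-status (that URL and the function names are my assumptions, adjust them to your setup). It only summarises the scoreboard, so it tells you how many workers are in "sending reply" but not which request each one is handling. For that you need the normal HTML page with ExtendedStatus On.

```python
from collections import Counter
from urllib.request import urlopen

# Scoreboard characters as documented for mod_status.
SCOREBOARD_KEY = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive",
    "D": "dns lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup",
    ".": "open slot",
}


def fetch_status(url="http://localhost/server-status?auto"):
    """Fetch the plain-text status page. The URL is an assumption."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("ascii", "replace")


def parse_status(text):
    """Turn 'Key: value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def scoreboard_summary(fields):
    """Count workers per state using the Scoreboard line."""
    counts = Counter(fields.get("Scoreboard", ""))
    return {SCOREBOARD_KEY.get(ch, ch): n for ch, n in counts.items()}
```

Calling scoreboard_summary(parse_status(fetch_status())) every few seconds during a spike should show whether the workers are piling up in "sending reply" or somewhere else.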
the trouble with the access logs is that there's so much guff spread over a whole bunch of identical sites, it's difficult to read anything useful.
But the server status page looks like it might bring up some useful info. I never even knew such a thing existed, how did I ever get this job in the first place?
I don't fancy putting additional strain on the server, so I'll leave debug logging off until I see if I can get any useful info from the server status
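In case it helps anyone else drowning in access logs: tallying requests per URL for the time window makes the noise a lot easier to read. This is a rough Python sketch that assumes the logs are in the usual common/combined format. The function name top_paths and the regex are my own, nothing standard.

```python
import re
from collections import Counter

# Matches the quoted request in a common/combined log line,
# e.g. "GET /some/page HTTP/1.1". Other log formats will need a different regex.
REQUEST_RE = re.compile(
    r'"(?:GET|POST|HEAD|PUT|DELETE|OPTIONS) (\S+) HTTP/[\d.]+"'
)


def top_paths(lines, n=10, strip_query=False):
    """Return the n most requested paths as (path, hits) pairs."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines that are not normal requests
        path = match.group(1)
        if strip_query:
            # Group /page?id=1 and /page?id=2 under /page
            path = path.split("?", 1)[0]
        counts[path] += 1
    return counts.most_common(n)
```

Feed it the lines from around the time of a spike, for example with open(logfile) as fh: print(top_paths(fh)), where logfile is wherever your vhosts write their access log.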
The question is what happens if the thing you assume will be helpful turns out not to be. Then there are no logs to analyze after the fact ;)
Just make sure that access to the server-status / server-info pages is restricted appropriately.
You could also check any of the munin apache plugins in case you use munin (or similar monitoring stuff).
well, well, well. The server has spiked up twice. Each time, checking server-status has shown hundreds of links to the same two pages.
Once I got the server calmed down, I tried opening up one of those pages and within a minute the server was at the maximum connections and alerting its little heart out.
Bad news is, this database is a monster so I can't figure out what's actually in those pages. I'll have to see if I can wrestle it from the developer.
In the meantime I suppose I could write a script to count the number of apache processes and then restart when it reaches the max, but that's a nasty way to go about it.
Any better suggestions? If not, thanks for your help anyway, I wouldn't have got this far without it.
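For the record, if I do end up with the nasty restart script, the decision logic could look roughly like this. It's a Python sketch only: it assumes the text comes from the server-status ?auto page (which has a BusyWorkers line), and the 95% ratio and three-samples-in-a-row rule are numbers I made up to avoid restarting on a brief spike. The actual restart command is left out on purpose.

```python
def busy_workers(status_text):
    """Pull the BusyWorkers count out of server-status?auto output."""
    for line in status_text.splitlines():
        if line.startswith("BusyWorkers:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no BusyWorkers line found")


def should_restart(samples, max_clients, ratio=0.95, consecutive=3):
    """True only if the last `consecutive` samples are all near max_clients.

    samples is a list of BusyWorkers readings, oldest first.
    ratio and consecutive are arbitrary assumptions, tune them.
    """
    if len(samples) < consecutive:
        return False
    limit = max_clients * ratio
    return all(s >= limit for s in samples[-consecutive:])
```

Run from cron or a loop, it would collect one reading per interval and only restart apache once should_restart returns True, which at least avoids bouncing the server on a single busy moment.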
But how does the db contribute to the rise in clients? Nevertheless, you might want to log the DB queries during this time if you assume the application's backend DB causes this error.
How about blocking this specific request?
well, it's more like
that id number relates to an entry in the db and the page is generated from that
at least, that's what I think is happening. I haven't been able to figure out exactly what that id number relates to.
I'd love to block it, but I'm not sure how
/edit: tried various rewrite rules in .htaccess but nothing worked. I figured out how to modify the PHP to redirect these IDs though. That should hold it until the dev comes back.