#1 Torsson, 18th April 2010, 21:22
Problems accessing servers

Hi, I have two identical servers, both running Ubuntu with Xen to host virtual servers. On both dom0s I run DRBD, NFS and heartbeat to replicate files between them.

Now to the problem:
Both servers sometimes stop working and refuse to do anything. I can still ping them, but I can't access any port such as HTTP, SSH or FTP. I have checked the syslog and the kernel log but cannot see anything that tells me something went wrong.

Both servers now have the same problem, and I can't get anything useful out of the logs.

Next weekend I'm going to the place where they are hosted (450 km from here), and it would be helpful if I could get any hints about what the problem could be.

I have a gut feeling that this has something to do with DRBD/NFS, because one time when I was moving 30-40 GB to it, both died in the same way.

Any tips are welcome.

#2 falko, 19th April 2010, 11:10

I'd install munin on both servers ( http://www.howtoforge.com/server-mon...n-debian-lenny ). That should make it easier to track down where the problem comes from (e.g. full hard drive, not enough memory or swap, etc.).
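
Until munin is in place, a small script run from cron can record the same kind of numbers (free memory, free swap, free disk space) so there is something to look at after the next hang. This is only a minimal sketch, not a munin replacement; the function names `parse_meminfo` and `snapshot` are made up for illustration, and it assumes a Linux host with `/proc/meminfo`.

```python
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of numeric values (most are in kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def snapshot(meminfo_text, path="/"):
    """Return free memory, free swap and the free disk percentage for `path`."""
    mem = parse_meminfo(meminfo_text)
    disk = shutil.disk_usage(path)
    return {
        "mem_free_kb": mem.get("MemFree"),
        "swap_free_kb": mem.get("SwapFree"),
        "disk_free_pct": round(100 * disk.free / disk.total, 1),
    }
```

To use it, read `/proc/meminfo`, pass the text to `snapshot()`, and append the result with a timestamp to a file every minute or so. The last entries before a hang then show whether memory, swap or disk was running out.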
__________________
Falko

#3 Mosquito, 19th April 2010, 15:00

Are you using a bridge with static IPs to the Virtual hosts or a bridge with DHCP assigned IPs?

I had this problem on a CentOS 5.4 server with Xen when I was using DHCP assigned IPs. Switched to static (outside of my DHCP range, but still within the subnet) and everything started working fine.
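
The "static, outside the DHCP range, but still within the subnet" rule can be checked with a few lines of Python. This is a rough sketch; the function name and all addresses in the usage note are hypothetical examples, not values from my setup.

```python
import ipaddress


def static_ip_ok(ip, subnet, dhcp_start, dhcp_end):
    """True if `ip` lies inside `subnet` but outside the DHCP pool."""
    addr = ipaddress.ip_address(ip)
    net = ipaddress.ip_network(subnet)
    pool_low = ipaddress.ip_address(dhcp_start)
    pool_high = ipaddress.ip_address(dhcp_end)
    in_subnet = addr in net
    in_pool = pool_low <= addr <= pool_high
    return in_subnet and not in_pool
```

For example, with a subnet of 192.168.1.0/24 and a DHCP pool of 192.168.1.100 to 192.168.1.200, `static_ip_ok("192.168.1.10", "192.168.1.0/24", "192.168.1.100", "192.168.1.200")` returns `True`, while 192.168.1.150 would be rejected because it sits inside the pool.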

#4 Torsson, 19th April 2010, 16:31

Falko, I'll try that.

Mosquito, all the virtual servers and Xen hosts have static IP addresses.

Saw something strange today on the servers: I tried to access SSH but it doesn't answer anything, not even "No route to host" or "Refused". So the machine still seems to be alive.
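
To tell these cases apart from a remote machine, a small probe can classify what happens on a TCP port. This is a sketch under assumptions: the function name `probe` and its return strings are made up here, and the "connected, no banner" case relies on the service (such as SSH) normally sending a greeting first.

```python
import errno
import socket


def probe(host, port, timeout=5.0, expect_banner=False):
    """Classify a TCP connection attempt to host:port.

    Returns "open", "connected, no banner", "closed by peer",
    "refused", "unreachable", "timeout" or "error: ...".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        if not expect_banner:
            return "open"
        try:
            # Services like SSH send a greeting right after the handshake.
            data = sock.recv(64)
        except socket.timeout:
            return "connected, no banner"
        return "open" if data else "closed by peer"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"
        return "error: %s" % exc
    finally:
        sock.close()
```

A result of "connected, no banner" on port 22 would suggest that the kernel still completes the TCP handshake while the daemon itself is stuck, for example waiting on blocked I/O. "timeout" would point more towards packets being dropped before they reach the service.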

Last edited by Torsson; 19th April 2010 at 16:40.

#5 Torsson, 25th April 2010, 20:59

This is the only strange thing I can find in the server log. Could this be the reason the server keeps dying?

Quote:
Mar 17 20:05:14 wendecoserver1 kernel: [ 702.561454] lockd: cannot monitor web2.local
Mar 17 20:05:45 wendecoserver1 kernel: [ 733.737945] lockd: cannot monitor web2.local
Mar 17 20:47:03 wendecoserver1 kernel: [ 3208.065621] lockd: cannot monitor web1.local
Mar 17 20:47:09 wendecoserver1 kernel: [ 3213.902570] lockd: cannot monitor web1.local
Mar 17 20:47:11 wendecoserver1 kernel: [ 3216.011962] lockd: cannot monitor web1.local
Mar 17 20:47:14 wendecoserver1 kernel: [ 3219.030206] lockd: cannot monitor web1.local
Mar 17 20:47:39 wendecoserver1 kernel: [ 3244.468641] lockd: cannot monitor web1.local
Mar 17 20:47:50 wendecoserver1 kernel: [ 3255.201669] lockd: cannot monitor web2.local
Mar 17 20:47:50 wendecoserver1 kernel: [ 3255.201934] lockd: cannot monitor web2.local
Mar 17 20:47:50 wendecoserver1 kernel: [ 3255.316007] lockd: cannot monitor web2.local
Mar 17 20:48:19 wendecoserver1 kernel: [ 3284.415093] lockd: cannot monitor web2.local
Mar 17 20:51:00 wendecoserver1 kernel: [ 3444.730822] lockd: cannot monitor web1.local
Mar 17 20:51:11 wendecoserver1 kernel: [ 3455.753788] lockd: cannot monitor web1.local
Mar 17 20:51:29 wendecoserver1 kernel: [ 3473.305364] lockd: cannot monitor web1.local
Mar 17 20:54:52 wendecoserver1 kernel: [ 3676.525687] lockd: cannot monitor web2.local
Mar 17 20:54:52 wendecoserver1 kernel: [ 3676.525858] lockd: cannot monitor web2.local
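
To see how often each client shows up in messages like the ones quoted above, the log can be summarised with a short script. This is a sketch only; the function name is made up, and it assumes the message format stays exactly as quoted. As far as I know, "lockd: cannot monitor" usually means lockd could not register the client with the NFS status monitor (rpc.statd), so it is worth checking that statd is running on both sides, but that is an assumption to verify rather than a diagnosis.

```python
import re
from collections import Counter

# Matches the kernel message quoted above and captures the client host name.
LOCKD_RE = re.compile(r"lockd: cannot monitor (\S+)")


def count_lockd_failures(lines):
    """Count 'lockd: cannot monitor <host>' messages per host."""
    counts = Counter()
    for line in lines:
        match = LOCKD_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of the kernel log (for example from an open file object) gives a per-host count that can be compared against the times the servers hung.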

#6 Torsson, 6th May 2010, 21:12

I think I have found the problem.

I installed the same version of Xen (3.3) on Ubuntu 9.10, the same as on the servers, and I get exactly the same problem as on the servers.

The problem is that the kernel just goes to a black screen, either while it is booting or after it has been online for a few hours. I can ping the server but nothing else.

I'm using kernel: linux-image-2.6.24-16-xen_2.6.24-16.30zng1_i386

So I'm going to try installing Xen 4.0.0 with a newer kernel and see if that works.

#7 edge, 6th May 2010, 23:19

When they are down, are you trying to access the server(s) by domain or by IP?
If by domain, it could be a DNS issue.
__________________
Never execute code written on a Friday or a Monday.

#8 Rapid2214, 3rd June 2010, 10:02

Also do they go down at the same time?

#9 Torsson, 11th June 2010, 07:28

Last night I got the same problem again. Both died at almost the same time, between 6 and 7 am. The second server died first, and everything was pointed to the first server.

Yes, I try to SSH directly to the IP and not to the DNS name.

#10 falko, 12th June 2010, 16:17

Maybe this is caused by a cron job? Maybe by one of the scripts in /etc/cron.daily/?
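
To check this, it helps to know when the cron.daily scripts actually run. The sketch below pulls the schedule of `run-parts` entries out of crontab text. The function name is made up for illustration, and it assumes the usual Debian/Ubuntu layout of `/etc/crontab`. If anacron is installed, the daily jobs may be started by anacron instead, so treat the result as a hint only.

```python
def run_parts_schedule(crontab_text):
    """Return {directory: (minute, hour)} for run-parts entries in crontab text."""
    schedule = {}
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # A system crontab entry has 5 time fields, a user and a command.
        if len(fields) < 7 or "run-parts" not in fields:
            continue
        # Take the last argument that looks like an /etc/cron.* directory.
        dirs = [f.rstrip(")") for f in fields if f.startswith("/etc/cron.")]
        if dirs:
            schedule[dirs[-1]] = (fields[0], fields[1])
    return schedule
```

Run it on the contents of `/etc/crontab` and compare the hour for `/etc/cron.daily` with the time the servers died (between 6 and 7 am). If they match, disable the scripts in `/etc/cron.daily/` one at a time to find the one responsible.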
All times are GMT +2.