Monitor panel in multi server setup


Torch_za
21st September 2010, 18:46
It's surely an oversight, but in a multi server configuration the monitor function shows servers as fully functional (green) when they are not even switched on! This is really unacceptable, is it not?

e100
25th September 2010, 20:49
Someone correct me if I am wrong....

I believe that each individual server updates its own status when the cron runs.
What is displayed is whatever the last status update happened to be.

I think the panel should look and see when it last received status info and if it is outdated it could display "unknown status".
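
A minimal sketch of that idea, not ISPConfig code. The function name, the "unknown" label and the 12-minute threshold are assumptions made for illustration (the threshold presumes a cron run roughly every 5 minutes), and both timestamps are assumed to come from the same clock.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: about two missed 5-minute cron runs plus some slack.
STALE_AFTER = timedelta(minutes=12)

def display_state(reported_state, last_update, now, stale_after=STALE_AFTER):
    """Return the state the panel should show for one server.

    reported_state: last state the server wrote, e.g. "ok" or "warning".
    last_update:    when that state was recorded, or None if never.
    now:            current time on the machine rendering the panel.
    """
    if last_update is None or now - last_update > stale_after:
        # Nothing recent has arrived, so the stored state cannot be trusted.
        return "unknown"
    return reported_state

now = datetime(2010, 9, 25, 20, 49)
fresh = display_state("ok", now - timedelta(minutes=4), now)
stale = display_state("ok", now - timedelta(hours=2), now)
print(fresh, stale)  # ok unknown
```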

Torch_za
25th September 2010, 21:24
And how is that useful? If a server dies because it loses power / explodes / bursts into flames / stops functioning altogether, it's not an issue because the screen reflects the last status... and will do so for eternity. How is that of any use to the clients who are paying for services on that server, or to the administrator of said server? Currently, if one of the services stops functioning the server bar is red, if there are warnings it's orange... all very useful, but if the server stops completely it stays green? Hmm... then again, having the said clients who are paying for services screaming at one on the telephone and cancelling contracts might be a perfectly acceptable notification... :D

vogelor
27th September 2010, 20:13
Hi, I am Oliver, the "core-programmer" of the monitoring module.

It is very, very hard to detect a "down" server, because how would you find it?
Using a timestamp is not possible, because the time may differ between the several servers...

So til and I decided to create an external monitoring solution.

I can't tell you how soon this solution will arrive, but I think it is coming.
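
As a rough illustration of what checking a server from the outside could look like (this is not the planned ISPConfig solution, whose design is not described in this thread), the sketch below tries to open a TCP connection and treats a failure as "down". The function name `is_reachable` and the choice of a plain TCP probe are assumptions; the demo uses a throwaway local listener rather than a real server.

```python
import socket

def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

# Demo against a temporary listener on the local machine.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

up = is_reachable("127.0.0.1", port)    # listener is open
listener.close()
down = is_reachable("127.0.0.1", port)  # nothing listens on that port now
print(up, down)
```

Because the probe runs on the monitoring machine, it does not rely on the clock of the server being checked.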

Torch_za
8th October 2010, 00:52
Hi,

I confess I'm no hot shot programmer, but I think this problem is kinda solved with release 3.0.3, without having to cross interrogate servers and process complicated routines... if the slave is offline, the ISPConfig release number does not display in the control panel.
The display for a server THAT HAS BEEN SWITCHED OFF shows:

Server: ***.*********.***

State: ok (0 unknown, 0 info, 0 warning, 0 critical, 0 error)
More information...

If it is operational you get:

Server: ***.*********.***
ISPConfig 3.0.3
State: ok (0 unknown, 0 info, 0 warning, 0 critical, 0 error)
More information...

All servers are green and indicate no problems... how hard can it be to show red if there is no ISPConfig release number? :D
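
A minimal sketch of the rule being suggested here, not the panel's real code. The function name `panel_color`, the dictionary of counts and the mapping of "critical"/"error" to red are assumptions for illustration; only the "red when a service fails, orange on warnings, green otherwise" scheme comes from this thread.

```python
def panel_color(version, counts):
    """Pick a bar color from the displayed release number and problem counts.

    version: release string such as "3.0.3", or None/"" when the panel
             shows no ISPConfig release line for the server.
    counts:  dict of problem counts, e.g. {"warning": 0, "critical": 0, "error": 0}.
    """
    if not version:
        # Suggested rule: no release number means the server is treated as down.
        return "red"
    if counts.get("critical", 0) or counts.get("error", 0):
        return "red"
    if counts.get("warning", 0):
        return "orange"
    return "green"

no_problems = {"unknown": 0, "info": 0, "warning": 0, "critical": 0, "error": 0}
print(panel_color(None, no_problems))     # red
print(panel_color("3.0.3", no_problems))  # green
```

Under this rule, any server that never reports a release number at all would also be shown as red.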

vogelor
8th October 2010, 08:50
No, sorry! This will not work.
The only thing you would do is mark all servers red that are not up to date (meaning ISPConfig 3.0.2 or earlier).

Let me try to explain in more detail:

1:00 pm
The monitor has NEVER run before and the database contains no data for server x.

1:05 pm
This is the FIRST run of the monitor and AFTER the run, the database contains the monitoring data of server x (state at 1:05 pm).

1:10 pm
This is the second run of the monitor. BEFORE the run, the database contains the data of server x (state at 1:05 pm). AFTER the run, the database contains the data of server x (state at 1:10 pm).

Now the server crashes!

1:15 pm
The monitor does not run. The database is not changed and still contains the data of server x (state at 1:10 pm).

.....

3:00 pm
The server is still down, no monitor is running and no database update is made. The database still contains the data of server x (state at 1:10 pm).


As you can see, THIS is the problem. The old data remains in the database (because server x cannot delete it, because server x is down).


Do you understand the problem now?

Once again, this is not really a bug; it is a problem with the architecture, which we cannot change.
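
The timeline above can be replayed as a small simulation. `StatusStore` is a hypothetical stand-in for the database, not ISPConfig's actual schema; it only models the behaviour described in this post, where each server writes its own row and the panel reads back whatever was written last.

```python
from datetime import datetime, timedelta

class StatusStore:
    """Stand-in for the database: at most one status row per server."""

    def __init__(self):
        self.rows = {}

    def monitor_run(self, server, state, at):
        # Each server overwrites its own row when its monitor runs.
        self.rows[server] = {"state": state, "state_at": at}

    def panel_state(self, server):
        # The panel only reads the last row; it never contacts the server.
        row = self.rows.get(server)
        return row["state"] if row else None

store = StatusStore()
start = datetime(2010, 10, 8, 13, 0)

before_first_run = store.panel_state("server x")                    # 1:00 pm, no data
store.monitor_run("server x", "ok", start + timedelta(minutes=5))   # 1:05 pm
store.monitor_run("server x", "ok", start + timedelta(minutes=10))  # 1:10 pm

# The server crashes here, so no further monitor_run calls happen.

at_3pm = store.panel_state("server x")
print(before_first_run, at_3pm)  # None ok
```

At 3:00 pm the row still holds the state from 1:10 pm, which is why the bar stays green.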

Torch_za
8th October 2010, 10:00
Please accept my deepest apologies... I did confess from the outset that I was no programming genius. Ever since I stopped coding using Fortran back in the 90s it's been downhill.

Far be it from me, a humble and idiotic user, to question a core programmer, but I am now wondering if you could suggest how to fix my multi server setup. I think I might have somehow downloaded a broken copy of 3.0.3 (although I did check and it seems to be current) when I updated a 5 server multi server site from 3.0.2. Either that or the problem is a lot bigger.

You see, when you said that the behaviour I outlined previously was impossible, I tried it again. I pulled the plug on a running slave server (stupid, I know, but it's a development server) to see what would happen. I watched the master server's monitor panel carefully and, so help me, you are right. 5 minutes later the 'state indicators' are ok, all still showing no problem, but the server is definitely off: the power lead is hanging out the back. Now the problem appears to be that the ISPConfig 3.0.3 version flag insists on vanishing no matter what I do, and THIS seems to be the problem if, like you say, no update is performed.

Impossible, you say, but I'm not kidding. While you, the core programmers, insist that such behaviour is totally impossible, and not wishing to be taken in by an operational anomaly, I tried it again, and yet again with a different server, yet the results were the same. I'm sure that every time you did this test your results were different, yet my ISPConfig release 3.0.3 still insists on defying logic.

To think, had I not suggested a simple tweak, like setting a background to red if the ISPConfig release number was not present, I would never have discovered the problem at all. I am just so happy it was pointed out by the people who actually wrote the programme.

So I'm at a loss. How can I stop my system performing this impossibility? Do I reload all the servers from scratch or just the master server? Do I have to replace all the hardware? Is my system possibly possessed, and would a ritual exorcism be of use? Could it be that my setup has become self aware, is starting to recode itself and is possibly going to take over the world? I thought that storing copies of the Terminator movies on the hard disk might not be a smart move at the time, and now it looks like I might have given it ideas.