Munin Data gathering problem - Some nodes fail gathering data

Discussion in 'Server Operation' started by philzphil, Feb 5, 2014.

  1. philzphil

    Hi everybody,

    I've run into a strange problem with Munin for which I couldn't find a solution. Maybe somebody here will be able to help.

    I'm running the munin-master on Debian 7. Here's the config file for the master:

    dbdir	/var/lib/munin
    htmldir /var/www/
    logdir /var/log/munin
    rundir  /var/run/munin
    tmpldir	/etc/munin/templates
    staticdir /etc/munin/static
    includedir /etc/munin/munin-conf.d
        use_node_name yes
        use_node_name yes

    Here's the node's config (kvm-server):

    log_level 4
    log_file /var/log/munin/munin-node.log
    pid_file /var/run/munin/
    background 1
    setsid 1
    user root
    group root
    ignore_file [\#~]$
    ignore_file DEADJOE$
    ignore_file \.bak$
    ignore_file %$
    ignore_file \.dpkg-(tmp|new|old|dist)$
    ignore_file \.rpm(save|new)$
    ignore_file \.pod$
    host_name kvmserver.mydomain
    port 4949

    The problem is that I can't gather data from my KVM servers, which also run Debian 7 and show no iptables rules with iptables -L --verbose. The munin-master has no iptables rules set either.

    Other nodes in the network work: they produce Munin data, and the master gathers it without problems.

    The munin-master log shows the following error:

    2014/02/05 11:02:02 [FATAL] Socket read timed out to kvmserver.mydomain.  Terminating process. at /usr/share/perl5/Munin/Master/ line 254
    2014/02/05 11:02:02 [ERROR] Munin::Master::UpdateWorker<mydomain;kvmserver.mydomain> died with '[FATAL] Socket read timed out to kvmserver.mydomain.  Terminating process. at /usr/share/perl5/Munin/Master/ line 254

    No problems are mentioned in the logs on the node's side.
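
    When more than one node is affected, it can help to pull the failing host names out of the master log. A rough sketch in Python, assuming the log lines look like the [FATAL] entry quoted above; the names count_timeouts and TIMEOUT_RE are made up for illustration and are not part of Munin.

```python
import re
from collections import Counter

# Hypothetical pattern, modelled only on the [FATAL] line quoted above:
# "<date> <time> [FATAL] Socket read timed out to <host>.  Terminating ..."
TIMEOUT_RE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[FATAL\] Socket read timed out to (?P<host>\S+?)\.\s"
)


def count_timeouts(lines):
    """Count socket read timeouts per host in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[match.group("host")] += 1
    return counts
```

    The function takes any iterable of lines, so an open log file from the logdir configured above (/var/log/munin) can be passed in directly. The matching [ERROR] line is deliberately not counted, so each failed update is counted once.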

    If I test the munin-node from the master with telnet, the following problem occurs:
    telnet kvmserver.mydomain 4949
    Connected to kvmserver.mydomain.
    Escape character is '^]'.
    Then the connection freezes, and after a long timeout it is closed.
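
    The telnet check can also be scripted so that it gives up quickly instead of waiting for the long timeout. A minimal sketch in Python, assuming the node behaves like a stock munin-node and sends a greeting line (starting with "# munin node at") as soon as the connection is accepted; the function name probe_munin_node and the 5-second default timeout are made up for illustration.

```python
import socket


def probe_munin_node(host, port=4949, timeout=5.0):
    """Connect to a munin-node and return its greeting line.

    Returns the greeting text, or None if the connection is accepted but
    no complete line arrives within `timeout` seconds (the freeze seen
    with telnet). Connection errors such as "refused" or a DNS failure
    are not caught and propagate to the caller.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        buf = b""
        try:
            # Read until the first newline, which ends the greeting line.
            while not buf.endswith(b"\n"):
                chunk = sock.recv(1024)
                if not chunk:  # peer closed the connection early
                    break
                buf += chunk
        except socket.timeout:
            return None
        text = buf.decode("ascii", errors="replace").strip()
        return text or None
```

    Run from the master, probe_munin_node("kvmserver.mydomain") would be expected to return None for a node that freezes as described, and the greeting line for a node that answers normally.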

    I've tried disabling all the plugins except the uptime one, which works fine when I try it with
    munin-run uptime
    uptime.value 8.04
    on the node. But as soon as I try to gather the data from the master, it fails.

    Since I have the same version of munin-node running on other Debian 7 hosts and it's working like a charm, I've run out of things to check.

    The Munin version installed on all the boxes is 2.0.6-4+deb7u2, for both the master and the nodes.

    If you know what the problem could be, please help :eek:

