[solved] multiserver replication datalog_status_u_server job queue problem

Discussion in 'Installation/Configuration' started by linus, Mar 13, 2017.

  1. linus

    linus Member

    Hi all!

    I have followed the debian 8 multi server setup and all worked until a recent migration from the second server. (We saved many hours using the official migration script for moving ispconfig 2 and one other ispconfig 3). When updating the server configuration through the Ispconfig 3 gui for the server2 only the job queue increases and stays that way, from both server1 gui and serv2 gui. DNS:es and web settings seem to be working normally.
    ispconfig.jpg
    In the server2 /usr/local/ispconfig/server/lib/config.inc.php I have:
    //** Database
    $conf['db_type'] = 'mysql';
    $conf['db_host'] = 'localhost';
    $conf['db_port'] = '3306';
    $conf['db_database'] = 'dbispconfig2';
    $conf['db_user'] = 'ispconfig2';
    $conf['db_password'] = 'secret';

    ....
    $conf['dbmaster_type'] = 'mysql';
    $conf['dbmaster_host'] = 'localhost';
    $conf['dbmaster_port'] = '';
    $conf['dbmaster_database'] = 'dbispconfig';
    $conf['dbmaster_user'] = 'ispconfig';
    $conf['dbmaster_password'] = 'secret';
    ...
    Both are able to connect using the mysql command and reaching their databases ( dbispconfig2 and dbispconfig)

    server1 conf:
    $conf['db_type'] = 'mysql';
    $conf['db_host'] = 'localhost';
    $conf['db_port'] = '3306';
    $conf['db_database'] = 'dbispconfig';
    $conf['db_user'] = 'ispconfig';
    $conf['db_password'] = 'secret';
    ...

    //** Database settings for the master DB. This setting is only used in multiserver setups
    $conf['dbmaster_type'] = 'mysql';
    $conf['dbmaster_host'] = '';
    $conf['dbmaster_port'] = '';
    $conf['dbmaster_database'] = 'dbispconfig';
    $conf['dbmaster_user'] = '';
    $conf['dbmaster_password'] = 'oddpassword';
    ..
    Is there something missing from dbmaster_user?

    Both mysql slaves are running:
    server2:
    SHOW SLAVE STATUS \G
    Relay_Log_Pos: 71471865
    Relay_Master_Log_File: mysql-bin.000092
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Using_Gtid: Slave_Pos
    Gtid_IO_Pos: 1-1-3229188

    server1:
    Relay_Log_Pos: 46343772
    Relay_Master_Log_File: mysql-bin.000057
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Using_Gtid: Slave_Pos
    Gtid_IO_Pos: 2-2-1695068,0-2-982178

    I tried to php update.php the server2 as well but the config file stayed the same.

    /usr/local/ispconfig/server/server.sh (same output on both server1 and server2)
    Set Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
    13.03.2017-19:05 - DEBUG - Calling function 'check_phpini_changes' from plugin 'webserver_plugin' raised by action 'server_plugins_loaded'.
    13.03.2017-19:05 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
    finished.

    Server2 dbispconfig.server update log position 27632, but datalog is 27702.
    Server1 update log position 27698 matches latest datalog.

    Could this problem be related to the config settings?

    I'm thankful for any help or hint anyone can offer.

    Update: I also attached the htf_report.txt file generated from instructions here https://www.howtoforge.com/community/threads/please-read-before-posting.58408/
     

    Attached Files:

    Last edited: Mar 13, 2017
  2. linus

    linus Member

    I think Server2 doesn't understand that it is a different server.
    If I change the debug value of Server1 (server id 1 in the ispconfigdb.server) to Debug, both servers start to save debug info. If I change the debug value of Server 2 (server id 11) the job queue is raised by two, but nothing else happens.

    Does anyone know how the servers identify what job belongs to them?

    Server2 has its ispconfig database named dbispconfig2 and there the server table has good looking values

    server_id 11, datalog_id 27626
    server_id 1, datalog_id 27733

    BUT nothing happens in the dbispconfig2.sys_datalog :\
    server id 11, datalog_id 122
    server id 1, datalog_id 118

    How are the jobs put to the dbispconfig2.sys_datalog from the master db?
     
    Last edited: Mar 13, 2017
  3. linus

    linus Member

    Would something like this help in both server's my.conf-files? Not that I understand why dbispconfig2.sys_datalog isn't updated.

    replicate-ignore-table=dbispconfig.server
    replicate-ignore-table=dbispconfig.monitor_data
     
  4. till

    till Super Moderator Staff Member ISPConfig Developer

    That's ok as only the primary server has a sys_datalog from where all servers of the cluster are reading their changes which then get integrated into the other tables of the local database by the server process.

    Both interfaces connect to dbispconfig (if you have two active interfaces), so they write into the replicated sys_datalog of the first database while the server process of server2 is connecting to dbispconfig2 and the server process of the first server connects to dbispconfig.
     
    linus likes this.
  5. linus

    linus Member

    Thank you for your answer! Shouldn't I be able to make any individual changes to Server2 configuration at all (now Server1's config settings affects both servers)? Without individual settings, should server name in the configuration be server1.example.com and server2.example.com or just server.example.com?
     
  6. iNet Specialists

    iNet Specialists New Member

    Short answer. Yes.
    I have had some similar experiences when moving/migrating ISPConfig 3 based servers around.
    Things to check:
    1. Database configuration settings are actually stored separately for purposes of each server and the ISPConfig Web UI. The server settings are in "usr/local/ispconfig/server/lib/config.inc.php" on each server, but the Web UI settings are in "usr/local/ispconfig/interface/lib/config.inc.php". If the Web UI is running on both systems, check that it is pointing at the appropriate DB on both servers (which would be the master DB for a redundant Web UI; note: that may require some special DB user permissions.)
    2. Database User permissions may not allow Server2 to appropriately read the master DB, or it may be able to read, but not write back that it has gotten the changes (i.e. update the datalog). I have had to use phpMyAdmin to fix user permissions across multi-server setups in the past. (Even when the DBs were supposedly replicating the schema/permissions.) Sometimes this can be caused by the 'user@host' recognition process within MySQL because the Database User was created with a hostname that doesn't match its actual location. (i.e. was create as dbispconfig@server2.example.com instead of dpispconfig@localhost)
    3. Normally, when a server is added to a multi-server setup, a specific Database User (like: ispconfigsrv2) is created with specific permissions to the exact tables in the master database that it needs to read/write. That is the Database User that should appear in the server/lib/config.inc.php file where the "dbmaster_" settings are. (check user@hostname as well, same as above)
    4. Finally, I have seen the occasional instance where "localhost" maps to IPv6 :):1) but not IPv4 (127.0.0.1), or iptables/iptables6 blocks mysql on one or the other network. It never hurts to check for firewall problems or host file entries.
    Each server should show up individually and have its own settings in the Web UI. In more recent configurations, I have used local private IPs to add servers to the multi-server setup in expert mode and then change the name in the Web UI Server Tab to show its use or real hostname.
     
    linus and Jesse Norell like this.
  7. linus

    linus Member

    Thank you for your time! I probably should have found the problem by now, and I'm really trying. :\

    The usr/local/ispconfig/interface/lib/config.inc.php is the same on both servers and both use the local replicated ispconfigdb in the //** Database section. If I try to change the first database section's database to ispconfig2 no changes are made and no job process balloon appears.

    For the sake of testing I replaced all the database users in both config files to root and root's password in server2 while I logged in to the server2 gui, to be sure all rights were sufficient, but stil server2's specific configurations were still not processed! All database connections from ispconfig are using localhost, I also tried fetching over the from the other server, but same result. I tried telnetting to port 3306 to ip4, ip6 and localhost and they all worked too.

    As both servers update the config settings for debug if I change server1's settings, can the problem really be a database problem? For me it looks like server2 thinks it is server1. If I change the debug value to debug on server2 nothing happens (no debug output is started), but if I change server1's debug settings both servers start to debug.

    If I manually set the server updated column for server2 the balloon disappears, of course, but the job is of course not executed.
    I don't know where to look now.
     
  8. till

    till Super Moderator Staff Member ISPConfig Developer

    Check that the hostname in /etc/hostname and in /etc/hosts is correct for each server and then protect /etc/hosts from being altered with chattr +i.
     
  9. iNet Specialists

    iNet Specialists New Member

    Good. That is as expected.
    It CAN be a database problem. It can ALSO be a configuration problem, a hostname problem, an IP address problem, or a replication problem. Maybe Server2 really does think it is Server1 because a basic network setup file has the wrong information in it, or maybe the migration from Server2 to Server1 copied something that it should not have (like a network file with a MAC address).
    When things like this happen, I usually start by re-checking everything starting with the network settings on both machines.
    See: https://www.howtoforge.com/debian-static-ip-address

    If that all looks OK, I would use the command line to log into MySQL (as root) and get a list of the users that exist in the schema.
    Code:
    # mysql -u root -p
    mysql> use mysql;
    mysql> select host, user from user;
    
    If users are created with a host of "localhost" and are accessing on "localhost" then,
    Code:
    # dig localhost 
    and make sure it resolves to the same place the server is listening on using
    Code:
    # netstat -lntp | grep 3306 
     
  10. linus

    linus Member

    Thank you again. I checked that the hosts are ok, and assigned the chattr -i attribute. The content on server2:
    Code:
    # cat /etc/hosts
    127.0.0.1       localhost
    SERVER1-IPv4 server1.example.com
    SERVER1-IPv6 server1.example.com
    SERVER2-IPv4 server2.example.com server2
    SERVER2-IPv6 server2.example.com server2
    
    # The following lines are desirable for IPv6 capable hosts
    ::1     localhost ip6-localhost ip6-loopback
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters
    Yes, they database users are made with localhost.
    ....
    | localhost | web11u47 |
    | localhost | web12u1 |
    | localhost | web156u1 |
    | localhost | web164u1 |
    | localhost | web164u2 |
    ...
    Code:
    #dig localhost
    localhost.              10800   IN      A       127.0.0.1
    # dig localhost aaaa
    ;; ANSWER SECTION:
    localhost.              10800   IN      AAAA    ::1
    
    Code:
    # netstat -lntp | grep 3306
    tcp6       0      0 :::3306                 :::*                    LISTEN      11639/mysqld
    
    Code:
     
    # telnet localhost 3306
    Trying ::1...
    Connected to localhost.
    Escape character is '^]'.
    
    Does this give anyone any clues of what could be wrong?
     
  11. iNet Specialists

    iNet Specialists New Member

    Not directly, at least not yet.
    1 possibility, "dig" first reports 127.0.0.1 (IPv4) but "netstat" reports MySQL to be listening at :::3306 (IPv6). I am not convinced that "telnet" is a valid test since it may not resolve names the same way as PHP and may not use the "user@host" to delineate permissions for the databases/tables.
    a. You could try moving the IPv6 section to the top of the /etc/hosts file (that is where local resolution should be keyed to). Restart the networking.
    b. You could also try changing the $conf['db_host'] = 'localhost'; line in Server2 /usr/local/ispconfig/server/lib/config.inc.php to "::1" as opposed to "localhost".

    Note: This is what one of my typical servers looks like... Notice the order of the ANSWER and the ADDITIONAL
    Code:
    # dig localhost
    
    ; <<>> DiG 9.9.4-RedHat-9.9.4-38.el7_3.2 <<>> localhost
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49686
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 2
    
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;localhost.  IN  A
    
    ;; ANSWER SECTION:
    localhost.  86400  IN  A  127.0.0.1
    
    ;; AUTHORITY SECTION:
    localhost.  86400  IN  NS  localhost.
    
    ;; ADDITIONAL SECTION:
    localhost.  86400  IN  AAAA  ::1
    
    ;; Query time: 1 msec
    ;; SERVER: 192.168.66.237#53(192.168.66.237)
    ;; WHEN: Thu Mar 16 09:34:06 CDT 2017
    ;; MSG SIZE  rcvd: 96
    
    # netstat -lntp | grep 3306
    tcp  0  0 0.0.0.0:3306  0.0.0.0:*  LISTEN  649/mysqld
    
    # ip addr show lo
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      inet 127.0.0.1/8 scope host lo
      inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever
     
    linus likes this.
  12. iNet Specialists

    iNet Specialists New Member

    This makes me think that there are some other "strange" possibilities that you should really confirm.

    Note: These are only just ideas of WHY Server2 thinks it is Server1...

    1. Use the "ip" utility on both servers to confirm that there is not a duplicated MAC Address somewhere. Because MySQL is listening on ":::3306", that MAY include local IPv6 addresses that are generated automatically using the HWADDR of the interfaces. Effectively, both servers would have the same IPv6 local address on the same LAN. ( I have witnessed this in a VM environment and it has some really strange side effects. )
    Example of Local IPv6 relationship to MAC (HWADDR)
    Code:
    # ip addr show eth1
    eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
      link/ether YY:AA:BB:CC:DD:EE brd ff:ff:ff:ff:ff:ff
      inet6 fe80::YY(+/-)2AA:BBff:feCC:DDEE/64 scope link
    2. Is it possible that the Database for Server2 (dbispconfig2) is REALLY a COPY of the database for Server1 (dbispconfig)? Could this be the case on both servers? Or, just one server? Is it possible that the Replication process is the root cause?
    Of course, the only way I can think of confirming that is to do a "mysqldump" of both databases and using "diff" to compare the *.sql files. Row IDs may make them look different. You could try that on just the 'server' tables.

    3. Is it possible that the server.id table entry for Server2 is the same as the server.id table entry for Server1 causing Server2 to read the records for Server1 from the db_master when the server.sh runs? (I suppose that may have happened if the database for Server2 was renamed from 'dbispconfig' to 'dbispconfig2' sometime during or after migration/move.)
     
  13. linus

    linus Member

    Thank you very much for your help. Here are the results for the tests:
    a. Ok, I did, but it didn't change.
    b. $conf['db_host'] = '::1';
    generated:
    Code:
     PHP Warning:  mysqli_connect(): php_network_getaddresses: getaddrinfo failed: Name or service not known in /usr/local/ispconfig/server/lib/classes/db_mysql.inc.php on line 79
    PHP Warning:  mysqli_connect(): (HY000/2002): php_network_getaddresses: getaddrinfo failed: Name or service not known in /usr/local/ispconfig/server/lib/classes/db_mysql.inc.php on line 79 
    (Although "mysql -uispconfig -pc12.....2 -h ::1" works well)

    Code:
    # ip addr show lo
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    
    When I added ::1 in top of the resolv.conf file I also get the fancy results:
    Code:
    # dig localhost
    
    ; <<>> DiG 9.9.5-9+deb8u10-Debian <<>> localhost
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29272
    ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 2
    
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;localhost.                     IN      A
    
    ;; ANSWER SECTION:
    localhost.              604800  IN      A       127.0.0.1
    
    ;; AUTHORITY SECTION:
    localhost.              604800  IN      NS      localhost.
    
    ;; ADDITIONAL SECTION:
    localhost.              604800  IN      AAAA    ::1
    
    ;; Query time: 0 msec
    ;; SERVER: ::1#53:):1)
    ;; WHEN: Thu Mar 16 22:39:45 EET 2017
    ;; MSG SIZE  rcvd: 96
    
    
    Code:
     Server2
    # ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether 00:50:56:3c:f2:64 brd ff:ff:ff:ff:ff:ff
        inet 185.xxx.xxx.17/24 brd 185.2.100.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 2a02:c205:xxxx::1/64 scope global
           valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fe3c:f264/64 scope link
           valid_lft forever preferred_lft forever
    
    Code:
    server1
    # ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether 00:50:56:3c:f2:5a brd ff:ff:ff:ff:ff:ff
        inet 79.xxx.xxx.245/24 brd 79.143.191.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 2a02:xxx:xxx:3893::1/64 scope global
           valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fe3c:f25a/64 scope link
           valid_lft forever preferred_lft forever
    
    The database ispconfig and ispconfig2 are a bit different
    Code:
    server2, dbispconfig
    server table (same on both):
    id 11, server2,  update: 27772 (should be 27851)
    id: 1, server1, update: 27855 (ok)
    
    sys_datalog table (same on both)
    datalog_id: 27855 server_id:1
    datalog_id: 27851 server_id:11
    
    server2, dbispconfig2:
    server table (same on both )
    id 11, server2,  update: 27626 (should be 27851)
    id: 1, server1, update: 27855 (ok)
    
    sys_datalog (same on both)
    datalog_id:140,server_id:11
    datalog_id: 136, server_id: 1
    
    server1, dbispconfig2:
    server table (same on both)
    id 11, server2,  update: 27626 (should be 27851, different from ispconfig probably because I tried setting it manually)
    id: 1, server1, update: 27855 (ok)
    
    sys_datalog (same on both)
    datalog_id:140,server_id:11
    datalog_id: 136, server_id: 1
    
    server1, dbispconfig
    server table (same on both)
    id 11, server2, update  27851 (should be 27851)
    id 1, server1, update 27855 (ok)
    
    sys_datalog (same on both)
    datalog_id: 27855, server_id:1
    datalog_id: 27851, server_id: 11
    
    I didn't dump, but I think this info shows they are not copies, but maybe I didn't understand. Ths server ids look correct too.

    When update is set manually on dbispconfig2 database to correct server datalog one (27851) nothing happens, but when set in dbispconfig database the balloon disappears (regardless if from server1 or server2 mysql server).

    When making configuration changes to Server1 log id in server table gets updated to the correct automatically on both ispconfig and ispconfig2 databases, but not for Server2.

    Where in the PHP code the server understands what server id it should associate to when performing the server config?
    I hope the data I provided can be of any use. I'm really grateful for your efforts.
     
    Last edited: Mar 17, 2017
  14. iNet Specialists

    iNet Specialists New Member

    Re-check all of your config files and look for the line that has...
    Code:
    $conf['server_id'] = 
    Make sure that the correct 'server_id' from the config file matches the server that you are looking at the file on (according to your database values above).
     
    linus likes this.
  15. linus

    linus Member

    Days of troubleshooting has come to an end! Thank you iNet Specialists and Till for your insights!

    The best solutions are the simple ones (I have no idea why it wasn't right, but it is now):

    On server2:
    Code:
    /usr/local/ispconfig/server/lib/config.inc.php
    // $conf['server_id'] = '1'; pointed to server1
    $conf['server_id'] = '11'; // server2
    The job queue disappeared and I do not miss the red balloon =)
    ok.jpg
     
    Tuumke likes this.

Share This Page