Websites and ISPConfig unreachable - ISP and server normal

Discussion in 'General' started by unsichtbare, Dec 2, 2020.

  1. unsichtbare

    unsichtbare Member HowtoForge Supporter

    Over the last couple of days, at least once per day but never at the same time, my websites and the ISPConfig CP have become unreachable for about 5 minutes. There are no corresponding entries in the Apache2/error.log file, but I have posted the syslog entries from the relevant period. During outages I can still connect over SSH, top looks normal, and there are no ISP outages to account for it. This is a single-site ISPConfig installation with a static IP, fully updated, on Ubuntu 18.04 LTS, also fully updated.
    THX,
    -JB
    Syslog for that period:
    Code:
    Dec  2 15:09:01 hosting systemd[1]: Starting Clean php session files...
    Dec  2 15:09:01 hosting systemd[1]: Started Clean php session files.
    Dec  2 15:10:01 hosting CRON[25742]: (web1) CMD (/usr/bin/php -q /var/www/clients/client1/vmsources.com/web/support/api/cron.php >/dev/null 2>&1 #vmsources.com)
    Dec  2 15:10:01 hosting CRON[25743]: (root) CMD (/usr/local/ispconfig/server/cron.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:10:01 hosting CRON[25744]: (root) CMD (/usr/local/ispconfig/server/server.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:10:01 hosting CRON[25746]: (getmail) CMD (/usr/local/bin/run-getmail.sh > /dev/null 2>> /dev/null)
    Dec  2 15:10:01 hosting CRON[25748]: (web1) CMD (/usr/bin/php -q /var/www/clients/client1/web1/web/share/cron.php >>/var/www/clients/client1/web1/private/cron.log 2>>/var/www/clients/client1/web1/private/cron_error.log #vmsources.com)
    Dec  2 15:10:03 hosting pure-ftpd: ([email protected]) [INFO] New connection from 127.0.0.1
    Dec  2 15:10:03 hosting pure-ftpd: ([email protected]) [INFO] Logout.
    Dec  2 15:10:03 hosting dovecot: imap-login: Disconnected (disconnected before auth was ready, waited 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<KJGUTYDuytcfFIYUFouyFGOjulyg>
    Dec  2 15:10:03 hosting postfix/smtpd[25831]: connect from localhost[127.0.0.1]
    Dec  2 15:10:03 hosting postfix/smtpd[25831]: lost connection after CONNECT from localhost[127.0.0.1]
    Dec  2 15:10:03 hosting postfix/smtpd[25831]: disconnect from localhost[127.0.0.1] commands=0/0
    Dec  2 15:10:03 hosting dovecot: pop3-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<alkgHLKughLKghILug:LKJGh>
    Dec  2 15:11:01 hosting CRON[25850]: (root) CMD (/usr/local/ispconfig/server/server.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:11:01 hosting CRON[25851]: (root) CMD (/usr/local/ispconfig/server/cron.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:12:01 hosting CRON[25880]: (root) CMD (/usr/local/ispconfig/server/server.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:12:01 hosting CRON[25881]: (root) CMD (/usr/local/ispconfig/server/cron.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:13:01 hosting CRON[25918]: (root) CMD (/usr/local/ispconfig/server/cron.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:13:01 hosting CRON[25919]: (root) CMD (/usr/local/ispconfig/server/server.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:14:01 hosting CRON[25952]: (root) CMD (/usr/local/ispconfig/server/server.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:14:01 hosting CRON[25953]: (root) CMD (/usr/local/ispconfig/server/cron.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:15:01 hosting CRON[26009]: (root) CMD (/usr/local/ispconfig/server/cron.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:15:01 hosting CRON[26008]: (root) CMD (/usr/local/ispconfig/server/server.sh 2>&1 | while read line; do echo `/bin/date` "$line" >> /var/log/ispconfig/cron.log; done)
    Dec  2 15:15:01 hosting CRON[26012]: (getmail) CMD (/usr/local/bin/run-getmail.sh > /dev/null 2>> /dev/null)
    Dec  2 15:15:02 hosting pure-ftpd: ([email protected]) [INFO] New connection from 127.0.0.1
    Dec  2 15:15:02 hosting pure-ftpd: ([email protected]) [INFO] Logout.
    Dec  2 15:15:02 hosting dovecot: pop3-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<Y0ketXy1ItR/AAAB>
    Dec  2 15:15:02 hosting dovecot: imap-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<2VAetXy1cKd/AAAB>
    Dec  2 15:15:02 hosting postfix/smtpd[26087]: connect from localhost[127.0.0.1]
    Dec  2 15:15:02 hosting postfix/smtpd[26087]: lost connection after CONNECT from localhost[127.0.0.1]
    Dec  2 15:15:02 hosting postfix/smtpd[26087]: disconnect from localhost[127.0.0.1] commands=0/0
    
    Code:
    [email protected]:/var/log# df -h
    Filesystem      Size  Used Avail Use% Mounted on
    udev            3.9G     0  3.9G   0% /dev
    tmpfs           798M  1.2M  797M   1% /run
    /dev/sda2       196G  104G   82G  56% /
    tmpfs           3.9G     0  3.9G   0% /dev/shm
    tmpfs           5.0M  4.0K  5.0M   1% /run/lock
    tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
    /dev/loop0       98M   98M     0 100% /snap/core/9289
    /dev/loop1       94M   94M     0 100% /snap/core/9066
    tmpfs           798M     0  798M   0% /run/user/0
    [email protected]:/var/log#
    
     
  2. Jesse Norell

    Jesse Norell ISPConfig Developer Staff Member ISPConfig Developer

    You might check /var/log/ispconfig/cron.log and ispconfig.log from that same timeframe and run 'dmesg' for more clues. Check logs for mysql errors. Is the cli sluggish or as responsive as normal? What are the most recent entries in your web server access logs at the time? Does restarting apache or php daemon(s) immediately resolve it?
     
  3. Taleman

    Taleman Well-Known Member HowtoForge Supporter

    Do you monitor free memory, disk utilization and system load? Are they high or low during that outage?
     
  4. unsichtbare

    unsichtbare Member HowtoForge Supporter

    I don't usually look at this log, but it looks pretty consistent for the timeframe in question:
    Code:
    Wed Dec 2 15:09:01 UTC 2020 finished server.php.
    Wed Dec 2 15:10:01 UTC 2020 02.12.2020-15:10 - DEBUG - Calling function 'check_phpini_changes' from plugin 'webserver_plugin' raised by action 'server_plugins_loaded'.
    Wed Dec 2 15:10:01 UTC 2020 02.12.2020-15:10 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
    Wed Dec 2 15:10:01 UTC 2020 finished server.php.
    Wed Dec 2 15:11:01 UTC 2020 02.12.2020-15:11 - DEBUG - Calling function 'check_phpini_changes' from plugin 'webserver_plugin' raised by action 'server_plugins_loaded'.
    Wed Dec 2 15:11:01 UTC 2020 02.12.2020-15:11 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
    Wed Dec 2 15:11:01 UTC 2020 finished server.php.
    Wed Dec 2 15:12:01 UTC 2020 02.12.2020-15:12 - DEBUG - Calling function 'check_phpini_changes' from plugin 'webserver_plugin' raised by action 'server_plugins_loaded'.
    Wed Dec 2 15:12:01 UTC 2020 02.12.2020-15:12 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
    Wed Dec 2 15:12:01 UTC 2020 finished server.php.
    Wed Dec 2 15:13:01 UTC 2020 02.12.2020-15:13 - DEBUG - Calling function 'check_phpini_changes' from plugin 'webserver_plugin' raised by action 'server_plugins_loaded'.
    Wed Dec 2 15:13:01 UTC 2020 02.12.2020-15:13 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
    Wed Dec 2 15:13:01 UTC 2020 finished server.php.
    Wed Dec 2 15:14:01 UTC 2020 02.12.2020-15:14 - DEBUG - Calling function 'check_phpini_changes' from plugin 'webserver_plugin' raised by action 'server_plugins_loaded'.
    Wed Dec 2 15:14:01 UTC 2020 02.12.2020-15:14 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
    Wed Dec 2 15:14:01 UTC 2020 finished server.php.
    Wed Dec 2 15:15:01 UTC 2020 02.12.2020-15:15 - DEBUG - Calling function 'check_phpini_changes' from plugin 'webserver_plugin' raised by action 'server_plugins_loaded'.
    Wed Dec 2 15:15:01 UTC 2020 02.12.2020-15:15 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
    Wed Dec 2 15:15:01 UTC 2020 finished server.php.
    
    I've not had the opportunity to try restarting Apache during the outage as it only lasts about 5 minutes.
     
  5. unsichtbare

    unsichtbare Member HowtoForge Supporter

    I ran top during the last outage and saw basically no usage on the server - memory or otherwise. Disks have plenty of space.
     
  6. unsichtbare

    unsichtbare Member HowtoForge Supporter

    Access logs may be more revealing. I see a bunch of 503, 405, 301, 302 during the time of the outage:
    Code:
    66.102.7.200 - - [02/Dec/2020:15:10:48 +0000] "GET / HTTP/1.1" 200 14009 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36 Google (+https://developers.google.com/+/web/snippet/)"
    209.58.160.49 - - [02/Dec/2020:15:09:38 +0000] "GET / HTTP/1.1" 503 13313 "-" "Site24x7"
    185.153.248.168 - - [02/Dec/2020:15:10:09 +0000] "GET / HTTP/1.1" 503 13313 "-" "Site24x7"
    185.56.90.140 - - [02/Dec/2020:15:10:09 +0000] "GET / HTTP/1.1" 503 13313 "-" "Site24x7"
    54.36.149.35 - - [02/Dec/2020:15:10:31 +0000] "GET /vmware-training/class-schedule/week.listevents/2015/12/07/- HTTP/1.1" 404 38388 "-" "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"
    121.244.91.46 - - [02/Dec/2020:15:10:10 +0000] "GET / HTTP/1.1" 503 13313 "-" "Site24x7"
    65.117.249.69 - - [02/Dec/2020:15:10:22 +0000] "GET /share/error/503.html HTTP/1.1" 503 1294 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36"
    114.119.132.23 - - [02/Dec/2020:15:10:26 +0000] "GET /manual/de/ru/new_features_2_0.html HTTP/1.1" 503 4143 "-" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; PetalBot;+https://aspiegel.com/petalbot)"
    24.207.174.97 - - [02/Dec/2020:15:10:32 +0000] "GET /share/ocs/v2.php/apps/notifications/api/v2/notifications HTTP/1.1" 302 1037 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.67 Safari/537.36 Edg/87.0.664.52"
    24.207.174.97 - - [02/Dec/2020:15:10:36 +0000] "GET /share/error/503.html HTTP/1.1" 503 1294 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.67 Safari/537.36 Edg/87.0.664.52"
    192.168.99.118 - - [02/Dec/2020:15:10:38 +0000] "GET /share/error/503.html HTTP/1.1" 503 5553 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:83.0) Gecko/20100101 Firefox/83.0"
    192.168.99.101 - - [02/Dec/2020:15:11:03 +0000] "GET /share/error/503.html HTTP/1.1" 503 1294 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:11:05 +0000] "PROPFIND /share/remote.php/dav/files/9FB3528E-9FFC-45CC-B7E8-3D3E4A19EB36/ HTTP/1.1" 302 4762 "-" "Mozilla/5.0 (Windows) mirall/3.0.3stable-Win64 (build 20201029) (Nextcloud)"
    192.168.99.118 - - [02/Dec/2020:15:11:12 +0000] "GET /share/ocs/v2.php/apps/notifications/api/v2/notifications?format=json HTTP/1.1" 302 4762 "-" "Mozilla/5.0 (Windows) mirall/3.0.3stable-Win64 (build 20201029) (Nextcloud)"
    119.82.29.196 - - [02/Dec/2020:15:10:09 +0000] "GET / HTTP/1.1" 503 4843 "-" "Site24x7"
    150.109.167.115 - - [02/Dec/2020:15:10:10 +0000] "GET / HTTP/1.1" 503 4843 "-" "Site24x7"
    192.168.99.101 - - [02/Dec/2020:15:08:29 +0000] "GET /share/ocs/v2.php/apps/notifications/api/v2/notifications HTTP/1.1" 500 1466 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    24.207.174.97 - - [02/Dec/2020:15:11:36 +0000] "GET /share/error/503.html HTTP/1.1" 503 1294 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.67 Safari/537.36 Edg/87.0.664.52"
    192.168.99.118 - - [02/Dec/2020:15:11:53 +0000] "PROPFIND /share/remote.php/dav/files/25468A63-A341-4034-BA73-553D8ED7CDAE/ HTTP/1.1" 302 4762 "-" "Mozilla/5.0 (Windows) mirall/3.0.3stable-Win64 (build 20201029) (Nextcloud)"
    192.168.99.101 - - [02/Dec/2020:15:12:05 +0000] "PROPFIND /share/remote.php/dav/files/9FB3528E-9FFC-45CC-B7E8-3D3E4A19EB36/ HTTP/1.1" 302 4762 "-" "Mozilla/5.0 (Windows) mirall/3.0.3stable-Win64 (build 20201029) (Nextcloud)"
    65.117.249.69 - - [02/Dec/2020:15:07:32 +0000] "GET /share/ocs/v2.php/apps/notifications/api/v2/notifications HTTP/1.1" 500 1482 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:12:29 +0000] "PUT /share/index.php/apps/user_status/heartbeat HTTP/1.1" 302 1037 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/503.html HTTP/1.1" 302 1074 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
    192.168.99.101 - - [02/Dec/2020:15:13:33 +0000] "PUT /share/error/405.html HTTP/1.1" 302 510 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
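
To see more easily when the 503 responses start and stop, the access log can be tallied by minute and status code. The following Python is only a sketch: the function name `status_by_minute` and the regular expression are assumptions for illustration, and it expects lines in the same combined log format as the excerpt above (the user agent in the last sample line is shortened).

```python
import re
from collections import Counter

# Captures the timestamp down to the minute and the HTTP status code
# from a combined-format access log line.
LINE_RE = re.compile(
    r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [+-]\d{4}\] "[^"]*" (\d{3}) '
)

def status_by_minute(lines):
    """Count (minute, status) pairs across access log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines that do not look like access log entries are skipped
            counts[(match.group(1), match.group(2))] += 1
    return counts

sample = [
    '209.58.160.49 - - [02/Dec/2020:15:09:38 +0000] "GET / HTTP/1.1" 503 13313 "-" "Site24x7"',
    '185.153.248.168 - - [02/Dec/2020:15:10:09 +0000] "GET / HTTP/1.1" 503 13313 "-" "Site24x7"',
    '66.102.7.200 - - [02/Dec/2020:15:10:48 +0000] "GET / HTTP/1.1" 200 14009 "-" "Mozilla/5.0"',
]

for (minute, status), n in sorted(status_by_minute(sample).items()):
    print(minute, status, n)
```

Feeding it the whole log file (for example `status_by_minute(open(path))`, with `path` pointing at the site's access log) would show which minutes are dominated by 503s.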
    
     
  7. Taleman

    Taleman Well-Known Member HowtoForge Supporter

    By disk utilization I did not mean free space available, but the intensity of disk reads and writes. Commands like iotop and the other *top commands are useful.
    Also check network use, with nettop for example.
    The load number is important: the host may not seem to be doing anything while the load is 20. That means processes are waiting for something, and you must figure out what the bottleneck is. The load number should be below N*0.7, where N is the number of processor cores.
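
As a rough illustration of the rule of thumb above, this Python sketch compares the 1-minute load average against 0.7 times the core count. The function name `load_is_high` and the use of the 1-minute figure are assumptions for illustration; the 0.7 factor is the one from the post.

```python
import os

def load_is_high(load, cores, factor=0.7):
    """Return True when the load average exceeds factor * cores."""
    return load > cores * factor

if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"load {one_min:.2f}, cores {cores}, threshold {cores * 0.7:.2f}")
    if load_is_high(one_min, cores):
        print("load is above the threshold, so something is probably waiting")
```

With the example from the post, a load of 20 on a 4-core host is far above the threshold of 2.8, even if top shows almost no CPU use.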
     
  8. Jesse Norell

    Jesse Norell ISPConfig Developer Staff Member ISPConfig Developer

    You can install atsar to read system info, and be able to look back at the stats to see if some resource was short, or whether there was unusual activity like disk load, memory swapping, etc. It won't tell you what processes were involved, but might dig up some clues as to their effect.
     
  9. Jesse Norell

    Jesse Norell ISPConfig Developer Staff Member ISPConfig Developer

    Were you able to confirm that apache was still running at the time?
     
  10. unsichtbare

    unsichtbare Member HowtoForge Supporter

    No, I was not able to view Apache status.
    Interesting correlation:
    The outage seems to occur when re-establishment of an IPsec VPN to an AD Domain (every 28800 seconds) takes more than a few seconds. On most cycles the VPN is re-established instantly with no disruption, but sometimes it takes several seconds, and that seems to be when the disruption happens.
    The only reason for the VPN is that one of the sub-domains uses LDAP to authenticate its users. However, all of the sub-domains are offline when that one is offline, and this is the problem. I have verified that only LDAP traffic is going over the VPN.
    THX,
    -John
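
One way to test the suspected correlation is to probe the LDAP server's TCP port with a short timeout at regular intervals and log how long each attempt takes. The Python below is a minimal sketch under assumptions: the helper name `probe_tcp` is made up for illustration, and the thread does not say which host or port is in use (389 is the standard LDAP port, 636 the usual one for LDAPS).

```python
import socket
import time

def probe_tcp(host, port, timeout=3.0):
    """Try a TCP connect; return (reachable, seconds_taken)."""
    start = time.monotonic()
    try:
        # create_connection gives up after `timeout` seconds instead of
        # hanging while the tunnel is down.
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers timeouts, refused connections and name resolution errors.
        reachable = False
    return reachable, time.monotonic() - start
```

Calling something like `probe_tcp("192.0.2.10", 389)` once a minute (the address here is a placeholder) and writing the result to a log would show whether slow or failed connects line up with the 503 periods.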
     
  11. Jesse Norell

    Jesse Norell ISPConfig Developer Staff Member ISPConfig Developer

    Maybe set up monit to restart apache (and/or php) when it no longer gets a successful http reply? Or at least do that in the interim, until you're able to troubleshoot exactly what's going on.
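
The idea behind that suggestion (check for a successful HTTP reply, restart after repeated failures) can be sketched in Python. This is not monit configuration, only an illustration of the logic; the function names, the three-failure threshold and the timeout are assumptions, and the restart command assumes a systemd host where the unit is called apache2, as in this thread.

```python
import http.client
import subprocess
import urllib.request

def http_ok(url, timeout=10.0):
    """Return True if the URL answers with a 2xx status (after redirects)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # urllib raises HTTPError for 4xx/5xx replies; that, timeouts and
        # refused connections are all OSError subclasses.
        return False

def should_restart(results, failures_needed=3):
    """True when the last `failures_needed` checks all failed."""
    recent = results[-failures_needed:]
    return len(recent) == failures_needed and not any(recent)

def restart_apache():
    # Assumption: systemd is in use and the service unit is named apache2.
    subprocess.run(["systemctl", "restart", "apache2"], check=False)
```

A loop or cron job could call `http_ok` once a minute, append each result to a list, and call `restart_apache()` when `should_restart` returns True. Requiring several consecutive failures avoids restarting on a single slow reply.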
     
  12. unsichtbare

    unsichtbare Member HowtoForge Supporter

    I was able to check Apache during an outage today (VPN down) and Apache was running:
    Code:
    [email protected]:~# systemctl status apache2
    ● apache2.service - The Apache HTTP Server
       Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
      Drop-In: /lib/systemd/system/apache2.service.d
               └─apache2-systemd.conf
       Active: active (running) since Tue 2020-11-10 21:17:45 UTC; 4 weeks 0 days ago
      Process: 4607 ExecReload=/usr/sbin/apachectl graceful (code=exited, status=0/SUCCESS)
     Main PID: 6137 (apache2)
        Tasks: 38 (limit: 4915)
       CGroup: /system.slice/apache2.service
               ├─ 1123 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─ 3480 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─ 4715 vlogger (access log)
               ├─ 4716 /usr/sbin/apache2 -k start
               ├─ 6137 /usr/sbin/apache2 -k start
               ├─ 6469 /usr/sbin/apache2 -k start
               ├─ 9390 /usr/sbin/apache2 -k start
               ├─10319 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─10343 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─10344 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─10345 /usr/sbin/apache2 -k start
               ├─10346 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─10352 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─10354 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─10356 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─10358 /usr/sbin/apache2 -k start
               ├─10359 php-cgi7.4 -d open_basedir=/var/www/clients/client1/web1/web:/var/www/clients/client1/web1/private:/var/www/clients/client1/web1/tmp:/var/www/clients/client1/web1/web/adm
               ├─10361 /usr/sbin/apache2 -k start
               ├─10395 /usr/sbin/apache2 -k start
               ├─10396 /usr/sbin/apache2 -k start
               ├─10397 /usr/sbin/apache2 -k start
               ├─10406 /usr/sbin/apache2 -k start
               ├─10407 /usr/sbin/apache2 -k start
               ├─10408 /usr/sbin/apache2 -k start
               ├─10653 /usr/sbin/apache2 -k start
               ├─10654 /usr/sbin/apache2 -k start
               ├─10761 /usr/sbin/apache2 -k start
               ├─10763 /usr/sbin/apache2 -k start
               ├─10766 /usr/sbin/apache2 -k start
               ├─11003 /usr/sbin/apache2 -k start
               ├─11004 /usr/sbin/apache2 -k start
               ├─11028 /usr/sbin/apache2 -k start
               ├─11029 /usr/sbin/apache2 -k start
               ├─11030 /usr/sbin/apache2 -k start
               ├─11031 /usr/sbin/apache2 -k start
               ├─11033 /usr/sbin/apache2 -k start
               ├─11038 /usr/sbin/apache2 -k start
               └─11042 /usr/sbin/apache2 -k start
    
    Dec 06 06:25:01 hosting.mysite.com systemd[1]: Reloaded The Apache HTTP Server.
    Dec 07 06:25:02 hosting.mysite.com systemd[1]: Reloading The Apache HTTP Server.
    Dec 07 06:25:02 hosting.mysite.com apachectl[10200]: AH00548: NameVirtualHost has no effect and will be removed in the next release /etc/apache2/sites-enabled/000-ispconfig.vhost:7
    Dec 07 06:25:02 hosting.mysite.com systemd[1]: Reloaded The Apache HTTP Server.
    Dec 08 06:25:02 hosting.mysite.com systemd[1]: Reloading The Apache HTTP Server.
    Dec 08 06:25:02 hosting.mysite.com apachectl[3566]: AH00548: NameVirtualHost has no effect and will be removed in the next release /etc/apache2/sites-enabled/000-ispconfig.vhost:7
    Dec 08 06:25:02 hosting.mysite.com systemd[1]: Reloaded The Apache HTTP Server.
    Dec 09 06:25:02 hosting.mysite.com systemd[1]: Reloading The Apache HTTP Server.
    Dec 09 06:25:02 hosting.mysite.com apachectl[4607]: AH00548: NameVirtualHost has no effect and will be removed in the next release /etc/apache2/sites-enabled/000-ispconfig.vhost:7
    Dec 09 06:25:02 hosting.mysite.com systemd[1]: Reloaded The Apache HTTP Server.
    Literally the only thing on the VPN is LDAP authentication against a remote AD Domain.
    THX,
     
  13. unsichtbare

    unsichtbare Member HowtoForge Supporter

    Could it be that the php7.1-ldap apache extension freezes everything when one subdomain cannot reach the configured LDAP server?
     
  14. Jesse Norell

    Jesse Norell ISPConfig Developer Staff Member ISPConfig Developer

    Are there a lot of php processes running and not obviously doing anything? If so, attach to one (or maybe try a few) with strace and see what's going on.
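
To pick candidates for strace, it can help to list the PHP processes with their current state first. The Python sketch below reads `/proc/<pid>/stat` on Linux. The function names are assumptions for illustration, and the default prefix `php-cgi` is taken from the process names shown in the `systemctl status` output earlier in the thread.

```python
import os

def parse_stat(text):
    """Extract (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state

def find_processes(prefix="php-cgi", proc_root="/proc"):
    """Yield (pid, comm, state) for processes whose name starts with prefix."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return  # no procfs available at this path
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # the process exited while we were looking
        if comm.startswith(prefix):
            yield pid, comm, state

if __name__ == "__main__":
    for pid, comm, state in find_processes():
        # S = sleeping, D = uninterruptible wait, R = running
        print(f"{pid}\t{comm}\t{state}\tstrace -p {pid}")
```

Each output line ends with the `strace -p <pid>` command that would attach to that process. Many processes sitting in state S or D during an outage would fit the picture of PHP waiting on something external.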
     
  15. ahrasis

    ahrasis Well-Known Member HowtoForge Supporter

    Could be. Personally, php "freezes" have always been a problem for me, and I have to monitor via monit to ensure the processes are restarted if they stop for any reason.
     
