Originally Posted by falko
Any errors in Apache's error log?
Can you tell me one of the affected domains so that I can do some tests?
Thanks for the reply, falko. What turned out to be a DDoS attack has subsided and things are back to normal. I PMed you some domain names.
We concluded this was caused by a DDoS attack. Here is what we know now.
The problem first began to appear around 8pm server time on Wednesday. By 7:30pm Thursday night the problem had disappeared and things were back to normal.
At first we assumed we were seeing a hacker or spam attack, possibly a password-cracking attack, on our server. It took hours to realize it was actually a DDoS attack. Early on, we did what made sense to combat a hack attack; later our strategy shifted to stopping a DDoS attack. Here is our diary of events and actions taken over the period of the attack.
- 10pm-12am Wed/Thurs - 2 hours after the attack started and an hour before it reached its peak, changed all user passwords on the server to 12-character alphanumerics. Average number of attempts required to crack one of these passwords: 1,106,838,475,932,200,000,000 -- just over 1 sextillion. If this was an attack on any of the server's domains or by email, our enemies were in for some hard work.
- 12am-1am Thurs - Conducted a site-by-site security and stability check of all sites on server. All sites appeared intact and secure. Server's domains are still inaccessible due to what appear to be DNS timeouts. Server seems to be VERY busy handling traffic. DNS server is holding its own. Experienced 3 or 4 server connection losses (PuTTY/SSH) during this period.
- 1am-1:30am - With passwords secure and server sites stable, we shut down the most vulnerable and attackable apps on server -- including guestbooks, MySQL, etc. Database passwords for all MySQL apps were also changed. If domains can't be accessed, there's no sense inviting trouble. Still seeing 1 or 2 server connection losses per hour (PuTTY/SSH).
- 1:30am-2:30am - Posted inquiries and requests to trusted Linux support sites seeking advice on how to analyze, diagnose and solve the problem.
- 2:30am - Went to bed exhausted. This was going to take a while!
- 5:30am-7:30am - Rebooted server and reviewed overnight traffic and logs. 40,000 errors had occurred overnight in the email system alone due to password failures. The locally installed Mailman listserv app also logged many errors, but we could not identify any successful security breaches. All user domains still seem intact and undamaged. This was when we began to believe a DDoS attack was the cause of the problem. Domains on server are still inaccessible -- apparently due to heavy net traffic. Despite the server password changes, the attack had not subsided. Other than full log files, damage seemed limited. The security barriers were holding up well. Checked web advice requests. No responses yet.
- 7:30am-11:30am - Posted a few more web advice requests. Responded to questions, suggestions and advice from hosting clients and web contacts. Researched as-yet-unidentified server vulnerabilities and the tools and methods used to identify, analyze and fight DDoS attacks. Also researched server hardening strategies, techniques, tools and options. The news isn't good. Even the world's top DDoS experts say these attacks are tough to identify, hard to fight and can take many forms. The tools available to fight them are also limited and expensive. Ran a few tests, but made no server changes. There's no sense being caught with our pants down while we're under attack!
- 11:30am-12:30pm - Domains on server still inaccessible due to what appears to be heavy traffic. A second error review showed thousands of new errors from the email server. These guys weren't giving up easily! Decided to shut down Apache, the email server, the Mailman listserv and the DNS server.
- 1:30pm-3:30pm - Traffic loads continued to fall. Server connection losses are not as frequent. Some local domain home pages do occasionally appear now. However, the attack has not completely subsided. Left key server apps shut down during this period.
- 3:30pm-6:30pm - Either the attack mitigation strategies were successful or the attack was timed to last 24 hours. Over these hours the intensity of the attack gradually subsided. As it did, we brought more of the server's apps back online. By 5:00pm, 50% of local domain requests were successful in displaying the domain's home page. Those that were MOST successful were the ones routed to the IP address allocated to the second DNS. A server damage assessment conducted after 18 hours showed no visible user domain or server damage or penetration.
- By 8:30pm, roughly 24 hours after the attack began, all domains were working again and the Apache/cPanel/WHM screen had disappeared.
We now believe the Apache/cPanel/WHM screen we were seeing was being displayed by our server supplier's upstream DNS server because our local DNS server was failing to respond fast enough. The fact that we heard nothing from our server supplier during this period suggests ours was not the only server in their datacenter that was under attack...
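For anyone who wants to check the password arithmetic in the first diary entry, here is a rough sketch. It assumes a 62-symbol alphabet (a-z, A-Z, 0-9) and an attacker who tries every candidate once in random order. Neither assumption comes from our actual password generator, so the result will not match the figure quoted in the diary exactly, though it lands in the same sextillion range.

```python
import string

# Assumption: passwords draw from a-z, A-Z and 0-9 (62 symbols).
ALPHABET = string.ascii_letters + string.digits
LENGTH = 12  # password length from the diary entry


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of the given length."""
    return alphabet_size ** length


def average_guesses(alphabet_size: int, length: int) -> int:
    """Expected guesses for an exhaustive search in random order.

    On average the right password turns up about halfway through,
    so this is roughly half the keyspace.
    """
    return keyspace(alphabet_size, length) // 2


if __name__ == "__main__":
    size = len(ALPHABET)
    print(f"alphabet size:   {size}")
    print(f"keyspace:        {keyspace(size, LENGTH):,}")
    print(f"average guesses: {average_guesses(size, LENGTH):,}")
```

Changing the alphabet (say, lowercase plus digits only) or the length shows how quickly the number moves, which is the real point: length and alphabet size matter far more than the exact total.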
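The overnight log reviews in the diary mostly came down to counting failed logins. A small script along these lines could do the tallying. The log line format, the `FAILED_LOGIN` pattern and the sample lines are all made up for illustration; the real pattern depends on which mail server is writing the log.

```python
import re
from collections import Counter

# Hypothetical pattern: matches lines such as
#   "... authentication failed ... rip=203.0.113.7"
# Adjust it to whatever your mail server actually logs.
FAILED_LOGIN = re.compile(
    r"authentication failed.*?rip=(\d{1,3}(?:\.\d{1,3}){3})"
)


def count_failures(lines):
    """Return a Counter mapping source IP -> number of failed logins."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines, not taken from our server.
SAMPLE = [
    "Jan 01 03:10:01 mail auth: authentication failed user=bob rip=203.0.113.7",
    "Jan 01 03:10:02 mail auth: authentication failed user=amy rip=203.0.113.7",
    "Jan 01 03:10:05 mail auth: login ok user=amy rip=198.51.100.4",
    "Jan 01 03:10:09 mail auth: authentication failed user=root rip=192.0.2.55",
]

if __name__ == "__main__":
    # Print the noisiest source addresses first.
    for ip, hits in count_failures(SAMPLE).most_common():
        print(f"{ip:15}  {hits}")
```

To run it against a real file, pass the open file object to `count_failures` in place of `SAMPLE`. A handful of addresses with huge counts and a long tail of many addresses with a few failures each look quite different, and knowing which one you have helps decide whether blocking individual IPs is worth the effort.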
What say you, falko? Did we get the situational, strategic and tactical analysis and combat techniques right or did we screw up somewhere?
Thanks again for your comments, thoughts, insights and suggestions.