What's the best way to block bots from searching your website?
I have created a robots.txt file which looks like this:
Code:
User-agent: *
Disallow: /
Disallow: /cgi-bin/
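As a sanity check on those rules, here is a minimal sketch using Python's standard urllib.robotparser. It only shows how that particular parser reads the file, not how any given crawler behaves, and the test path is simply one copied from my access.log.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as they appear in my robots.txt.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /",
    "Disallow: /cgi-bin/",
]

def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if this parser thinks user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)
    return parser.can_fetch(user_agent, path)

# Path taken from one of the logged YandexBot requests.
print(is_allowed("YandexBot", "/phpbb/index.php"))
```

If this prints False, the file itself says "stay out", and whether a bot honors it is a separate matter.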
I have included the following in my index.html file:
Code:
<meta name="robots" content="NOINDEX, NOFOLLOW">
And I have also included an .htaccess file in my root which looks like this:
Code:
SetEnvIfNoCase User-Agent "^Yandex*" bad_bot
Order Deny,Allow
Deny from env=bad_bot
Yet I'm still seeing entries in Apache's access.log:
Code:
178.154.164.251 - - [10/Nov/2012:04:33:14 -0500] "GET /robots.txt HTTP/1.1" 200 324 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
178.154.164.251 - - [10/Nov/2012:04:33:14 -0500] "GET /phpbb/search.php?search_id=active_topics&sid=3a033d745efebc4ace615dd64e8f63f7 HTTP/1.1" 200 3735 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
178.154.164.251 - - [10/Nov/2012:04:33:17 -0500] "GET /phpbb/ucp.php?mode=login&sid=3a033d745efebc4ace615dd64e8f63f7 HTTP/1.1" 200 3513 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
66.249.76.173 - - [10/Nov/2012:06:05:11 -0500] "GET /robots.txt HTTP/1.1" 200 368 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
178.154.164.251 - - [10/Nov/2012:06:32:14 -0500] "GET /phpbb/index.php?sid=3a033d745efebc4ace615dd64e8f63f7 HTTP/1.1" 200 3908 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
123.125.71.74 - - [10/Nov/2012:06:35:02 -0500] "GET /robots.txt HTTP/1.1" 200 331 "-" "Mozilla/5.0 (Windows NT 5.1; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"
I have even added the IP address 178.154.164.251 to the inbound filter list on my router. The fact that I still see that address listed in my Apache logs suggests (at least to me) that Yandex isn't coming from that address.
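To see whether the .htaccess pattern can match at all, I tried it against the logged User-Agent string. This is only a sketch: it uses Python's re module rather than Apache's regex engine, and it assumes SetEnvIfNoCase does a case-insensitive search anywhere in the header value. The unanchored pattern is a hypothetical alternative for comparison, not something I have deployed.

```python
import re

# User-Agent string copied from the YandexBot entries in access.log.
LOGGED_UA = "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"

def pattern_matches(pattern: str, user_agent: str) -> bool:
    """Case-insensitive search, assumed to mirror SetEnvIfNoCase."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None

# Pattern from my .htaccess: "^" ties the match to the start of the string.
print(pattern_matches(r"^Yandex*", LOGGED_UA))

# Hypothetical alternative without the anchor, for comparison.
print(pattern_matches(r"Yandex", LOGGED_UA))
```

The logged string begins with "Mozilla/5.0", so the anchored pattern has nothing to match at the start, while the unanchored one finds "Yandex" inside "YandexBot".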
Thoughts anyone?