Robots attack revenue bank

Be very careful with the robots.txt file. If you manage the hosting of your own website, be sure that you know what this file is and how to use it.

The robots.txt file tells web spider bots either to enter the site and take note of what pages and content are available, or to take a hike and go pester some other website.

It can be helpful for keeping certain pages from being crawled, such as private information that you may not want shared. However, if it is not coded right, it could tell Google and every other search engine bot not to enter. If that happens, you could quickly be dropped from any rankings you had.

I’m not entirely sure what happened with my site. I was ranking third in Google for the keywords “internet grocery shopping dublin”, but then I fell off the rankings completely. Yahoo still kept me, and I still rank number one there :) I think an automatic upgrade to my website added the robots.txt file; it wasn’t previously in my site folder.

Below is some information that should be helpful:

To exclude all robots from the entire server

User-agent: *
Disallow: /

To allow all robots complete access

User-agent: *
Disallow:
(or just create an empty “/robots.txt” file, or don’t use one at all)
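
If you want to see how a crawler would read these two rule sets before uploading anything, Python's standard-library `urllib.robotparser` can parse them locally. This is only a rough sketch: the helper name `can_crawl` and the `example.com` address are made up for illustration, and real search engine bots may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets from the text, as they would appear in robots.txt.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if the given rules let user_agent fetch url.

    can_crawl is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    home = "https://example.com/"  # placeholder address
    print("block-all lets Googlebot in:", can_crawl(BLOCK_ALL, "Googlebot", home))
    print("allow-all lets Googlebot in:", can_crawl(ALLOW_ALL, "Googlebot", home))
```

The only difference between the two files is a single `/`, which is why an accidental edit can be so costly.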

To exclude all robots from part of the server

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
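
A quick way to check which pages a partial exclusion like this actually covers is to run a list of paths through Python's `urllib.robotparser`. The sketch below assumes some sample paths of my own invention, and `blocked_paths` is a hypothetical helper name. Each Disallow line is treated as a prefix, so `/junk` without the trailing slash is not blocked by `Disallow: /junk/`.

```python
from urllib.robotparser import RobotFileParser

# The partial-exclusion rules from the text.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def blocked_paths(robots_txt, paths, user_agent="*"):
    """Return the subset of paths that the rules close to user_agent.

    blocked_paths is a hypothetical helper name used for illustration.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    # Sample paths are invented for this example.
    sample = ["/", "/tmp/cache.html", "/cgi-bin/form.pl", "/products/milk.html", "/junk"]
    for path in blocked_paths(RULES, sample):
        print("blocked:", path)
```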

To exclude a single robot

User-agent: BadBot
Disallow: /

To allow a single robot

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
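
Per-robot rules are easy to get backwards, so it can help to test them against a few crawler names. The sketch below uses Python's `urllib.robotparser` and the token `Googlebot`, which is the name Google documents for its main crawler. `Bingbot` and `BadBot` are just sample names, and `allowed_agents` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# One robot gets full access; everyone else is shut out.
RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt, agents, url="/"):
    """Map each crawler name to True/False for fetching url.

    allowed_agents is a hypothetical helper name used for illustration.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    for agent, ok in allowed_agents(RULES, ["Googlebot", "Bingbot", "BadBot"]).items():
        print(agent, "allowed" if ok else "blocked")
```

A crawler uses the group that names it if there is one and only falls back to the `*` group otherwise, which is why the named group with an empty Disallow wins here.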

To exclude all files except one

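This is a bit awkward under the original robots.txt convention, which has no “Allow” field (many modern crawlers, including Googlebot, now accept one, but older bots may ignore it). The easy way that works everywhere is to put all files to be disallowed into a separate directory, say “stuff”, and leave the one file in the level above this directory: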

User-agent: *
Disallow: /~joe/stuff/

Alternatively, you can explicitly disallow each page:

User-agent: *
Disallow: /~joe/junk.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html
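
Although the original robots.txt convention has no Allow field, some parsers accept one as an extension, Python's standard-library `urllib.robotparser` among them. The sketch below shows how that variant behaves, under the assumption that the one file to keep is `/~joe/index.html` (a name I made up). Python's parser applies the first matching line, so the Allow line is placed before the Disallow line. Other crawlers may resolve conflicts differently or ignore Allow altogether.

```python
from urllib.robotparser import RobotFileParser

# Assumed variant of the example: keep one file open, close the rest of /~joe/.
# index.html is a hypothetical file name.
RULES = """\
User-agent: *
Allow: /~joe/index.html
Disallow: /~joe/
"""


def is_open(robots_txt, path, user_agent="*"):
    """Return True if path may be fetched under the given rules.

    is_open is a hypothetical helper name used for illustration.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ["/~joe/index.html", "/~joe/junk.html", "/~joe/foo.html"]:
        print(path, "open" if is_open(RULES, path) else "closed")
```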

While I’m on the subject: be careful when you change the URL of your website. If you have two different URLs pointing to the same site, Google can drop you from the rankings as well. A 301 (permanent) redirect from the old URL to the new one is what is needed. For more information on how to change the URL of your old site, please visit here.
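
To make the idea of a 301 redirect concrete, here is a minimal sketch using Python's built-in `http.server`. It answers every request on the old address with a 301 status and a Location header pointing at the same path on the new address. The new address `https://www.new-example.com` and the names `redirect_target` and `PermanentRedirect` are placeholders of mine. On most hosting setups you would configure the redirect in the web server or control panel instead of running a script like this.

```python
from http.server import BaseHTTPRequestHandler

# Placeholder for the new site address; replace with your own.
NEW_BASE_URL = "https://www.new-example.com"


def redirect_target(path, base=NEW_BASE_URL):
    """Build the new URL for a path requested on the old site."""
    return base.rstrip("/") + "/" + path.lstrip("/")


class PermanentRedirect(BaseHTTPRequestHandler):
    """Reply to every GET or HEAD request with a 301 to the new site."""

    def do_GET(self):
        self.send_response(301)  # 301 = moved permanently
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

    do_HEAD = do_GET


# To try it locally (port 8080 is an arbitrary choice):
#   from http.server import HTTPServer
#   HTTPServer(("", 8080), PermanentRedirect).serve_forever()
```

Keeping the path in the redirect means each old page points at its direct replacement instead of sending every visitor to the new home page.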
