Use ROBOTS.TXT to control search engine indexing
Posted on Friday, July 14, 2006 at 4:02 PM
A robot (also called a spider) is an automated software program that scans web pages (as well as newsgroups and other Internet structures), following links and collecting information. There are hundreds, if not thousands, of robots tirelessly scanning the Internet day and night. The result of their toil is often beneficial (they allow massive search engines like Google to exist). Sometimes their purposes are merely interesting (as with Internet mapping robots), and occasionally they are actually malicious (as with email harvesters).
So, how is the file formulated?
Here is a link to a free tool that allows you to create a robots.txt file for your domain. The file needs to be placed in the root of your domain, which means it must sit in the same place as your index page. Practically every site benefits from a robots.txt file. Well-behaved search engine robots read it before crawling your site, and you can use it to specify whether robots may spider your whole site, only parts of it, or none of it. So in general it's always good to have a file like that. To access the generator, click this link. Then copy and paste the result into a text file called robots.txt (all lowercase) and upload it to the root of your web site.
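If you would rather see what such a file looks like before using a generator, here is a minimal sketch in Python. The rules, the crawler name "BadBot", the /private/ folder and the www.example.com address are all made up for illustration; they are not taken from any particular generator. The script keeps a small robots.txt in a string and uses Python's standard urllib.robotparser module to check how a rule-following robot would interpret it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only:
# - a made-up crawler called "BadBot" is shut out of the whole site
# - every other robot is kept out of the /private/ folder
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robot that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def write_robots_txt(path: str = "robots.txt") -> None:
    """Save the rules to a file, ready to upload to the site root."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(ROBOTS_TXT)


if __name__ == "__main__":
    # www.example.com is a placeholder domain.
    for agent, url in [
        ("Googlebot", "http://www.example.com/index.html"),
        ("Googlebot", "http://www.example.com/private/notes.html"),
        ("BadBot", "http://www.example.com/index.html"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Keep in mind that robots.txt is only a request. Polite robots such as the big search engines follow it, but a malicious robot like an email harvester can simply ignore the file.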
