Jan 09 2008
SEG Bootcamp: Robots.txt File
Thursday, 10 January 2008

Robots.txt files are often mentioned as being an important foundation of a search friendly web site. To site owners and small businesses who are new to search marketing, the robots.txt file can sound daunting. In reality, it's one of the fastest, simplest ways to make your site just a little more search engine friendly.

(SEG Bootcamp articles are no-frills content designed to bring small business owners up to speed on the concepts and techniques needed to market their businesses online.)

What is Robots.txt?

Robots.txt is a simple text file that sits on the server with your web site. It's basically your web site's way of giving instructions to search engines about how they crawl and index your web site.

Search engines tend to look for the robots.txt file when they first visit a site. They can visit and index your site whether you have a robots.txt file or not; having one simply helps them along the way.

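All of the major search engines read and follow the instructions in a robots.txt file. That makes it a reasonably effective way to keep their crawlers away from content you don't want indexed, although a blocked address can still turn up in search results if other sites link to it.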

A word of warning. While some sites will tell you to use robots.txt to block premium content you don't want people to see, this isn't a good idea. While most search engines will respect your robots.txt file and ignore the content you want to have blocked, a far safer option is to hide that premium content behind a login. Requiring a username and password to access the content you want hidden from the public will do a much more effective job of keeping both search engines and people out.

What Does Robots.txt Look Like?

The average robots.txt file is one of the simplest pieces of code you'll ever write or edit.

If you want to have a robots.txt file for the engines to visit, but don't want to give them any special instructions, simply open up a text editor and type in the following:

User-Agent: *
Disallow:

The "User-Agent" part specifies which search engines you are giving the directions to. Using the asterisk means you are giving directions to ALL search engines.

The "disallow" part specifies what content you don't want the search engines to index. If you don't want to block the search engines from any area of your web site, you simply leave this area blank.

For most small web sites, those two simple lines are all you really need.
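
If you'd like to confirm what those two lines do, here is a minimal sketch using the robots.txt parser that ships with Python (urllib.robotparser). It is only one parser, so treat it as a sanity check rather than a guarantee of how any particular search engine behaves. The helper name is_allowed and the page addresses are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two-line "no special instructions" file from the article.
ALLOW_ALL = """\
User-Agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page addresses, used only for illustration.
print(is_allowed(ALLOW_ALL, "googlebot", "http://www.yourdomain.com/"))
print(is_allowed(ALLOW_ALL, "slurp", "http://www.yourdomain.com/print-ready/page.html"))
```

Both checks should report True, because a blank "Disallow" line blocks nothing.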

If your web site is a little bit larger, or you have a lot of folders on your server, you may want to use the robots.txt file to give some instructions about which content to avoid.

A good example of this would be a site that has printer-friendly versions of all of its content housed in a folder called "print-ready." There's no reason for the search engines to index both forms of the content, so it's a good idea to go ahead and block the engines from indexing the printer-friendly versions.

In this case, you'd leave the "user-agent" section alone, but would add the print-ready folder to the "disallow" line. That robots.txt file would look like this:

User-Agent: *
Disallow: /print-ready/

It's important to note the forward slashes before and after the folder name. The search engines will tack that folder on to the end of the domain name they are visiting.
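
To see why the slashes matter, the sketch below feeds the print-ready rule to Python's built-in urllib.robotparser and tests a few addresses. The file and folder names (article.html, print-ready-archive) are hypothetical, and real search engine robots may match rules slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

BLOCK_PRINT_READY = """\
User-Agent: *
Disallow: /print-ready/
"""

parser = RobotFileParser()
parser.parse(BLOCK_PRINT_READY.splitlines())


def blocked(path: str) -> bool:
    """Return True if a robot obeying the rules should skip this path."""
    return not parser.can_fetch("*", "http://www.yourdomain.com" + path)


# Hypothetical paths: one inside the folder, one outside it, and one in a
# different folder whose name merely starts with the same characters.
for path in ["/print-ready/article.html",
             "/article.html",
             "/print-ready-archive/old.html"]:
    print(path, "blocked" if blocked(path) else "allowed")
```

Only the first path is blocked. The trailing slash keeps the rule from also catching a differently named folder such as /print-ready-archive/.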

In other words, a rule of /print-ready/ refers to the folder found at www.yourdomain.com/print-ready/. If the folder is actually found at www.yourdomain.com/css/print-ready/, you'll need to format your robots.txt this way:

User-Agent: *
Disallow: /css/print-ready/
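
A quick way to convince yourself that the full path is needed is to test both versions of the rule against a page in the nested folder. This is a rough sketch using Python's urllib.robotparser; the helper can_crawl and the page article.html are invented for the example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(disallow_path: str, url: str) -> bool:
    """Check url against a one-rule robots.txt that disallows disallow_path."""
    parser = RobotFileParser()
    parser.parse(["User-Agent: *", f"Disallow: {disallow_path}"])
    return parser.can_fetch("*", url)


# Hypothetical printer-friendly page inside the nested folder.
page = "http://www.yourdomain.com/css/print-ready/article.html"

print(can_crawl("/print-ready/", page))      # rule points at the wrong folder
print(can_crawl("/css/print-ready/", page))  # rule matches the real location
```

The first check comes back True (the page is still crawlable) and the second comes back False, because rules are matched from the start of the path that follows the domain name.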

You can also edit the "user-agent" line to refer to specific search engines. To do this, you'll need to look up the name of a search engine's robot. (For instance, Google's robot is called "googlebot" and Yahoo's is called "slurp.")

If you want to set up your robots.txt file to give instructions ONLY to Google, you would format it like this:

User-Agent: googlebot
Disallow: /css/print-ready/
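
The effect of naming a single robot can be checked the same way. The sketch below uses Python's urllib.robotparser with the robot names mentioned above; the page address is hypothetical, and this parser's name matching is an approximation of what the real robots do.

```python
from urllib.robotparser import RobotFileParser

GOOGLE_ONLY = """\
User-Agent: googlebot
Disallow: /css/print-ready/
"""

parser = RobotFileParser()
parser.parse(GOOGLE_ONLY.splitlines())

# Hypothetical printer-friendly page.
page = "http://www.yourdomain.com/css/print-ready/article.html"

for robot in ["googlebot", "slurp"]:
    verdict = "may crawl" if parser.can_fetch(robot, page) else "should skip"
    print(f"{robot} {verdict} {page}")
```

Here googlebot is told to skip the page, while slurp gets no instructions at all and is free to crawl it. If you want other engines to stay out too, they need their own section or a "User-Agent: *" section.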

How Do I Put Robots.txt on My Site?

Once you've written your robots.txt file to reflect the directions you want to give the search engines, you simply save the text file as "robots.txt" and upload it to the root folder of your web site.

It's that simple.
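
If you'd rather script the save step, something like the sketch below would do it. It assumes you keep a local copy of your site in a folder and upload it separately; the upload itself (FTP, your host's file manager, and so on) is not shown. The function names and the example rule are made up for illustration.

```python
from pathlib import Path
from urllib.parse import urljoin

# Example content; swap in whatever rules you settled on.
ROBOTS_TXT = "User-Agent: *\nDisallow: /print-ready/\n"


def write_robots_txt(site_root: Path, content: str = ROBOTS_TXT) -> Path:
    """Save content as robots.txt in the top-level folder of a local site copy."""
    target = site_root / "robots.txt"
    target.write_text(content, encoding="utf-8")
    return target


def robots_url(any_page_url: str) -> str:
    """Return the address where robots look for the file on that site."""
    return urljoin(any_page_url, "/robots.txt")


# Whatever page a robot starts from, it looks for the file at the site root.
print(robots_url("http://www.yourdomain.com/blog/post.html"))
```

The second function is a reminder of why the root folder matters: robots look for the file at the top of the site, so a robots.txt saved inside a subfolder will not be found.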

Want to learn more? Check out these resources:

Official Google Blog: Controlling How Search Engines Access and Index Your Website

The Web Robots Page


Read more at: http://www.searchengineguide.com/jennifer-laycock/seg-bootcamp-robotstxt-file.php.