Robots.txt and SEO

by Vickram H 2013-08-27 16:45:08

What Is Robots.txt?

The Robots Exclusion Protocol (REP) is a group of web standards that regulate how web robots crawl websites and how search engines index their content. A robots.txt file is a plain-text file placed at the root of a domain (for example, http://www.example.com/robots.txt); compliant crawlers fetch it and obey its rules before crawling the site.
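
As an illustration, a minimal robots.txt that grants every crawler full access looks like this (an empty Disallow value blocks nothing):

User-agent: *
Disallow: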

Pattern Matching:

Google and Bing both honor two pattern-matching characters that can be used to identify pages or sub-folders that an SEO wants excluded:

* = a wildcard that matches any sequence of characters
$ = matches the end of the URL
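
For example, combining the two, a rule like the one below would block every URL ending in .pdf for all crawlers (the .pdf pattern here is only illustrative):

User-agent: *
Disallow: /*.pdf$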

Block all web crawlers from all content:

User-agent: *
Disallow: /

Block a specific web crawler from a specific folder:

User-agent: Googlebot
Disallow: /no-google/

Block a specific web crawler from a specific web page:

User-agent: Googlebot
Disallow: /no-google/blocked-page.html

Allow a specific web crawler to visit a specific web page:

User-agent: *
Disallow: /no-bots/block-all-bots-except-rogerbot-page.html

User-agent: rogerbot
Allow: /no-bots/block-all-bots-except-rogerbot-page.html

Crawlers obey the most specific User-agent group that matches them, so rogerbot follows its own Allow rule while every other bot falls back to the general Disallow.

Sitemap Parameter:

The Sitemap directive stands apart from any User-agent group and points crawlers to an XML sitemap that is not at the default location:

User-agent: *
Disallow:

Sitemap: http://www.example.com/none-standard-location/sitemap.xml
