We will discuss robots.txt tips and a tutorial in this post. Use the power of the robots.txt file to guide and control search engine crawlers (spiders or robots). Your website should have a robots.txt file located at the root of your website so that it can be accessed as http://redefineinfotech.com/robots.txt or http://www.redefineinfotech.com/robots.txt .
How to create/generate a robots.txt file
You can simply create a basic robots.txt file by hand, or you can generate it through Google Webmaster Tools.
1. Open a new text file using Notepad.
2. Write the following code. The code below allows all website pages to be crawled.
3. Click Save and use the file name robots.txt.
4. Upload this file to your website's root folder.
5. Browse this file using the path http://www.example.com/robots.txt or http://example.com/robots.txt, whichever is your preferred web address.
6. Now test your robots.txt file in Google Webmaster Tools.
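The allow-all rules referred to in step 2 can be written as follows. An empty Disallow value tells every crawler (matched by `User-agent: *`) that nothing is blocked:

```
User-agent: *
Disallow:
```

Save exactly this content in the robots.txt file before uploading it to your root folder.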
If you encounter a 500 or 404 error when accessing this file, contact your webmaster or website developer.
Robots.txt Tips: How to control search engine crawlers?
Allow all webpages for crawling
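To allow all webpages, leave the Disallow directive empty:

```
User-agent: *
Disallow:
```

Some crawlers, including Googlebot, also understand the equivalent `Allow: /`, though an empty Disallow is the most widely supported form.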
Disallow specific path or folder for crawling
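To block a specific folder or page, list it in a Disallow directive. The folder and page names below are placeholders; substitute your own paths:

```
User-agent: *
Disallow: /admin/
Disallow: /private-page.html
```

Note that the trailing slash on `/admin/` blocks everything inside that folder, while a path without a slash matches any URL beginning with that string.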
Robots.txt Wildcard Matching
You can use wildcard patterns to disallow URLs containing query strings or ending in specific extensions.
Disallow all URLs with query string
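The `*` wildcard matches any sequence of characters, so the following pattern blocks every URL that contains a `?` (i.e., any URL with a query string):

```
User-agent: *
Disallow: /*?
```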
Disallow all URLs that end with .asp
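The `$` sign anchors the pattern to the end of the URL, so this rule blocks only URLs that actually end in .asp. Note that `*` and `$` are extensions supported by major crawlers such as Googlebot and Bingbot, not part of the original robots.txt standard:

```
User-agent: *
Disallow: /*.asp$
```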
Robots.txt Advanced Tips
If you have a very large website, you can use the crawl-delay directive so that crawlers do not harm your website's performance. Alternatively, you can set your website's crawl rate in Google Webmaster Tools.
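A crawl-delay rule asks crawlers to wait a number of seconds between requests. The 10-second value below is only an example; choose a delay that suits your server. Be aware that Bing and Yandex honor this directive, while Googlebot ignores it (for Google, set the crawl rate in Google Webmaster Tools instead):

```
User-agent: *
Crawl-delay: 10
```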