Robots.txt Tester
Validate your robots.txt file, check for syntax errors, and view crawl rules.
Why validate robots.txt?
- Prevent accidental de-indexing of critical pages
- Ensure XML sitemaps are discoverable by bots
- Block crawling of sensitive admin or testing areas
- Review crawl budget optimization rules
- Fix syntax errors that bots might misinterpret
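A minimal robots.txt covering the points above might look like the following sketch (the /admin/ and /staging/ paths and the domain are placeholders):

    User-agent: *
    Disallow: /admin/
    Disallow: /staging/

    Sitemap: https://example.com/sitemap.xml

Everything else stays crawlable, the admin and staging areas are blocked, and the sitemap location is advertised to every bot.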
Frequently Asked Questions
What is a robots.txt file?
It is a text file placed in the root directory of your website (e.g., example.com/robots.txt) that gives instructions to web crawlers about which pages they can or cannot crawl.
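To see how a crawler interprets those instructions, Python's standard urllib.robotparser module can parse robots.txt rules and answer allow/deny questions; the rules and URLs below are illustrative placeholders:

    import urllib.robotparser

    # Parse the same rules a bot would fetch from example.com/robots.txt
    rp = urllib.robotparser.RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: /admin/",
    ])

    print(rp.can_fetch("*", "https://example.com/admin/login"))  # False: blocked
    print(rp.can_fetch("*", "https://example.com/blog/post"))    # True: allowed

This mirrors the decision a well-behaved crawler makes before requesting a page.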
Does Disallow mean 'No Index'?
No. 'Disallow' only blocks crawling. If a page has external links pointing to it, Google may still index the URL without its content. To keep a page out of the index entirely, use a 'noindex' meta tag (or X-Robots-Tag header) and leave the page crawlable so search engines can actually see that directive.
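To illustrate the difference with placeholder paths: blocking crawling is a robots.txt rule,

    User-agent: *
    Disallow: /private/

while blocking indexing is a directive on the page itself:

    <meta name="robots" content="noindex">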
How do I specify my sitemap location?
Add a Sitemap line to your robots.txt file (commonly placed at the bottom): Sitemap: https://yourdomain.com/sitemap.xml
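The Sitemap directive is independent of any User-agent group, should use an absolute URL, and may appear more than once, for example (the filenames are placeholders):

    Sitemap: https://yourdomain.com/sitemap.xml
    Sitemap: https://yourdomain.com/news-sitemap.xml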
What does User-agent: * mean?
The asterisk (*) is a wildcard meaning 'all web crawlers'. Rules in that group apply to every bot (Googlebot, Bingbot, etc.) unless a more specific User-agent group matches a given bot, in which case that bot follows only its own, more specific group.
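For example, in the sketch below (paths are placeholders) Googlebot follows only its own group and ignores the * group, while every other bot follows the * group:

    User-agent: *
    Disallow: /search/

    User-agent: Googlebot
    Disallow: /search/
    Disallow: /beta/

Because a bot obeys only the most specific matching group, the /search/ rule has to be repeated in the Googlebot group to keep it blocked for Googlebot as well.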
What should I do if my robots.txt returns HTML?
This usually means the file doesn't exist and your server is returning its HTML 404 error page instead. Create a plain text file named 'robots.txt', upload it to your site's root directory, and make sure the server returns it with a text/plain content type.
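If you are not sure what your server actually returns, a quick check with Python's standard library (replace example.com with your own domain) prints the status code, content type, and the start of the response; a proper robots.txt should come back as 200 with a text/plain content type:

    import urllib.error
    import urllib.request

    # Placeholder domain: swap in your own site
    url = "https://example.com/robots.txt"
    try:
        with urllib.request.urlopen(url) as resp:
            print("Status:", resp.status)
            print("Content-Type:", resp.headers.get("Content-Type"))
            print(resp.read(500).decode("utf-8", errors="replace"))
    except urllib.error.HTTPError as err:
        print("HTTP error:", err.code)

If the output shows text/html or an HTML document, the server is serving an error or placeholder page rather than your robots.txt.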