Robots.txt Tester

Validate your robots.txt file, check for syntax errors, and view crawl rules.

Why validate robots.txt?

  • Prevent accidental de-indexing of critical pages
  • Ensure XML sitemaps are discoverable by bots
  • Block crawling of sensitive admin or testing areas
  • Review rules that optimize crawl budget by steering bots away from low-value URLs
  • Fix syntax errors that bots might misinterpret

Frequently Asked Questions

What is a robots.txt file?

It is a text file placed in the root directory of your website (e.g., example.com/robots.txt) that gives instructions to web crawlers about which pages they can or cannot crawl.
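For example, a minimal robots.txt might look like the following (the /admin/ and /tmp/ paths are placeholders; substitute the directories you actually want to keep crawlers out of):

    User-agent: *
    Disallow: /admin/
    Disallow: /tmp/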

Does Disallow mean 'No Index'?

No. 'Disallow' only blocks crawling. If a page has external links pointing to it, Google may still index the URL without its content. To prevent indexing completely, use the 'noindex' meta tag, and leave the page crawlable so bots can actually read the tag.
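For reference, the robots meta tag is placed in the page's <head> section:

    <meta name="robots" content="noindex">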

How do I specify my sitemap location?

Add a Sitemap line to your robots.txt file (it is independent of any user-agent group and can appear anywhere in the file, though the end is the usual convention): Sitemap: https://yourdomain.com/sitemap.xml
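For instance (yourdomain.com is a placeholder; the sitemap location should be an absolute URL):

    User-agent: *
    Disallow:

    Sitemap: https://yourdomain.com/sitemap.xml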

What does User-agent: * mean?

The asterisk (*) is a wildcard that represents 'all web crawlers'. Rules following this directive apply to every bot (Googlebot, Bingbot, etc.) unless a more specific user-agent is defined.
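As an illustration (the /private/ path is a placeholder), the following blocks all bots from /private/ while giving Googlebot its own, more specific group with an empty Disallow, meaning Googlebot may crawl everything; a crawler follows only the group that matches it most specifically:

    User-agent: *
    Disallow: /private/

    User-agent: Googlebot
    Disallow: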

What should I do if my robots.txt returns HTML?

This usually means the file doesn't exist, and your server is returning a custom 404 error page. You should create a plain text file named 'robots.txt' and upload it to your root web directory.
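A quick way to see what your server is actually returning is a HEAD request with curl (replace yourdomain.com with your own domain); a healthy robots.txt normally responds with a 200 status and a text/plain Content-Type rather than text/html:

    curl -I https://yourdomain.com/robots.txt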