Fetch and parse any domain's robots.txt file, then test whether Googlebot is allowed or blocked on any specific URL path. See every user-agent rule, crawl-delay, and sitemap reference in one place.
🤖 Try the free Robots.txt Checker →
A robots.txt checker fetches the robots.txt file from any domain, parses its user-agent rules, and evaluates whether a given URL path is allowed or blocked for specific crawlers, particularly Googlebot. It helps you verify that your robots.txt is working as intended, and that you haven't accidentally blocked pages you want indexed.
The robots.txt file sits at the root of a domain (/robots.txt) and is the first instruction set that well-behaved search engine crawlers read before crawling any page. Mistakes here, such as a single extra slash or a wildcard applied too broadly, can stop Google from crawling entire sections of your site. A robots.txt checker makes these rules visible and testable before they cause ranking damage.
"Disallow: /" blocks every crawler from crawling every page. This is the most catastrophic robots.txt mistake and is surprisingly common — often the result of a developer testing on production with a CMS that generates this during maintenance mode, then never removing it.
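Google applies the most specific matching rule for a given URL, meaning the rule with the longest matching path, not the first rule it encounters. If an Allow and a Disallow rule match with equal length, Google uses the less restrictive one (Allow). A common source of confusion is adding "Allow: /api/public" to override an earlier "Disallow: /api": under Google's logic the longer Allow rule wins for URLs that start with /api/public regardless of order, but crawlers that evaluate rules in file order may behave differently, so the result can be unexpected.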
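To make the specificity logic concrete, here is a minimal sketch of longest-match evaluation for plain path prefixes. The function name `is_allowed` and the `(directive, prefix)` tuple format are assumptions made for illustration, and wildcards are deliberately left out.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under longest-match precedence.

    `rules` is a list of (directive, prefix) pairs for one user-agent group,
    for example ("disallow", "/api").
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty rules and non-matching prefixes are skipped
        is_allow = directive.lower() == "allow"
        # A longer prefix wins; on a tie, Allow (least restrictive) wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [("disallow", "/api"), ("allow", "/api/public")]
print(is_allowed(rules, "/api/public/docs"))  # True
print(is_allowed(rules, "/api/private"))      # False
```

Because the comparison is by prefix length, swapping the order of the two rules gives the same verdicts.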
A critical distinction: blocking a URL in robots.txt prevents Googlebot from crawling it, but Google can still index a URL it cannot crawl if other pages link to it. The URL may then appear in search results without a snippet, showing only the URL and a title inferred from sources such as link anchor text. To prevent indexing, you need a noindex directive on the page itself, and the page must remain crawlable so that Googlebot can read that directive.
The Crawl-delay directive asks crawlers to wait N seconds between requests. While helpful for preventing server overload, a high crawl-delay (more than 10 seconds) significantly slows crawling of large sites by the bots that honor it. Note: Google officially ignores the Crawl-delay directive, so it has no effect on Googlebot, but Bing and some other crawlers respect it.
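As a rough illustration, Python's standard-library `urllib.robotparser` can read Crawl-delay values per user-agent. The sample file and the `is_high_delay` helper are assumptions for this sketch; the 10-second threshold mirrors the figure mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for this example.
SAMPLE = """\
User-agent: bingbot
Crawl-delay: 10

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

bing_delay = parser.crawl_delay("bingbot")        # 10
other_delay = parser.crawl_delay("SomeOtherBot")  # None: no delay declared


def is_high_delay(delay, threshold=10):
    """Flag delays above `threshold` seconds; None means no delay is set."""
    return delay is not None and delay > threshold


print(bing_delay, is_high_delay(bing_delay))
print(other_delay, is_high_delay(other_delay))
```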
Type your domain (e.g., https://yoursite.com). The tool auto-fetches the robots.txt file from the standard location.
If you want to check whether Googlebot is allowed to crawl a specific page, enter the path (e.g., /blog/my-post or /admin/settings). The checker evaluates all user-agent rules to give a clear allowed/blocked verdict for that path.
The parsed robots.txt is displayed in a readable format, grouped by user-agent. You can see which bots have specific rules, what paths are allowed and disallowed, and any crawl-delay settings.
Well-configured robots.txt files include "Sitemap:" directives pointing crawlers to your XML sitemaps. Verify these are present and pointing to valid, accessible sitemap files.
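The steps above can be approximated with Python's standard library. This is a sketch, not how the checker itself is implemented: `robots_url` and `summarize` are hypothetical helper names, the sample file is invented, and `site_maps()` requires Python 3.8 or later. Note also that `urllib.robotparser` applies rules in file order and does not expand wildcards, so its verdicts can differ from Google's on complex files.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url(site):
    """Build the standard robots.txt location for a domain or page URL."""
    parts = urlsplit(site if "://" in site else "https://" + site)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def summarize(robots_text, user_agent, site, path):
    """Parse robots.txt text and report the verdict for one path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {
        "allowed": parser.can_fetch(user_agent, site.rstrip("/") + path),
        "crawl_delay": parser.crawl_delay(user_agent),
        "sitemaps": parser.site_maps() or [],
    }


# Hypothetical file content; a real check would download robots_url(site).
SAMPLE = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

print(robots_url("https://yoursite.com/blog/my-post"))
print(summarize(SAMPLE, "Googlebot", "https://example.com", "/admin/settings"))
print(summarize(SAMPLE, "Googlebot", "https://example.com", "/blog/my-post"))
```

To work against a live site instead of sample text, `RobotFileParser` also offers `set_url()` and `read()`, which fetch and parse the file in one step.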
Yes, robots.txt supports two wildcards: * matches any sequence of characters, and $ matches the end of a URL. For example, "Disallow: /*.pdf$" blocks all URLs ending in .pdf. "Disallow: /wp-admin/" blocks /wp-admin/ and all sub-paths. Note that different crawlers implement wildcard support differently — Google and Bing support both wildcards, while some minor crawlers only support basic path matching.
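One way to reason about the two wildcards is to translate a robots.txt path pattern into a regular expression. The helpers below (`pattern_to_regex` and `matches`, both hypothetical names) are a minimal sketch of that idea and assume the pattern is matched from the start of the URL path.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape everything literally, then turn each escaped "*" into ".*".
    regex = re.escape(body).replace(r"\*", ".*")
    return re.compile("^" + regex + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if `path` is covered by the robots.txt `pattern`."""
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*.pdf$", "/files/report.pdf"))             # True
print(matches("/*.pdf$", "/files/report.pdf?download=1"))  # False
print(matches("/wp-admin/", "/wp-admin/options.php"))      # True
```

The second call shows a practical consequence of `$`: a URL with a query string after `.pdf` no longer ends in `.pdf`, so the rule does not cover it.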
robots.txt controls crawling — it tells Googlebot whether to fetch the page. A noindex meta tag controls indexing — it tells Google not to show the page in search results even after crawling it. To fully prevent a page from appearing in search, you need a noindex directive. If you use robots.txt to block a page, Google cannot read the noindex tag, so the page might still appear as a URL-only listing from links.
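The interaction between the two mechanisms can be sketched in code. The example below looks for a robots meta tag in a page's HTML and combines that with a crawl verdict; `RobotsMetaFinder`, `has_noindex`, and `indexing_outlook` are hypothetical names, and the returned strings are just illustrative labels.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(page_html):
    """Return True if the page carries a noindex meta directive."""
    finder = RobotsMetaFinder()
    finder.feed(page_html)
    return "noindex" in finder.directives


def indexing_outlook(crawl_allowed, page_html):
    """Combine the robots.txt verdict with the page's meta directives."""
    if not crawl_allowed:
        # The crawler never fetches the page, so any noindex tag is unread.
        return "not crawled: noindex unread, URL-only listing possible"
    if has_noindex(page_html):
        return "crawled: noindex seen, excluded from results"
    return "crawled: eligible for indexing"


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(indexing_outlook(True, page))
print(indexing_outlook(False, page))
```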
Without a robots.txt file, the server returns a 404, and crawlers interpret this as "no restrictions" — meaning all pages are crawlable. This is generally fine, but means you have no crawl-level controls. It's recommended to have a robots.txt file even if it only contains a Sitemap: directive.
Yes. You can target rules at specific user-agents with the "User-agent:" directive. For example, "User-agent: Googlebot" applies only to Google's crawler, while "User-agent: *" applies to all crawlers that have no group of their own. A crawler follows only the most specific group that matches it: if a "User-agent: Googlebot" group exists, Googlebot ignores the "User-agent: *" group entirely rather than merging the two. This lets you, for example, allow Googlebot everywhere while blocking scrapers.
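Group selection can be sketched as picking the longest user-agent token that matches the crawler's name, falling back to the wildcard group. The function name `select_group`, the dictionary layout, and the prefix-based matching are simplifying assumptions for illustration; real crawlers may match tokens differently.

```python
def select_group(groups, crawler):
    """Return the rule list of the most specific group for `crawler`.

    `groups` maps a user-agent token (or "*") to that group's rules.
    """
    name = crawler.lower()
    candidates = [
        token for token in groups
        if token != "*" and name.startswith(token.lower())
    ]
    if candidates:
        # The longest matching token is treated as the most specific.
        return groups[max(candidates, key=len)]
    return groups.get("*", [])  # fall back to the wildcard group, if any


# Hypothetical setup: Googlebot allowed everywhere, everyone else blocked.
groups = {
    "*": [("disallow", "/")],
    "googlebot": [("allow", "/")],
}
print(select_group(groups, "Googlebot"))    # [('allow', '/')]
print(select_group(groups, "SomeScraper"))  # [('disallow', '/')]
```

Only one group is returned, so rules from the wildcard group are not inherited by a crawler that has its own group.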