Robots.txt Tester: Check Your Robots.txt File for Free

Check your robots.txt file to make sure bots can crawl the site properly

Results

Checked crawlers: All bots (*), AppleBot, AhrefsBot, Baiduspider, Bingbot, CCBot, ClaudeBot, DuckDuckBot, Googlebot (all), Googlebot-Image, GoogleOther, GPTBot, Meta-ExternalAgent, Moz dotbot, OAI-Searchbot, PerplexityBot, SemrushBot, Slurp (Yahoo), YandexBot

How to read a robots.txt file?

The User-agent directive identifies the specific crawler (or all web crawlers) that the rules following it apply to. Each search engine has its own bot: Google has Googlebot, Bing has Bingbot, and Yahoo! has Slurp. Most search engines also run separate crawlers for their regular index, ad programs, images, videos, and so on. The robots.txt validator shows which crawlers can or can't request your website content.
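
A minimal sketch of how User-agent groups are applied, using Python's standard urllib.robotparser module. The sample file and the example.com URLs are made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for Googlebot, one for every other bot.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot matches its own group, so only /drafts/ is off limits for it.
googlebot_blog = parser.can_fetch("Googlebot", "https://example.com/blog/")
googlebot_drafts = parser.can_fetch("Googlebot", "https://example.com/drafts/post")
# Bingbot has no group of its own and falls back to the * group.
bingbot_blog = parser.can_fetch("Bingbot", "https://example.com/blog/")

print(googlebot_blog, googlebot_drafts, bingbot_blog)  # True False False
```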

The Allow directive specifies the website files, categories, and pages that the designated crawlers may access. When no path is specified, the directive is ignored. It is used to counteract the Disallow directive, for example to allow access to a page or file inside a disallowed directory. The robots.txt tester shows which pages bots can access.
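
The interplay between Allow and Disallow can be sketched with Python's standard urllib.robotparser. The paths below are hypothetical. One caveat: this parser applies the first matching rule in file order, while Google documents that it applies the most specific (longest) matching rule. The Allow line is listed first here so that both readings give the same answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block /account/ but keep one help page crawlable.
ROBOTS_TXT = """\
User-agent: *
Allow: /account/help.html
Disallow: /account/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

help_page = parser.can_fetch("*", "https://example.com/account/help.html")
settings_page = parser.can_fetch("*", "https://example.com/account/settings")

print(help_page, settings_page)  # True False
```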

The Disallow directive is added to robots.txt to prevent search engines from crawling specific website files and URLs. You can disallow internal and service files, for example, a folder holding the data users submit during registration. The tool shows which of the entered pages are not allowed for crawling.

How to use our online Robots.txt Tester?

We created the robots.txt tester so that everyone can quickly check their file. To use the tool, paste the URLs you want to test into the input field and click Check your robots.txt. The tool then reports whether each page is allowed or blocked from crawling: a blocked URL is highlighted in red, and a URL that bots are allowed to crawl is highlighted in green.

FAQ

Q

Why is a robots.txt file necessary?

A

Robots.txt files provide search engines with important information about crawling files and web pages. This file is used primarily to manage crawler traffic to your website in order to avoid overloading your site with requests.

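However, if you want to prevent a page or another digital asset from appearing in Google Search, a more reliable option is to add the noindex directive to the robots meta tag. The page must stay crawlable for this to work, because a crawler that is blocked by robots.txt never sees the tag.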
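
As a rough illustration, the sketch below checks whether a page's HTML carries a noindex value in its robots meta tag, using only Python's standard html.parser. The class name, function name, and sample page are hypothetical, and the check covers only the meta tag, not HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the values found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for token in attributes.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html):
    """Return True if the robots meta tag asks search engines not to index."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page used only for this example.
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(PAGE))  # True
```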

Q

How to make sure robots.txt is working fine?

A

A quick and easy way to make sure your robots.txt file works as intended is to run it through a testing tool.

For example, you can validate your robots.txt by using our tool: enter up to 100 URLs and it will show you whether the file blocks crawlers from accessing specific URLs on your site.
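
A batch check of this kind can be approximated locally. The sketch below is not the tool's implementation; it uses Python's standard urllib.robotparser, and the function name, sample rules, and URLs are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser


def check_urls(robots_txt, urls, user_agent="*"):
    """Map each URL to True (allowed) or False (blocked) for one user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


# Hypothetical rules and URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

results = check_urls(
    ROBOTS_TXT,
    [
        "https://example.com/",
        "https://example.com/cart/checkout",
        "https://example.com/search?q=shoes",
    ],
)
for url, allowed in results.items():
    print("allowed" if allowed else "blocked", url)
```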

To quickly detect errors in the robots.txt file, you can also use Google Search Console.

Q

Robots.txt best practices

A

Use the proper case in robots.txt. Bots treat folder and section names as case-sensitive, so a rule written for /Folder/ does not match /folder/, and vice versa. Write each path exactly as it appears in your URLs.
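
A short sketch of the case-sensitivity point, using Python's standard urllib.robotparser. The /Private/ folder and the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule written with a capital P.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Same folder name, different case: only the exact-case path is blocked.
capitalized = parser.can_fetch("*", "https://example.com/Private/report.pdf")
lowercase = parser.can_fetch("*", "https://example.com/private/report.pdf")

print(capitalized, lowercase)  # False True
```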

Each directive must begin on a new line. There can only be one parameter per line.

Avoid spaces at the beginning of a line, and do not use quotation marks or semicolons in directives.
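
The formatting rules above lend themselves to a simple automated check. The following is a minimal sketch of such a linter, not a complete validator. The function name, the problem labels, and the sample file are all made up for this example.

```python
def lint_robots_txt(text):
    """Return (line_number, problem) pairs for a few common syntax slips."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        body = line.split("#", 1)[0]  # ignore comments
        if not body.strip():
            continue
        if body[0] in " \t":
            problems.append((number, "leading whitespace"))
        if '"' in body or "'" in body:
            problems.append((number, "quotation mark"))
        if ";" in body:
            problems.append((number, "semicolon"))
        _, separator, value = body.partition(":")
        if not separator:
            problems.append((number, "missing colon"))
        elif len(value.split()) > 1:
            problems.append((number, "more than one value"))
    return problems


# Hypothetical file with one problem per rule.
SAMPLE = (
    "User-agent: *\n"
    " Disallow: /tmp/\n"
    'Disallow: "/private/";\n'
    "Disallow: /a/ /b/\n"
)

for number, problem in lint_robots_txt(SAMPLE):
    print(f"line {number}: {problem}")
```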

There is no need to list every file you want to block from crawlers. You just need to specify a folder or directory in the Disallow directive, and all of the files from these folders or directories will also be blocked from crawling.
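
To illustrate, a single Disallow line for a folder covers everything beneath it, because rules are matched as path prefixes. The sketch uses Python's standard urllib.robotparser; the folder and file names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: one line for the whole /downloads/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

paths = [
    "/downloads/manual.pdf",
    "/downloads/archive/2020/notes.txt",
    "/downloads-info.html",  # different prefix, so not covered by the rule
]
allowed = {path: parser.can_fetch("*", "https://example.com" + path) for path in paths}
print(allowed)
```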

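You can use wildcards to create robots.txt with more flexible instructions. Full regular expressions are not supported, but major crawlers such as Googlebot and Bingbot recognize two pattern characters: * matches any sequence of characters, and $ marks the end of the URL.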
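
Major crawlers such as Googlebot document two pattern characters for robots.txt paths, * and $, rather than full regular-expression syntax. Python's urllib.robotparser does not implement them, so the sketch below translates a pattern by hand. The function names and sample paths are hypothetical, and real crawlers may differ in edge cases.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def rule_matches(pattern, path):
    """Return True if the rule pattern applies to the given URL path."""
    return pattern_to_regex(pattern).match(path) is not None


print(rule_matches("/*.pdf$", "/docs/report.pdf"))             # True
print(rule_matches("/*.pdf$", "/docs/report.pdf?download=1"))  # False
print(rule_matches("/*?sessionid=", "/shop/item?sessionid=42"))  # True
```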

Use server-side authentication to block access to private content. Robots.txt is publicly readable and only asks crawlers to stay away, so it cannot by itself keep important data from being stolen.

Use one robots.txt file per domain. Crawlers treat each subdomain as a separate site, so if you need to set crawl guidelines for different sites or subdomains, create a separate robots.txt for each one.

Q

What should be in a robots.txt file?

A

A robots.txt file contains information that instructs crawlers on how to interact with a particular site. It starts with a User-agent directive that specifies the search bot to which the rules apply. Next come the directives that allow or block certain files and pages for that bot. At the end of the file, you can optionally add a Sitemap directive with a link to your sitemap.
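
The layout described here (a User-agent line, its rules, then an optional sitemap link) can be sketched as a small generator. The function name, the data shape, and the example.com sitemap URL are assumptions made for this example.

```python
def render_robots_txt(groups, sitemaps=()):
    """Build robots.txt text from (user_agent, [(directive, path), ...]) groups."""
    lines = []
    for user_agent, rules in groups:
        lines.append(f"User-agent: {user_agent}")
        for directive, path in rules:
            lines.append(f"{directive}: {path}")
        lines.append("")  # a blank line separates groups
    for url in sitemaps:
        lines.append(f"Sitemap: {url}")
    return "\n".join(lines).rstrip("\n") + "\n"


# Hypothetical site: block /admin/ for all bots and point to one sitemap.
robots_txt = render_robots_txt(
    [("*", [("Disallow", "/admin/")])],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(robots_txt)
```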

Q

How to open a robots.txt file?

A

To view the content of any website's robots.txt file, type https://yourwebsite/robots.txt into the browser's address bar.
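
Because the file always sits at /robots.txt in the root of the host, its address can be derived from any page URL. A minimal sketch using Python's standard urllib.parse; the function name and URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); replace the rest.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post?id=7"))
# https://example.com/robots.txt
print(robots_txt_url("https://shop.example.com:8443/cart"))
# https://shop.example.com:8443/robots.txt
```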

Q

Can bots ignore robots.txt?

A

Well-behaved crawlers check for a robots.txt file before crawling a website. Although the file provides rules for bots, it can't enforce them. The robots.txt file is a list of guidelines for crawlers, not strict rules. Therefore, some bots may ignore these directives.

Q

How to test if robots.txt is working properly?

A

You can check the robots.txt file with our tool. Just enter the URLs you want to test, and the tool shows whether each one is allowed or blocked from crawling.

Q

How do I fix robots.txt?

A

A robots.txt file is a plain text document. You can edit the current file in a text editor and then upload it again to the website's root directory. In addition, many CMSs, including WordPress, have plugins that let you edit the robots.txt file directly from the admin dashboard.

Q

Can robots.txt be redirected?

A

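Crawlers look for the file only at http://yourwebsite/robots.txt, in the root of the host, and do not check any other path. That URL can, however, redirect to the robots.txt file of another domain. Google, for example, documents that it follows at least five redirect hops before giving up and treating the file as not found.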

Q

Does Google respect robots.txt?

A

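When visiting a website, Google's crawlers first refer to the robots.txt file and follow its crawling guidelines. But robots.txt controls crawling, not indexing: a disallowed URL can still appear in search results if other pages link to it. Google also ignores directives it does not support, such as Crawl-delay.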