Robots.txt Introduction and Guide | Google Search Central | Documentation | Google for Developers

A robots.txt file tells search engine crawlers which URLs the crawler can access on your site. This is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google. To keep a web page out of Google, block indexing with noindex or password-protect the page.
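As a sketch of the noindex approach mentioned above: for an HTML page, a robots meta tag in the page's head asks search engines not to index it, and for non-HTML files the equivalent is an X-Robots-Tag HTTP response header. Note that the page must remain crawlable for this to work; if robots.txt blocks the URL, the crawler never sees the noindex rule.

```
<!-- In the page's <head>: allow crawling, but keep the page out of search results -->
<meta name="robots" content="noindex">
```

For a PDF or other non-HTML file, the server can send the header `X-Robots-Tag: noindex` instead.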

What is a robots.txt file used for?

A robots.txt file is used primarily to manage crawler traffic to your site, and usually to keep a file off Google, depending on the file type:

robots.txt effect on different file types:

Web page: You can use a robots.txt file for web pages (HTML, PDF, or other non-media formats that Google can read) to manage crawling traffic if you think your server will be overwhelmed by requests from Google's crawler, or to avoid crawling unimportant or similar pages on your site. If your web page is blocked with a robots.txt file, its URL can still appear in search results, but the search result will not have a description. Image files, video files, PDFs, and other non-HTML files embedded in the blocked page will be excluded from crawling, too, unless they're referenced by other pages that are allowed for crawling. If you see this kind of search result for your page and want to fix it, remove the robots.txt entry blocking the page. If you want to hide the page completely from Search, use another method.

Media file: Use a robots.txt file to manage crawl traffic, and also to prevent image, video, and audio files from appearing in Google search results. This won't prevent other pages or users from linking to your image, video, or audio file. Read more about preventing images from appearing on Google, and about how to remove or restrict your video files from appearing on Google.

Resource file: You can use a robots.txt file to block resource files such as unimportant image, script, or style files if you think that pages loaded without these resources will not be significantly affected by the loss. However, if the absence of these resources makes the page harder for Google's crawler to understand, don't block them, or Google won't do a good job of analyzing pages that depend on those resources.
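The three cases above can be sketched in a single robots.txt file. The paths and file names here are hypothetical placeholders, not recommendations; rules apply to the user agent group they appear under.

```
# Web pages: keep Google's crawler out of a low-value section of the site
User-agent: Googlebot
Disallow: /archive/

# Media files: keep a specific image out of Google Images results
User-agent: Googlebot-Image
Disallow: /images/photo.jpg

# Resource files: block a script only if pages render fine without it
User-agent: *
Disallow: /scripts/decorative-effects.js
```

The file lives at the root of the host it governs (for example, https://example.com/robots.txt) and only applies to URLs on that host and protocol.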

Understand the limitations of a robots.txt file

Before you create or edit a robots.txt file, you should know the limits of this URL-blocking method: robots.txt rules are directives that well-behaved crawlers choose to obey, not an enforcement mechanism, and a blocked URL can still be indexed if other sites link to it. Depending on your goals and situation, you might want to consider other mechanisms to ensure your URLs are not findable on the web.
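One way to see what a robots.txt file does (and does not) control is to evaluate it with Python's standard-library parser. This is a minimal sketch using hypothetical rules; a real check would fetch the live robots.txt from the site's root instead of parsing an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
rules = """
User-agent: *
Disallow: /private/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# robots.txt only asks compliant crawlers not to fetch these URLs;
# it does not hide them from the web or keep them out of the index.
print(rp.can_fetch("MyCrawler", "https://example.com/private/report.html"))  # False
print(rp.can_fetch("MyCrawler", "https://example.com/index.html"))           # True
```

A URL that `can_fetch` rejects is still reachable by browsers and by crawlers that ignore robots.txt, which is why the document recommends noindex or password protection for pages that must stay out of Search.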

Create or update a robots.txt file

If you've decided that you need one, learn how to create a robots.txt file. Or if you already have one, learn how to update it.
