Recently, one of our readers asked us for tips on how to optimize the robots.txt file to improve SEO. The robots.txt file tells search engines how to crawl your website, which makes it an incredibly powerful SEO tool. In this article, we will show you how to create a perfect robots.txt file for SEO.
What Is a Robots.txt File?
Robots.txt is a text file that website owners can create to tell search engine bots how to crawl and index pages on their site.
It is typically stored in the root directory, also known as the main folder, of your website. The basic format for a robots.txt file looks like this:
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]

User-agent: [user-agent name]
Allow: [URL string to be crawled]

Sitemap: [URL of your XML Sitemap]
You can have multiple lines of instructions to allow or disallow specific URLs and add multiple sitemaps. If you do not disallow a URL, then search engine bots assume that they are allowed to crawl it.
Here is what a robots.txt example file can look like:
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml
In the above robots.txt example, we have allowed search engines to crawl and index files in our WordPress uploads folder.
After that, we have disallowed search bots from crawling and indexing plugins and WordPress admin folders.
Lastly, we have provided the URL of our XML sitemap.
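As a rough illustration of how a crawler might read the example above, here is a minimal sketch using Python's standard `urllib.robotparser` module. The helper name `crawl_report` and the sample URLs are assumptions made for this sketch, not part of WordPress or any search engine. Note that Python's parser applies the first matching rule, while Google picks the most specific one; for this particular file both approaches should agree.

```python
from urllib.robotparser import RobotFileParser

# The example rules from the article, copied as-is.
EXAMPLE_ROBOTS_TXT = """\
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml
"""

# Hypothetical URLs, chosen only to exercise each rule.
SAMPLE_URLS = [
    "https://example.com/wp-content/uploads/2019/01/photo.jpg",
    "https://example.com/wp-content/plugins/some-plugin/readme.txt",
    "https://example.com/wp-admin/options.php",
    "https://example.com/hello-world/",
]


def crawl_report(robots_txt, urls, user_agent="*"):
    """Map each URL to True (crawl allowed) or False (crawl disallowed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


report = crawl_report(EXAMPLE_ROBOTS_TXT, SAMPLE_URLS)
for url, allowed in report.items():
    print("allowed " if allowed else "blocked ", url)
```

URLs that match no rule at all, such as a regular blog post, are treated as allowed, which matches the default behavior described earlier.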
Do You Need a Robots.txt File for Your WordPress Site?
If you don’t have a robots.txt file, then search engines will still crawl and index your website. However, you will not be able to tell search engines which pages or folders they should not crawl.
This will not have much of an impact when you’re first starting a blog and do not have a lot of content.
However, as your website grows and you publish more content, you will likely want better control over how your website is crawled and indexed.
Here is why.
Search bots have a crawl quota for each website.
This means that they crawl a certain number of pages during a crawl session. If they don’t finish crawling all the pages on your site, then they will come back and resume crawling in the next session.
This can slow down your website indexing rate.
You can fix this by disallowing search bots from attempting to crawl unnecessary pages like your WordPress admin pages, plugin files, and themes folder.
By disallowing unnecessary pages, you save your crawl quota. This helps search engines crawl even more pages on your site and index them as quickly as possible.
Another good reason to use a robots.txt file is when you want to stop search engines from indexing a post or page on your website.
It is not the safest way to hide content from the general public, but it will help keep that content from appearing in search results.
What Should an Ideal Robots.txt File Look Like?
Many popular blogs use a very simple robots.txt file. Their content may vary, depending on the needs of the specific site:
User-agent: *
Disallow:

Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
This robots.txt file allows all bots to index all content and provides them a link to the website’s XML sitemaps.
For WordPress sites, we recommend the following rules in the robots.txt file:
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /refer/

Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
This tells search bots to index all WordPress images and files. It disallows search bots from indexing WordPress plugin files, the WordPress admin area, the WordPress readme file, and affiliate links.
By adding sitemaps to the robots.txt file, you make it easy for Google bots to find all the pages on your site.
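To see what a bot could pick up from the recommended rules, the sketch below reads the sitemap lines and checks two of the disallowed paths with Python's standard `urllib.robotparser`. It assumes Python 3.8 or newer, where `site_maps()` is available, and the affiliate URL `/refer/some-product/` is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# The recommended WordPress rules from the article.
RECOMMENDED_ROBOTS_TXT = """\
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /refer/

Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RECOMMENDED_ROBOTS_TXT.splitlines())

# site_maps() returns the Sitemap URLs in file order (Python 3.8+).
sitemaps = parser.site_maps()
print("Sitemaps:", sitemaps)

# Hypothetical affiliate redirect and the readme file should both be blocked.
refer_allowed = parser.can_fetch("*", "http://www.example.com/refer/some-product/")
readme_allowed = parser.can_fetch("*", "http://www.example.com/readme.html")
print("refer allowed:", refer_allowed)
print("readme allowed:", readme_allowed)
```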
Now that you know what an ideal robots.txt file looks like, let’s take a look at how you can create a robots.txt file in WordPress.
How to Create a Robots.txt File in WordPress?
There are two ways to create a robots.txt file in WordPress. You can choose the method that works best for you.
Method 1: Editing Robots.txt File Using Yoast SEO
If you are using the Yoast SEO plugin, then it comes with a robots.txt file generator.
You can use it to create and edit a robots.txt file directly from your WordPress admin area.
Simply go to SEO » Tools page in your WordPress admin and click on the File Editor link.
On the next page, Yoast SEO will show your existing robots.txt file.
If you don’t have a robots.txt file, then Yoast SEO will generate a robots.txt file for you.
By default, Yoast SEO’s robots.txt file generator will add the following rules to your robots.txt file:
User-agent: *
Disallow: /
It is important that you delete this text because it blocks all search engines from crawling your website.
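The effect of that rule can be checked locally. The following sketch, which uses Python's standard `urllib.robotparser`, compares `Disallow: /` with an empty `Disallow:` line; the bot names and the page URL are illustrative assumptions only.

```python
from urllib.robotparser import RobotFileParser


def allowed_for(rules, agents, url):
    """Return {agent: True/False} for one URL under the given rule lines."""
    parser = RobotFileParser()
    parser.parse(rules)
    return {agent: parser.can_fetch(agent, url) for agent in agents}


AGENTS = ("Googlebot", "Bingbot", "*")   # example crawler names
PAGE = "https://example.com/any-page/"   # hypothetical URL

# "Disallow: /" matches every path, so every bot is turned away.
blocked = allowed_for(["User-agent: *", "Disallow: /"], AGENTS, PAGE)

# An empty "Disallow:" blocks nothing, so every bot may crawl.
open_site = allowed_for(["User-agent: *", "Disallow:"], AGENTS, PAGE)

print("Disallow: /  ->", blocked)
print("Disallow:    ->", open_site)
```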
After deleting the default text, you can go ahead and add your own robots.txt rules. We recommend using the ideal robots.txt format we shared above.
Once you’re done, don’t forget to click on the ‘Save robots.txt file’ button to store your changes.
Method 2: Editing Robots.txt File Manually Using FTP
For this method, you will need to use an FTP client to edit the robots.txt file.
Simply connect to your WordPress hosting account using an FTP client.
Once inside, you will be able to see the robots.txt file in your website’s root folder.
If you don’t see one, then you likely don’t have a robots.txt file. In that case, you can just go ahead and create one.
Robots.txt is a plain text file, which means you can download it to your computer and edit it using any plain text editor like Notepad or TextEdit.
After saving your changes, you can upload it back to your website’s root folder.
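If you would rather generate the file than type it by hand, a small script can produce the same plain text before you upload it. This is only a sketch: the function name `render_robots_txt` is our own invention, and it writes to a temporary folder rather than to a real site so that nothing is overwritten.

```python
import tempfile
from pathlib import Path


def render_robots_txt(allow, disallow, sitemaps, user_agent="*"):
    """Build robots.txt text from lists of paths and sitemap URLs."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Allow: {path}" for path in allow]
    lines += [f"Disallow: {path}" for path in disallow]
    lines.append("")  # blank line before the sitemap entries
    lines += [f"Sitemap: {url}" for url in sitemaps]
    return "\n".join(lines) + "\n"


content = render_robots_txt(
    allow=["/wp-content/uploads/"],
    disallow=["/wp-content/plugins/", "/wp-admin/", "/readme.html", "/refer/"],
    sitemaps=[
        "http://www.example.com/post-sitemap.xml",
        "http://www.example.com/page-sitemap.xml",
    ],
)

# Save as plain UTF-8 text in a temporary folder (not a live website).
output_path = Path(tempfile.mkdtemp()) / "robots.txt"
output_path.write_text(content, encoding="utf-8")
print(output_path.read_text(encoding="utf-8"))
```

The resulting robots.txt file can then be uploaded to the website’s root folder with your FTP client, as described above.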
How to Test Your Robots.txt File?
Once you have created your robots.txt file, it’s always a good idea to test it using a robots.txt tester tool.
There are many robots.txt tester tools out there, but we recommend using the one inside Google Search Console.
Simply log in to your Google Search Console account, and then switch to the old Google Search Console website.
This will take you to the old Google Search Console interface. From here, you need to launch the robots.txt tester tool located under the ‘Crawl’ menu.
The tool will automatically fetch your website’s robots.txt file and highlight any errors and warnings it finds.
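Before uploading, you can also run a quick local sanity check. The sketch below is not Google's tester and does not reproduce its checks; it is a hypothetical helper, `lint_robots_txt`, that flags a few common mistakes such as misspelled directives and paths that do not start with a slash. The list of known directives is an assumption chosen for this example.

```python
# Directive names this sketch recognizes (an assumption, not an official list).
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def lint_robots_txt(text):
    """Return a list of (line_number, message) warnings for robots.txt text."""
    warnings = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            warnings.append((number, "missing ':' separator"))
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        key = name.lower()
        if key not in KNOWN_DIRECTIVES:
            warnings.append((number, f"unknown directive '{name}'"))
        elif key == "user-agent":
            seen_user_agent = True
        elif key in {"allow", "disallow"}:
            if not seen_user_agent:
                warnings.append((number, f"'{name}' appears before any User-agent line"))
            if value and not value.startswith(("/", "*")):
                warnings.append((number, "path should start with '/'"))
    return warnings


# A deliberately broken sample: one typo and one path without a leading slash.
SAMPLE = """\
User-agent: *
Disalow: /wp-admin/
Disallow: wp-content/plugins/
Sitemap: https://example.com/sitemap_index.xml
"""

for number, message in lint_robots_txt(SAMPLE):
    print(f"line {number}: {message}")
```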
Final Thoughts
The goal of optimizing your robots.txt file is to prevent search engines from crawling pages that are not publicly available, for example, pages in your wp-content/plugins folder or pages in your WordPress admin folder.
A common myth among SEO experts is that blocking WordPress categories, tags, and archive pages will improve crawl rate and result in faster indexing and higher rankings.
This is not true. It’s also against Google’s webmaster guidelines.
We recommend that you follow the above robots.txt format to create a robots.txt file for your website.
We hope this article helped you learn how to optimize your WordPress robots.txt file for SEO. You may also want to see our ultimate WordPress SEO guide and the best WordPress SEO tools to grow your website.
Very nicely described about robot.text, i am very happy
u r very good writer
Thank you, glad you liked our article
What is Disallow: /refer/ page ? I get a 404, is this a hidden wp file?
We use /refer/ to redirect to various affiliate links on our website. We don’t want those to be indexed since they’re just redirects and not actual content.
Thank you for sharing. This was really helpful for me to understand robots.txt
I have updated my robots.txt to the ideal one you suggested. i will wait for the results now
You’re welcome, glad you’re willing to use our recommendations
Very helpful article. Thank you very much.
Glad our article was helpful
Thanks for share this useful information about us.
Glad we could share this information about the robots.txt file
thanks for update information for me. Your article was good for Robot txt. file. It gave me a piece of new information. thanks and keep me updating with new ideas.
Glad our guide was helpful
Thanks , I added robots.txt in WordPress .Very good article
Thank you, glad our article was helpful
Thanks for this – how does it work on a WP Multisite thou?
For a multisite, you would need to have a robots.txt file in the root folder of each site.
My wordpress site is new and my robot.txt by default is
user-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
I want google to crawl and index my content. Is that robot.txt okay?
You can certainly use that if you wanted
Great Airticle…
I was confused from so many days about Robots.txt file and Disallow links. Have copied the tags for robots file. Hope this will solve the issue of my Site
We hope our article will help as well
The files in the screenshots of your home folder are actually located under the public_html folder under my home folder.
I did not have a /refer folder under my public_html folder.
I did not have post or page xml files anywhere on my WP account.
I did include an entry in the robots.txt file I created to disallow crawling my sandbox site. I’m not sure that’s necessary since I’ve already selected the option in WP telling crawlers not to crawl my sandbox site, but I don’t think it will hurt to have the entry.
Some hosts do rename public_html to home which is why you see it there. You would want to ensure Yoast is active for the XML files to be available. The method in this article is an additional precaution to help with preventing crawling your site
Great article
Thank you
Hello, such a nice article you solve my problem. So Thank You so much
Glad our article could help
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
This is my robots.txt code, but I’m confused why my /wp-admin is indexed. How do I noindex it?
If it was indexed previously you may need to give time for the search engine’s cache to clear
This website really inspire me to start a blog .Thank you lost of tema.this website each and every article have rich of information and explanation.when i have some problem at first i visit this blog . Thank You
Glad our articles can be helpful
I am trying to optimise robots for my website using Yoast. However Tools in Yoast does not have the option for ‘File Editor’.
There are just two options
(i) Import and Export
(ii) Bulk editor
May you please advise how this can be addressed. Could it be that I am on a free edition of Yoast?
The free version of Yoast still has the option, your installation may be disallowing file editing in which case you would likely need to use the FTP method.
I really find this article helpful because I really don’t know much on how robot.txt works but now I do.
pls what I don’t understand is how do I find the best format of robot.txt to use on my site (I mean one that works generally)?
I noticed lots of big blogs I check ranking high on search engine uses different robot.txt format..
I’ll be clad to see a reply from you or just anyone that can help
Having a sitemap and allowing the areas that need to be allowed is the most important part. The disallow part will vary based on each site. We shared a sample in our blog post, and that should be good for most WordPress sites.
Hey Emmanuel,
Please see the section regarding the ideal robots.txt file. It depends on your own requirements. Most bloggers exclude WordPress admin and plugin folders from the crawl.
Thank you so much.
now I understand. I guess I’ll start with the general format for now.
Well written article, I recommend the users to do sitemap before creating and enabling their ROBOTS text it will help your site to crawl faster and indexed easily.
Jack
I would like to stop the search engines from indexing my archives during their crawl.
Thanks alot this article it was really helpful
I keep getting the error message below on google webmaster. I am basically stuck. A few things that were not clear to me on this tutorial is where do I find my site’s root files, how do you determine if you already have a “robots.txt” and how do you edit it?
Hi Cherisa,
Your site’s root folder is the one that contains folders like wp-admin, wp-includes, wp-content, etc. It also contains files like wp-config.php, wp-cron.php, wp-blog-header.php, etc.
If you cannot see a robots.txt file in this folder, then you don’t have one. You can go ahead and create a new one.
Thank you for your response. I have looked everywhere and can’t seem to locate these root files as you describe. Is there a path directory I can take that leads to this folder. Like it is under Settings, etc?
I had a decent web traffic to my website. Suddenly dropped to zero in the month of May. Till now I have been facing the issue. Please help me to recover my website.
Hello There Thank you For This Information, But I Have A Question
That I Just Create The Sitemap.xml and Robots.txt File, & Its Crawling well. But How Can I Create “Product-Sitemap.xml”
There is all list of product in sitemap.xml file. Do I Have To Create Product-sitemap.xml separately?
and submit to google or bing again ?
Can You please Help me out…
Thank You
I have a problem on robots.txt file setting. Only one robots.txt is showing for all websites. Please help me to show separate robots.txt file of all websites. I have all separate robots.txt file of all individual website. But only one robots.txt file is showing in browser for all websites.
Please explain why did you include
Disallow: /refer/
in the beginner Robots.txt example? I do not understand the implications of this line. Is this important for the beginner? You have explained the other two Disallowed ones.
Thanks.
Hi Debu,
This example was from WPBeginner’s robots.txt file. At WPBeginner we use ThirstyAffiliates to manage affiliate links and cloak URLs. Those URLs have /refer/ in them, that’s why we block them in our robots.txt file.
How can I put all tags/mydomain.Com in nofollow? In robots.txt to concentrate the link Juice? Thanks.
hey,,i am getting error in yoast seo regarding site map..once i click on fix it ,,,it’s coming again..my site html is not loading properly
I’ve just been reviewing my Google Webmaster Tools account and using the Search Console, I’ve found the following:
Page partially loaded
Not all page resources could be loaded. This can affect how Google sees and understands your page. Fix availability problems for any resources that can affect how Google understands your page.
This is because all CSS stylesheets associated with Plugins are disallowed by the default robots.txt.
I understand good reasons why I shouldn’t just make this allowable, but what would be an alternative as I would suspect the Google algorithms are marking down the site for not seeing these.
Hi,
Whenever, I search my site on the google this text appears below the link: “A description for this result is not available because of this site’s robots.txt”
How, can i solve this issue?
Regards
Hi Suren,
Seems like someone accidentally changed your site’s privacy settings. Go to the Settings » Reading page and scroll down to the ‘Search engine visibility’ section. Make sure that the box next to ‘Discourage search engines from indexing this site’ is unchecked.
Hello
As i seen in webmaster tool, i got robot.txt file like below :
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
let me know is that okey ? or should i use any other ?
I want to know, is it a good idea to block (disallow) “/wp-content/plugins/” in robots.txt? Every time I remove a plugin, it shows a 404 error on some pages of that plugin.
I loved this explanation. As a beginner I was very confused about robot.txt file and its uses. But now I know what is its purpose.
in some robot.txt file index.php has been disallowed. Can you explain why ? is it a good practice.
Thanks for passing by this precious info.
Can you please tell me why this happening on webmaster tool:
Network unreachable: robots.txt unreachable
We were unable to crawl your Sitemap because we found a robots.txt file at the root of your site but were unable to download it. Please ensure that it is accessible or remove it completely.
robots.txt file exist but still
Interesting update from the Yoast team on this at
Quote: “The old best practices of having a robots.txt that blocks access to your wp-includes directory and your plugins directory are no longer valid.”
Allow: /wp-content/uploads/
Shouldn’t this be?
Disallow: /wp-content/uploads/
Because you are aware that google will index all your uploads pages as public URLs right? And then you will get slapped with errors for the page itself. Is there something I am missing here?
Overall, its the actual pages that google crawls to generate image maps, NOT the uploads folders. Then you would have a problem of all the smaller image sizes, and other images that are for UI will also get indexed.
This seems to be the best option:
Disallow: /wp-content/uploads/
If i’m incorrect, please explain this to me so I can understand your angle here.
My blog url not indexing do i need to change my robots.txt?
Im using this robots.txt
how to create robot txt which is ONLY allow index for page and Post.. thanks
I am not sure what’s the problem but my robots.txt has two versions.
One at http://www.example.com/robots.txt and second at example.com/robots.txt
Anybody, please help! Let me know what can be the possible cause and how to correct it?
Most likely, your web host allows your site to be accessed with both www and non-www urls. Try changing robots.txt using an FTP client. Then examine it from both URLs if you can see your changes on both URLs then this means its the same file.
Thanks for the quick reply. I have already done that, but I am not able to see any change. Is there any other way to resolve it?
Yoasts blogpost about this topic was right above yours in my search so of course I checked them both. They are contradicting each other a little bit.. For example yoast said that disallowing plugin directories and others, might hinder the Google crawlers when fetching your site since plugins may output css or js. Also mentioned (and from my own experience), yoast doesn’t add anything sitemap related to the robots.txt, rather generates it so that you can add it to your search console. Here is the link to his post, maybe you can re-check because it is very hard to choose whose word to take for it
As I’m Not Good in Creating this Robotstxt file So Can I use your Robots.txt file by changing the parameters like url and sitemap of my site is it good? or should I create a different one
Hi,
Today i got this mail from Google “Googlebot cannot access CSS and JS files”…what can be the solution?
Thanks
Let me guess… You are using CDN services to import CSS and JS files.
or
It may be possible that you have written wrong syntax in these file.
I have a question about adding Sitemaps. How can I add Yahoo and Bing Sitemap to Robots file and WordPress Directory?
Thanks for the elaborate outline of using the robots file. Does anyone know if Yahoo is using this robots.txt too, and does it obey the rules mentioned in the file? I ask this since I have a “Disallow” for a certain page in my file, but I do receive traffic coming from Yahoo on that page. Nothing from Google, as it should be. Thanks in advance.
correction…
“If you are using Yoast’s WordPress SEO plugin or some other plugin to generate your XML sitemap, then your plugin will try to automatically add your sitemap related lines into robots.txt file.”
Not true. WordPress SEO doesn’t add the sitemap to the robots.txt
“I’ve always felt linking to your XML sitemap from your robots.txt is a bit nonsense. You should be adding them manually to your Google and Bing Webmaster Tools and make sure you look at their feedback about your XML sitemap. This is the reason our WordPress SEO plugin doesn’t add it to your robots.txt.”
https://yoast.com/wordpress-robots-txt-example/
Also more recommended is not to disallow the wp-plugins directory (reasons see Yoast’s post)
And personally I like to simply remover the readme.txt file…
I understood it robots.txt file and use of robots file. What is the site map how do I create sitemap for my site.
After reading Google’s documentation I’m under the impression that the directive to use in the robots.txt file is disallow which only tells the bots what they can and cannot crawl. It does not tell them what can and cannot be indexed. You need to use the noindex robots meta tag to have a page noindexed.
really good article for seo optimized robots.txt file. But I need you to give a tutorial on how to upload robots.txt file to server. As, being a beginner it seems to be a drastic problem to upload that file.
By the way thanks to share such beneficial information.
-Nitin
Upload it to your server/public_html/(Your-site-name) … in this folder
What is the best way to add code to .htaccess to block multiple spam bot referrers by their URL, and by IP address if no URL is given?
I know that if you get the syntax wrong in .htaccess, it can take your site offline. I am a newbie and need to block these annoying multiple URLs from Russia, China, Ukraine, etc.
Many thanks