Q&A: Will not having a robots.txt file prevent Google from crawling my blog?

Question from Gary: In addition to your awesome blog I also read the one that Kim Komando writes.

I recently read an article in which she discussed several things that can indicate that a website is outdated.

Then she said something that both puzzles and concerns me.

The first tip in that article said not having a robots.txt file will prevent Google from including your site in their search results.

Here’s the quote from that article (I bolded the last sentence to highlight it):

“Have you heard of a robots.txt file? This file tells search engine crawlers which pages or files the crawler can or can’t request from your site. In layman’s terms: This is what helps search engines find and catalog your website. Without it, your site won’t turn up when your potential customers do a search.”

Is this something new? I’ve been blogging for years and I’ve never had a robots.txt file on my blog, and I get lots of visitors from Google.

If this is a new policy I’ll create one ASAP because I can’t afford to lose my Google traffic.

Rick’s answer: Gary, I’ve been a huge fan of Kim’s for years and her advice is almost always spot on.

Although that quoted passage definitely makes it sound that way, I don’t believe she intended to say not having a robots.txt file will prevent your blog’s pages from showing up in web searches.

I could be wrong, but I believe what she intended to say was not having your pages included in Google’s index will prevent them from showing up in users’ Google searches.

And if that’s what she actually meant, she’s exactly right.

My best guess is she simply misspoke (or in this case, mistyped).

If, on the other hand, she did intend to say that not having a robots.txt file will prevent your blog’s pages from showing up in Google searches, I have to say that’s one of the very rare occasions on which she badly missed the mark (as we all do on occasion).

Read on for a full explanation…

Having a robots.txt file in your blog’s root directory is a good idea because, among other benefits, it allows you to instruct Googlebot and other search engine crawlers not to crawl and index certain directories and files of your blog.

But that being said, Google and the other search engines can and do crawl and index websites that don’t have a robots.txt file in place.

The primary purpose of a robots.txt file is to tell the search engine bots which directories and files NOT to crawl (and, as a result, usually not to index).

While you can include a line in the file to tell the bots where to find your site’s XML Sitemap (a special file that helps the bots find all the pages on your site), the bots are generally quite good at finding your blog’s pages on their own, especially if your blog uses a good linking structure.
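To make that a little more concrete, here is a minimal sketch of what a small robots.txt file might contain and how a crawler could interpret it. It uses the robots.txt parser that ships with Python (urllib.robotparser). The domain, the blocked directories and the Sitemap URL are all made up for illustration, so treat this as an example of the idea rather than a file you should copy as-is.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for an example blog.
# Every path and URL here is invented for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /drafts/

Sitemap: https://example.com/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt text and return a parser that can answer questions about it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(SAMPLE_ROBOTS_TXT)

# A regular post is not covered by any Disallow line, so it may be crawled.
print(rules.can_fetch("Googlebot", "https://example.com/my-first-post/"))

# Anything under /drafts/ is off limits to well-behaved bots.
print(rules.can_fetch("Googlebot", "https://example.com/drafts/idea.html"))

# The Sitemap line is reported separately (site_maps() needs Python 3.8 or later).
print(rules.site_maps())
```

Notice that the file only lists what the bots should stay away from, plus an optional pointer to the Sitemap. Everything that isn't listed is fair game.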

The fact that your blog is getting lots of traffic from Google indicates that its linking structure is indeed adequate for directing the search bots to all the pages.

Of course everything I said above is my own personal opinion based upon what I’ve learned in the years since I first began blogging, but plenty of experts agree with it.

For example, this is what Backlinko’s Brian Dean (widely recognized as an authority on all things related to backlinks and Search Engine Optimization, or SEO) has to say about it:

“Most websites don’t need a robots.txt file.

That’s because Google can usually find and index all of the important pages on your site.

And they’ll automatically NOT index pages that aren’t important or duplicate versions of other pages.”

Brian then goes on to discuss three ways having a robots.txt can help you, but only after making it clear that having one isn’t necessary for most sites.

Next, the following is a quote from this tutorial by Ahrefs, one of the most trusted and most widely used tools in the world for webmasters, bloggers and Internet marketers:

“Do you need a robots.txt file?

Having a robots.txt file isn’t crucial for a lot of websites, especially small ones.

That said, there’s no good reason not to have one. It gives you more control over where search engines can and can’t go on your website…”

And finally, we have the ultimate authority on the topic: Google.

This is how Google answers the very first question on their Robots FAQs page:

Question…

“Does my website need a robots.txt file?”

Answer…

“No. When Googlebot visits a website, we first ask for permission to crawl by attempting to retrieve the robots.txt file. A website without a robots.txt file, robots meta tags, or X-Robots-Tag HTTP headers will generally be crawled and indexed normally.”
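You can see the same "no rules means no restrictions" behavior in a small sketch using Python's built-in robots.txt parser. To be clear, this models how that one parser behaves when it has no rules to work with. It is not Googlebot's actual code, and the URL is just a placeholder.

```python
from urllib.robotparser import RobotFileParser


def rules_for_site_without_robots_txt():
    """Return a parser that was given no rules, as if the file were empty or absent."""
    parser = RobotFileParser()
    parser.parse([])  # nothing to parse: no User-agent lines, no Disallow lines
    return parser


no_rules = rules_for_site_without_robots_txt()

# With no rules in place, the parser grants access to any URL on the site.
print(no_rules.can_fetch("Googlebot", "https://example.com/any-post/"))
```

That lines up with Google's answer: when there is nothing telling the crawler to stay out, it simply goes ahead and crawls.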

In my humble opinion, that’s about as definitive as you can get.

Now that we have the answer to your question out of the way, let me reiterate what I said earlier…

Neither Google nor any other search engine that I’m aware of requires you to have a robots.txt file on your blog in order for their bots to successfully crawl and index its pages.

And truth be told, your own positive experience with your blog’s Google traffic is proof of that.

But that being said, it’s still a good idea to have a robots.txt file on your blog, as explained in detail in this article from Moz, one of the world’s most trusted authorities on all things SEO.

I strongly recommend that you take a few moments to read that entire article, but here are the key takeaways if you’re pressed for time:

“Some common use cases [of a robots.txt file] include:

— Preventing duplicate content from appearing in SERPs (note that meta robots is often a better choice for this)

— Keeping entire sections of a website private (for instance, your engineering team’s staging site)

— Keeping internal search results pages from showing up on a public SERP

— Specifying the location of sitemap(s)

— Preventing search engines from indexing certain files on your website (images, PDFs, etc.)

— Specifying a crawl delay in order to prevent your servers from being overloaded when crawlers load multiple pieces of content at once”
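A few of those use cases can be sketched in a handful of lines. The example below is only an illustration: the /search and /staging/ paths, the 10-second delay and the "ExampleBot" name are all assumptions I made up, and it relies on Python's built-in parser to show how a crawler that follows the rules would read them. Keep in mind that not every crawler honors Crawl-delay (Googlebot ignores it).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules covering three of the use cases quoted above:
# a crawl delay, internal search results, and a private staging area.
EXAMPLE_RULES = """\
User-agent: *
Crawl-delay: 10
Disallow: /search
Disallow: /staging/
"""

parser = RobotFileParser()
parser.parse(EXAMPLE_RULES.splitlines())

# Seconds a compliant bot is asked to wait between requests.
print(parser.crawl_delay("ExampleBot"))

# Internal search results and the staging area are blocked...
print(parser.can_fetch("ExampleBot", "https://example.com/search?q=robots"))
print(parser.can_fetch("ExampleBot", "https://example.com/staging/new-theme/"))

# ...while ordinary pages remain crawlable.
print(parser.can_fetch("ExampleBot", "https://example.com/about/"))
```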

Those are some pretty good reasons to have a robots.txt file, especially since they are so easy to create.

If you decide to create a robots.txt file for your blog (and I really believe you should) you’ll find step-by-step instructions in this tutorial from Google.

Bottom line: As your own experience with Google proves, you DO NOT have to have a robots.txt file on your blog in order for Google and the other search engines to successfully crawl your blog’s pages and index them (and send your blog traffic as a result).

However, just because one isn’t required doesn’t mean your blog shouldn’t have one. I really believe it should.

I hope this helps, Gary. Good luck!