What is noindex?

A noindex tag informs search engines not to index a page or website, excluding it from appearing in search engine results.

What Does a Noindex Tag Do, and When Should You Use It?

While much of search engine optimization (SEO) focuses on getting your quality content indexed and seen, there are pages that you may not want to show up in the search engine results pages (SERPs). The noindex tag tells search engine crawlers not to include a page or website in their index, effectively removing it from the SERPs.

What Is Noindex?

A noindex robots meta tag is an HTML value that tells search engines not to include a page in the index of search results. A noindex tag helps separate your valuable, curated content from pages that exist to support the user experience rather than to rank. It is a helpful tool for pages that you do not want to appear in the SERPs.

When Should I Noindex a Page?

Inevitably, some pages are not helpful to your rankings or are not intended for the eyes of the general public. For example:

  • Pages with duplicate or similar content. Pages that share the same title, title tag, or body content may compete with each other for visibility; crawlers will likely select only one version to index, and it may not be the one you would choose.
  • Pages with thin content, or pages not designed to appear in the search results. These pages often serve readers or customers (thank-you pages, newsletters, or subscription and login pages) but are not built for SEO performance or meant to surface for a query.
  • Pages that include user-generated posts, such as forums. Forums may contain answers, comments, and threads without any authority.

Spotting Noindex Errors

The time it takes to notice a noindex error, whether from misuse or from accidentally noindexing a page or an entire site, can vary. The decline in organic traffic that follows removal from the index may be immediate and drastic, or it may take months to become apparent.

It is important to closely monitor for major changes or declines in organic traffic. This can be done by auditing content performance and by using Google Analytics and Google Search Console.

Recovering Pages from Noindex Errors

Noindex errors can have severely negative consequences. Recovering from a noindex error can be done through the completion of a few short steps:

  • Remove the errant noindex tag (a quick way to verify it is gone is sketched below).
  • Resubmit your sitemap to Google through Search Console.
  • Manually submit the affected page(s) for indexing; in current versions of Search Console, this is the “Request Indexing” option in the URL Inspection tool.
  • Consider additional marketing tactics, social media, or guest posts to drive traffic while the site or page is absent from the index.
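
Before resubmitting, it can help to confirm the tag is actually gone. As a quick spot check from the command line (assuming curl is available, and substituting your own URL for the placeholder), look for the directive in both the HTTP headers and the page markup:

curl -sI https://examplesite.com/example-page.html | grep -i x-robots-tag

curl -s https://examplesite.com/example-page.html | grep -i noindex

If neither command prints anything, the page no longer carries a noindex directive in its headers or its HTML.
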
How to Noindex a Page 

To noindex a page, add a robots meta tag in the head section of the page’s HTML, send the directive in the HTTP response header, or (historically) list it in a robots.txt file. The head section of code on any given page describes and lists information about the webpage’s properties. This may include the title, meta tags, links to external files, and other code.

A typical head section with a noindex tag will look like this:

<html>

<head>

<meta name="robots" content="noindex">

<title>Don't index this</title>

</head>

The directive can be restricted so that it targets only specific bots by changing the value of the name attribute in the meta tag. For example, to block Google’s bots, the code would look like this:

<meta name="googlebot" content="noindex">
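
Other crawlers can be targeted the same way using their own user-agent names. For example, Bing’s crawler is named bingbot, so blocking only Bing would look like this:

<meta name="bingbot" content="noindex">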

The noindex directive can also be applied as an element of the HTTP response headers for any given URL by using the X-Robots-Tag:

HTTP/1.1 200 OK

Date: Wed, 12 Feb 2020 13:26:32 GMT

(...)

X-Robots-Tag: noindex

(...)
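
How the header gets set depends on your web server. As a minimal sketch, on an Apache server with mod_headers enabled, a single line in the site configuration or .htaccess file applies the header, while on nginx an add_header directive in the relevant server or location block does the same:

# Apache (.htaccess or site config, with mod_headers enabled)

Header set X-Robots-Tag "noindex"

# nginx (inside a server or location block)

add_header X-Robots-Tag "noindex";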

A domain may also include a noindex directive in its robots.txt file. A robots.txt file helps bots and search engines understand the structure and content of a website. This element of technical SEO doesn’t usually have a user experience impact, but it acts like a guide or map for a bot crawling a site. The robots.txt file can typically be found by navigating to:

https://examplesite.com/robots.txt

An example of noindex in a robots.txt file may look like this: 

User-agent: *

Noindex: /example-folder/example-page.html

Noindex: /forum/

When used in a robots.txt file, the noindex directive can tell crawlers not to index a single page, as seen in the first line above after the user-agent is defined. It can also tell crawlers not to index an entire section of a website, such as all of the pages generated on a website’s forum. Be aware, however, that Google stopped supporting the noindex directive in robots.txt as of September 1, 2019, so the meta tag or the X-Robots-Tag header is the more reliable option.

What Is a Disallow Directive, and Why Is It Different From Noindex?

While a noindex tag tells a bot or crawler not to add a page to the index of the search results, a disallow directive tells search engines not to crawl the page at all. This must be done through the robots.txt file and is sometimes used in tandem with noindex. 

While the disallow directive is a helpful tool, it is important to be extremely cautious when using it. By disallowing a page, you are essentially removing it from your site as far as search is concerned, and you are also removing its ability to pass PageRank, the value a search engine assigns to a webpage that helps it appear in the SERPs. Accidentally disallowing the wrong page (a page that drives traffic to your site, for example) can have disastrous effects on your traffic and your SEO tactics.

Why Should I Disallow a Page?

Disallowing pages that have no reader or SEO value can make your site quicker for bots to crawl and index. An example is the search function on an e-commerce site: while the search function provides value to the user, the various result pages it generates are not necessarily pages that add SEO value to your site.
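
As a sketch, assuming the site’s internal search results live under a /search path (your own URL structure may differ), the corresponding robots.txt rule would look like this:

User-agent: *

Disallow: /search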

Combining Noindex and Disallow

If there are external links or canonical tags (tags that tell bots which page from a group of similar pages should be indexed) pointing to a page that has been disallowed, the page could still be indexed and ranked, even though it cannot be crawled. This means it could still show up in the SERPs.

To apply both directives, add them both to the robots.txt file. For example:

  • Disallow: /example-folder/example-page.html
  • Noindex: /example-folder/example-page.html 

What Is a Nofollow Meta Tag?

A nofollow tag is used to tell search engines not to evaluate the merit of the links (or a specific link) that exist on a page. A nofollow meta directive also tells bots not to discover more URLs within the site by setting all of the links on the page to “nofollow”; by default, all links on a page are followed. You can either add a nofollow attribute to individual links or blanket-nofollow them via a robots meta tag in the page’s HTML head. As an SEO tactic, nofollow lets site owners link to pages they want to share with readers without crawlers associating those pages with their own site.

For example, a single nofollowed link might look like:

<a href="https://example.com/" rel="nofollow">Example link</a>

While a nofollow meta tag in the header would look like this:

<meta name="robots" content="nofollow">

When Should I Nofollow Links?

Nofollow tags are useful for links that you may not directly control, such as links in comment sections, paid or otherwise inorganic links, guest posts, links to something off-topic for the website or page, or embeds such as widgets or infographics.
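
Google has also introduced more specific attributes for some of these cases: rel="ugc" for user-generated content such as comments, and rel="sponsored" for paid or affiliate links. Using the same placeholder URL as above, they look like this:

<a href="https://example.com/" rel="ugc">A comment link</a>

<a href="https://example.com/" rel="sponsored">A sponsored link</a>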

What Is Noindex Nofollow?

Adding a nofollow tag to a link won’t prevent the linked page from being crawled or indexed, though it prevents an association or passing of authority between the linked pages. 

To simultaneously command bots not to index a page or follow the links on it, simply combine the noindex and nofollow values in one meta tag. For example:

<meta name="robots" content="noindex, nofollow">

If you do not want Google to crawl the page at all, you will still need to disallow it.
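
The same combination can also be applied through the HTTP response header, which is useful for non-HTML resources, such as PDFs, that have no head section:

X-Robots-Tag: noindex, nofollow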