Noindex: How the Directive Really Works in 2026

Key takeaways

Noindex controls indexation, not crawling: the page is still fetched and its links are still discovered, it simply never appears in results.
Never combine noindex with a robots.txt disallow on the same URL: if the crawler is blocked, it can never read the noindex and the page can stay indexed via external links.
Google treats a long-lived noindex page as effectively nofollow over time (Google Search Central, 2017 hangout), so a noindexed page is a slow PageRank sink, not a clean pass-through.
Before any link placement, verify the host page returns no noindex in either the meta robots tag or the X-Robots-Tag header: an indexable host is non-negotiable.
For PDFs, images and other non-HTML files there is no meta tag, so the X-Robots-Tag HTTP header is the only reliable way to noindex.
The "Excluded by noindex tag" report in Search Console is a status, not always a bug: confirm intent before "fixing" it.

What noindex actually does, beyond the tag

A noindex directive tells a search engine one thing: you may crawl this page, but do not keep it in your index. That single distinction is where most of the confusion lives. Noindex is not an access control and it is not a crawl block. The page still gets fetched and its outbound links still get discovered; it simply never shows up for a query. In practice, on a mature site, the pages we deliberately push out of the index (internal search results, faceted filter combinations, paginated noise, thin tag archives, thank-you and confirmation pages) often outnumber the ones we actually want ranking.

The instruction lives in one of two places, and both carry identical weight: a robots meta tag in the HTML head, <meta name="robots" content="noindex">, or an X-Robots-Tag in the HTTP response header. Google documents the two as equivalent signals. The header version is the one that matters for anything that is not HTML, a detail we will come back to because it is the part that quietly breaks PDF and image strategies.
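As a minimal sketch of what checking both locations can look like, the Python below parses the robots meta tag with the standard library and inspects a dictionary of response headers. The function name noindex_sources and the plain-dict header input are illustrative assumptions, not part of any Google tooling. Treating "none" as equivalent to noindex follows Google's documented robots directives.

```python
from html.parser import HTMLParser

# Directive values that keep a page out of the index.
BLOCKING = {"noindex", "none"}


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            content = attr.get("content", "")
            self.directives.extend(d.strip().lower() for d in content.split(","))


def noindex_sources(html, headers):
    """Return the locations ("meta", "header") that carry a noindex."""
    sources = []

    parser = RobotsMetaParser()
    parser.feed(html)
    if BLOCKING & set(parser.directives):
        sources.append("meta")

    for name, value in headers.items():
        if name.lower() != "x-robots-tag":
            continue
        # A value may be prefixed with a user agent ("googlebot: noindex"),
        # so only the part after the last colon of each token is compared.
        tokens = {t.split(":")[-1].strip().lower() for t in value.split(",")}
        if BLOCKING & tokens:
            sources.append("header")
            break

    return sources
```

Called with the page source and the response headers of a URL, an empty list means neither location asks for exclusion. This sketch only sees the raw HTML it is given, so a meta tag injected by JavaScript would be missed.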

How noindex works in 2026

The mechanics are stable and well documented; the mistakes are not. When Googlebot fetches a page, it reads the X-Robots-Tag header from the HTTP response and the robots meta tag from the HTML. If either says noindex, the URL is dropped from the index on the next processing cycle, or never added if it was new. The key word is "fetched": the crawler has to actually reach and read the page for the directive to register. This is the mechanism that makes the robots.txt approach a trap.

Stuffing a noindex line into robots.txt does nothing useful. Google publicly dropped support for unofficial robots.txt directives, including noindex, in September 2019 (Google Search Central, 2019). Worse, if you block a URL with a robots.txt disallow and also place a noindex meta tag on it, the disallow wins the race: the crawler never reaches the page, never reads the noindex, and the URL can remain indexed indefinitely on the strength of external links pointing at it, showing up as a bare URL with no snippet. If your real intent is to keep a page out of the index, you must let it be crawled and serve the noindex. The two directives solve different problems and you can read more on the crawl-control side in our entry on instructing crawlers through robots.txt.
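The conflict described above can be detected mechanically. The sketch below uses Python's standard urllib.robotparser to test whether a URL is disallowed and flags the case where that same URL also carries a noindex. The function name deindex_conflict and the has_noindex flag are hypothetical, and Python's parser does not match Google's rule matching in every edge case, so treat the result as a first-pass warning rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def deindex_conflict(robots_txt, url, has_noindex, user_agent="Googlebot"):
    """Return True when a URL is both disallowed and marked noindex.

    In that situation the crawler cannot fetch the page, so it never
    reads the noindex and the URL can stay indexed.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    blocked = not rules.can_fetch(user_agent, url)
    return blocked and has_noindex
```

A True result means the disallow should be lifted until the deindexation has gone through.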

For non-HTML assets there is no head to put a meta tag in. A PDF, a generated CSV, a raw image: the only reliable way to keep these out of the index is the X-Robots-Tag header, set at the server or CDN level. We see teams forget this constantly, then wonder why a gated whitepaper PDF is ranking for branded queries.
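The header is normally set in the server or CDN configuration, as described above. For sites where files are served through a Python application instead, the same idea can be sketched as a small WSGI middleware. The name noindex_assets and the list of content types are assumptions chosen for illustration; adapt them to the assets you actually want excluded.

```python
# Content-type prefixes that should never be indexed (illustrative list).
NOINDEX_TYPES = ("application/pdf", "text/csv", "image/")


def noindex_assets(app, content_types=NOINDEX_TYPES):
    """Wrap a WSGI app so matching responses carry X-Robots-Tag: noindex."""

    def middleware(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Find the response content type, if the app declared one.
            ctype = next(
                (value for name, value in headers if name.lower() == "content-type"),
                "",
            ).lower()
            if ctype.startswith(content_types):
                headers = [*headers, ("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start)

    return middleware
```

Wrapping an existing application is one line, for example application = noindex_assets(application). HTML responses pass through untouched, so page-level indexation stays under the control of the meta tag.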

Where noindex matters in a netlinking operation

For a link buyer, noindex is not abstract hygiene; it is a direct threat to the value of a placement. A noindexed page does not rank, so a backlink sitting on a noindexed host is, in the best case, decorative. But the deeper issue is PageRank flow. A noindexed page still receives and consumes link equity through its internal and external links. Over time, Google has stated it treats a persistently noindexed page as effectively nofollow (Google Search Central hangout, 2017), which means the equity flowing into it eventually stops flowing back out. A noindexed page that accumulates a lot of internal links becomes a slow drain on the rest of the site.

This is exactly the audit we run before any placement on an owned property. As a network of owned French media operated in-house, Stringer verifies that the host page, the category it sits in, and the article template all return an indexable status before a single link goes live. It is also why we publish the catalogue of media you can browse without an account, so a buyer can check the context of a placement up front rather than discover a noindexed host after the fact. When you place a sponsored article on a page you have actually verified is indexable, you are buying a link that can pass equity, not a screenshot for a report.

The flip side is just as operational. On your own client sites, noindexing the right pages concentrates crawl budget and ranking signals on the URLs that earn revenue. A large e-commerce site that lets every filter permutation into the index dilutes its own authority across thousands of near-duplicate URLs. Pairing a clean indexation strategy with a correct canonical tag setup on parameterized pages is one of the highest-leverage technical wins on most large sites.

What we see go wrong in audits

The single most common incident is the staging noindex that ships to production. A team builds the site behind a site-wide noindex to keep the dev environment out of search, then forgets to strip it at launch. Traffic flatlines, nobody understands why, and the cause is one line in the head of every page. This is preventable with a deploy-time check that fails the build if a global noindex is present on the production target.
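One way to sketch such a deploy-time check, assuming the production build is a directory of static HTML files: scan every page for a robots meta tag containing noindex and fail when nearly all pages carry it. The function names, the 0.9 threshold and the regex heuristic are assumptions for illustration. A header-level noindex is not visible to this check and needs a separate test against the live target.

```python
import re
from pathlib import Path

META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)
ROBOTS_NAME = re.compile(r"name\s*=\s*[\"']?robots\b")


def has_noindex(html):
    """Heuristic: True if any robots meta tag in the markup says noindex."""
    for tag in META_TAG.findall(html):
        tag = tag.lower()
        if ROBOTS_NAME.search(tag) and "noindex" in tag:
            return True
    return False


def noindex_ratio(build_dir):
    """Share of .html files under build_dir that carry a robots noindex."""
    pages = list(Path(build_dir).rglob("*.html"))
    if not pages:
        return 0.0
    flagged = sum(
        1
        for page in pages
        if has_noindex(page.read_text(encoding="utf-8", errors="ignore"))
    )
    return flagged / len(pages)


def check_build(build_dir, threshold=0.9):
    """Return 1 (fail) when the build looks globally noindexed, else 0."""
    ratio = noindex_ratio(build_dir)
    if ratio >= threshold:
        print(f"FAIL: {ratio:.0%} of pages carry a robots noindex")
        return 1
    print(f"OK: {ratio:.0%} of pages carry a robots noindex")
    return 0
```

In a CI pipeline the return value of check_build can be passed to sys.exit so that a leftover staging directive stops the release. A ratio threshold is used instead of a strict "every page" rule so that a handful of intentionally indexable pages cannot mask an otherwise site-wide noindex.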

The second is the noindex plus disallow combination described above, usually applied with good intentions and producing the opposite of the desired effect. If you want a page out of the index, allow crawling and serve noindex. If you want to save crawl budget on a section that is already deindexed, add the disallow only after the deindexation has been confirmed in Search Console.

Third, we regularly find sites that noindex pages they actually want to keep, then puzzle over lost rankings: paginated series where page two onward is noindexed and the products listed only there vanish, or author and date archives killed wholesale on a content site that relied on them for internal linking depth. Noindex is surgical, not a broom. Decide page type by page type, not by blanket rule.

Finally, the subtle one: noindexing a page that holds inbound external links. The equity those links carry is now trapped behind a directive that will, over time, stop passing it onward. If a URL has earned real backlinks, a 301 redirect to a relevant indexable page almost always beats a noindex.

Fixing the "Excluded by noindex tag" error

When Search Console reports a URL as "Excluded by noindex tag", the first question is whether that is a bug at all. Half the URLs in that report are supposed to be there. Confirm intent before you touch anything. If the page genuinely should rank, the fix is a short diagnostic chain rather than a guess.

Start with the URL Inspection tool in Search Console: it shows you exactly what Googlebot saw and whether the noindex came from the page.

Next, locate the source of the directive. It is almost always in one of three places: a hardcoded meta robots tag in the template, an X-Robots-Tag set by the server or CDN that never appears in the page source, or a CMS setting. On WordPress this is usually the "Search engine visibility" checkbox in Settings, or a per-page toggle in Yoast or Rank Math. Check the HTTP headers, not just the HTML, because a header-level noindex is invisible in view-source and burns hours if you only look at the markup. Then confirm robots.txt is not blocking the re-crawl, otherwise Google cannot see that you removed the directive. Once the noindex is gone, request indexing to speed up reprocessing.

Tactical takeaways for a working SEO

Audit indexability the way you audit links: continuously, not once. A site crawler such as Screaming Frog, Sitebulb or Lumar surfaces every noindex across meta tags and headers in a single pass, and a scheduled crawl catches the staging directive before it costs you a quarter. Cross-reference the crawl against the "Pages" report in Search Console so you can separate intentional exclusions from accidents.

For a netlinking workflow specifically, build host-page indexability into your pre-placement checklist alongside topical relevance and traffic. A link on an indexable, equity-passing page is worth more than three on noindexed hosts, and verifying it costs one HTTP request. Keep the surgical mindset: noindex the pages that exist for users but not for search, redirect the ones that earned links, and leave the rest in the index where they can work for you.

Frequently asked questions

What is the practical difference between noindex and nofollow?

Noindex controls indexation: keep this page out of search results. Nofollow is a link-level attribute that tells search engines not to pass equity through a specific link. They operate at different scopes. A page can be indexed but full of nofollowed links, or noindexed while its links are still followed, at least until Google downgrades a long-lived noindex page to effectively nofollow. Confusing the two is one of the most common technical SEO errors we see in audits.

Why is putting noindex in robots.txt a bad idea?

Google stopped supporting noindex as a robots.txt directive in September 2019, so it is simply ignored. Worse, robots.txt is a crawl control, not an index control. If you disallow a URL there, Googlebot never reaches the page and can never read a noindex meta tag you placed on it, which means the URL can stay indexed via external links. To deindex, allow crawling and serve a real noindex directive.

Does a noindexed page still pass PageRank through its links?

Initially yes, the links are still followed. But Google has said it treats a persistently noindexed page as effectively nofollow over time (Search Central, 2017). So a long-lived noindex page becomes a slow equity sink: link value flows in and eventually stops flowing back out. If a URL has earned real backlinks, a 301 redirect to a relevant indexable page usually preserves more value than a noindex.

How do I check whether a page is actually noindexed?

Three angles. View the HTML source and look for a robots meta tag with noindex. Check the HTTP response headers for an X-Robots-Tag, which never appears in the source. And use the URL Inspection tool in Search Console, which reports exactly what Googlebot saw and where the directive came from. A crawler like Screaming Frog or Lumar surfaces all three at scale across the whole site.

Should I noindex thin or duplicate content, or canonicalize it?

It depends on intent. Use a canonical tag when several near-identical URLs should consolidate into one indexable winner, for example product variants or tracking-parameter versions. Use noindex when a page genuinely should not appear in results at all, like internal search results or thank-you pages. Canonical is a consolidation hint Google may ignore; noindex is a directive it obeys. Picking the wrong one either splits signals across duplicates or buries a URL you wanted to rank.

Is "Excluded by noindex tag" in Search Console always a problem?

No. Half the URLs in that report are usually there on purpose: filters, internal search, utility pages. Treat it as a status, not an alarm. A fix is only needed when a page you want to rank shows up there. Confirm intent first, then trace the directive to its source (meta tag, HTTP header or CMS plugin) before removing it and requesting reindexing.

Benoit Demonchaux

Founder and operator of Stringer Network. Edits and writes the site's editorial glossary, as well as the content published across the Stringer network of editorial media.

Related glossary terms

Canonical tag

Canonical tag in 2026 is a hint Google can override, not a directive.

hreflang

Hreflang is a hint, not a directive.

Netlinking

The activity of acquiring inbound links from other websites.

Backlink

A hyperlink placed on another website that points to yours.