Everything You Need To Know About The X-Robots-Tag HTTP Header


Search engine optimization, in its most basic sense, relies upon one thing above all others: search engine spiders crawling and indexing your site.

But nearly every website is going to have pages that you don’t want to include in this exploration.

For example, do you really want your privacy policy or internal search pages showing up in Google results?

In a best-case scenario, these are doing nothing to actively drive traffic to your site, and in a worst-case, they could be diverting traffic away from more important pages.

Luckily, Google allows webmasters to tell search engine bots what pages and content to crawl and what to ignore. There are several ways to do this, the most common being the use of a robots.txt file or the meta robots tag.

We have an excellent and detailed explanation of the ins and outs of robots.txt, which you should definitely check out.

But in high-level terms, it’s a plain text file that lives in your website’s root directory and follows the Robots Exclusion Protocol (REP).

Robots.txt provides crawlers with instructions about the site as a whole, while meta robots tags carry instructions for specific pages.
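For reference, a minimal robots.txt might look something like the below (the domain and path here are placeholders, not a recommendation):

User-agent: *
Disallow: /internal-search/
Sitemap: https://www.example.com/sitemap.xml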

Some meta robots tags you may use include:

  • index – tells search engines to add the page to their index.
  • noindex – tells them not to add the page to their index or include it in search results.
  • follow – instructs search engines to follow the links on a page.
  • nofollow – tells them not to follow the links on a page.

And there is a whole host of others.
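These directives live in a page’s <head>. A typical tag, shown here purely as an illustration, looks like this:

<meta name="robots" content="noindex, nofollow" />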

Both robots.txt and meta robots tags are useful tools to keep in your toolbox, but there’s also another way to instruct search engine bots to noindex or nofollow: the X-Robots-Tag.

What Is The X-Robots-Tag?

The X-Robots-Tag is another way for you to control how your webpages are crawled and indexed by spiders. Sent as part of the HTTP header response for a URL, it controls indexing for an entire page, as well as for specific elements on that page.
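In a raw HTTP response, it might appear like the below (the status line and neighboring headers are illustrative):

HTTP/1.1 200 OK
Date: Tue, 25 Jan 2022 21:42:43 GMT
Content-Type: application/pdf
X-Robots-Tag: noindex, nofollow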

And whereas using meta robots tags is fairly straightforward, the X-Robots-Tag is a bit more complicated.

But this, of course, raises the question:

When Should You Use The X-Robots-Tag?

According to Google, “Any instruction that can be used in a robots meta tag can also be defined as an X-Robots-Tag.”

While indexing and serving directives can be set with both the meta robots tag and the X-Robots-Tag, there are certain situations where you would want to use the X-Robots-Tag instead. The two most common are when:

  • You want to control how your non-HTML files are being crawled and indexed.
  • You want to serve directives site-wide instead of on a page level.

For example, if you want to block a specific image or video from being crawled, the HTTP response method makes this easy.

The X-Robots-Tag header is also useful because it allows you to combine multiple directives within a single HTTP response, specified as a comma-separated list.

Maybe you don’t want a certain page to be cached and also want it to be unavailable after a specific date. You can use a combination of the “noarchive” and “unavailable_after” directives to instruct search engine bots to follow these instructions.
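As a response header, that combination might look like the following sketch (the date is a placeholder):

X-Robots-Tag: noarchive, unavailable_after: 25 Jun 2023 15:00:00 PST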

Essentially, the power of the X-Robots-Tag is that it is much more flexible than the meta robots tag.

The advantage of using the X-Robots-Tag with HTTP responses is that it allows you to use regular expressions to apply crawl directives to non-HTML files, as well as to apply directives on a larger, global level.

To help you understand the difference between these directives, it’s helpful to categorize them by type. That is, are they crawler directives or indexer directives?

Here’s a handy cheat sheet:

Crawler Directives

  • Robots.txt – uses the user-agent, allow, disallow, and sitemap directives to specify where on-site search engine bots are allowed and not allowed to crawl.

Indexer Directives

  • Meta robots tag – allows you to specify and prevent search engines from showing particular pages of a site in search results.
  • Nofollow – allows you to specify links that should not pass on authority or PageRank.
  • X-Robots-Tag – allows you to control how specified file types are indexed.

Where Do You Put The X-Robots-Tag?

Let’s say you want to block specific file types. An ideal approach would be to add the X-Robots-Tag to an Apache configuration or a .htaccess file.

The X-Robots-Tag can be added to a site’s HTTP responses in an Apache server configuration via the .htaccess file.

Real-World Examples And Uses Of The X-Robots-Tag

So that sounds great in theory, but what does it look like in the real world? Let’s take a look.

Let’s say we wanted search engines not to index .pdf file types. This configuration on Apache servers would look something like the below:

# Apply the header to all PDF files across the site
<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>

In Nginx, it would look like the below:

# Match PDF URLs case-insensitively and add the header
location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}

Now, let’s look at a different scenario. Let’s say we want to use the X-Robots-Tag to block image files, such as .jpg, .gif, .png, etc., from being indexed. You could do this with an X-Robots-Tag that would look like the below:

# Apply the header to common image file extensions
<Files ~ "\.(png|jpe?g|gif)$">
  Header set X-Robots-Tag "noindex"
</Files>
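For completeness, a comparable Nginx block, sketched here assuming the same set of extensions, might look like:

location ~* \.(png|jpe?g|gif)$ {
  add_header X-Robots-Tag "noindex";
}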

Please note that understanding how these directives work and the impact they have on one another is crucial.

For example, what happens if crawler bots find a URL where both an X-Robots-Tag and a meta robots tag are present?

If that URL is blocked from crawling via robots.txt, then any indexing and serving directives on it cannot be discovered and will not be followed.

If directives are to be followed, the URLs containing them cannot be disallowed from crawling.

Check For An X-Robots-Tag

There are a few different methods you can use to check for an X-Robots-Tag on a site.
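One quick method, assuming you have command-line access, is to request only the headers with curl (the URL here is a placeholder):

curl -I https://www.example.com/sample.pdf

Any X-Robots-Tag will be listed in the output along with the rest of the response headers.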

In the browser, the easiest way to check is to install an extension that will show you X-Robots-Tag information for the URL.

Screenshot of Robots Exclusion Checker, December 2022

Another plugin you can use to determine whether an X-Robots-Tag is being used is the Web Developer plugin.

By clicking on the plugin in your browser and navigating to “View Response Headers,” you can see the various HTTP headers being used.

Another method that can be used at scale, to pinpoint issues on websites with a million pages, is Screaming Frog.

After running a site through Screaming Frog, you can navigate to the “X-Robots-Tag” column.

This will show you which sections of the site are using the tag, along with which specific directives.

Screenshot of Screaming Frog Report. X-Robots-Tag, December 2022

Using X-Robots-Tags On Your Site

Understanding and controlling how search engines interact with your website is the cornerstone of search engine optimization. And the X-Robots-Tag is a powerful tool you can use to do exactly that.

Just be aware: It’s not without its risks. It is very easy to make a mistake and deindex your entire site.

That said, if you’re reading this piece, you’re probably not an SEO beginner. So long as you use it wisely, take your time, and check your work, you’ll find the X-Robots-Tag to be a useful addition to your arsenal.