In SEO, understanding canonicalization and being able to set canonical URLs correctly is essential. But what is a canonical URL, why is it important, and how do you create them? This article gives you the lowdown on all that, and more.
When websites get bigger, it’s almost inevitable that there will be duplicate or very similar content that is available via multiple URLs. When multiple URLs can rank for a certain keyword it’s difficult for search engines to know which URL to send traffic to. This conundrum led to the creation of canonical tags.
Canonical tags were originally introduced in February 2009 as a collaborative effort (one of the very few!) from Google, Microsoft and Yahoo! to help webmasters deal with issues related to duplicate content. They allow webmasters to decide on a preferred URL, which is what we call a canonical URL.
While many people use the terms “canonical tag” and “canonical URL” synonymously, this is incorrect.
Canonical tags are the most commonly used way of setting a preferred URL, but they are not the only way. According to Google, a canonical URL is simply “the URL of the page that Google believes best represents a group of duplicate pages on your site”.
It is useful to note that not all websites need canonical URLs equally. Smaller websites with simple content and few pages that fit in multiple categories are less likely to need them than large and complex websites.
Setting Preferred Pages
If you have two identical pages under different URLs and neither is set as the canonical URL, then search engines will simply decide for themselves.
Let’s say you have a post or product that can be found under two different categories and as such, exists in two different URLs, for example:
URL 1: https://exampleshop.com/shirts/red-long-sleeve/
URL 2: https://exampleshop.com/long-sleeves/red-long-sleeve/
By choosing one of these links as a canonical URL, you are telling the search engine which one you want it to show in search results, overriding the automatic choice it may make.
Canonicals can also direct search engines to the original version of an article. This is useful if you’ve written an article for another website but would like to post a copy on yours as well. You could agree with the other site that the copy on your own website carries a canonical pointing to the original on theirs.
Easier Tracking
Canonicals organise multiple URLs under one master URL, making it easier to track the metrics of a specific product or topic. This is especially useful when you need to report on performance to your client.
Less Competition For Ranking
Canonical URLs consolidate the ranking signals of individual duplicate URLs into one master URL, so the duplicates stop competing with each other. Syndicated content that has been posted on other domains is also consolidated into the ranking of your preferred URL.
There are 5 known ways of setting canonical URLs, known as canonicalization signals.
Using The rel=canonical HTML Tag
This is the simplest way to specify a canonical URL. Canonical tags are set in the <head></head> section of the HTML source code of a page. This is how they look:
<link rel="canonical" href="https://exampleshop.com/product-page/" />
Canonical tags can be self-referencing (pointing to the page’s own URL), or they can reference another page’s URL (setting the referenced page as the canonical URL).
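If your pages are generated by your own templates rather than a CMS, the tag can be produced programmatically. The following is a minimal Python sketch, not a definitive implementation; the function name canonical_link_tag is hypothetical and the URL is the example one used above.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Return a <link rel="canonical"> element for an absolute URL."""
    # Escape &, <, > and quotes so the URL is safe inside an attribute value.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


# Place the returned string inside the <head> section of the page.
print(canonical_link_tag("https://exampleshop.com/product-page/"))
```

Passing the page’s own URL gives a self-referencing tag; passing another page’s URL marks that page as the canonical one.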
If you’re using a CMS like WordPress, there is an easier way to do this without messing around with code. In WordPress, all you need to do is install the Yoast SEO plugin, which adds self-referencing canonical tags automatically. You can set custom canonicals via the “advanced” tab on a post or page.
Setting Canonicals In HTTP Headers
Documents like PDFs don’t have a place to put canonical tags because they don’t have a page <head> section. In these situations, you’ll need to use HTTP headers to set canonicals. This can also be done for normal web pages.
Example:
You are hosting a PDF document in your blog subfolders (typeamedia.net/blog/*). This is how your HTTP header might look:
HTTP/1.1 200 OK
Content-Type: application/pdf
Link: <https://typeamedia.net/blog/cananonical-tags/>; rel="canonical"
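How the header gets attached depends on your server or framework, so treat the following as a rough Python sketch that only builds the header name and value. The helper canonical_link_header is a hypothetical name; the URL is the one from the example above.

```python
def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """Build the (name, value) pair for a rel="canonical" Link header."""
    # The target URL goes inside angle brackets, followed by the rel parameter.
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Headers for a PDF response, in the order shown in the example above.
headers = [
    ("Content-Type", "application/pdf"),
    canonical_link_header("https://typeamedia.net/blog/cananonical-tags/"),
]
for name, value in headers:
    print(f"{name}: {value}")
```

Most web frameworks accept a list of name and value pairs like this when you construct a response.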
Using Sitemaps
According to Google, non-canonical pages shouldn’t be included in sitemaps, only canonical ones. This is because Google treats the pages listed in a sitemap as suggested canonicals, although it does not promise to honour them.
In their words:
“We don’t guarantee that we’ll consider the sitemap URLs to be canonical, but it is a simple way of defining canonicals for a large site, and sitemaps are a useful way to tell Google which pages you consider most important on your site.”
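One way to keep non-canonical pages out of a sitemap is to filter them before the file is written. The sketch below assumes you already hold a mapping from each page URL to its canonical URL; the pages dictionary and the build_sitemap function are illustrative names, and the output is a bare-bones sitemap with only loc entries.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls: list[str]) -> str:
    """Return a minimal sitemap XML string listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical data: page URL -> canonical URL for that page.
pages = {
    "https://exampleshop.com/shirts/red-long-sleeve/":
        "https://exampleshop.com/shirts/red-long-sleeve/",
    "https://exampleshop.com/long-sleeves/red-long-sleeve/":
        "https://exampleshop.com/shirts/red-long-sleeve/",
}

# Keep only pages that are their own canonical.
canonical_only = [url for url, canonical in pages.items() if url == canonical]
print(build_sitemap(canonical_only))
```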
Using 301 Redirects
You can use 301 redirects to divert traffic away from duplicate content. So if a page is available at multiple URLs, you should choose one to be the canonical URL and use a 301 redirect on each other page to redirect to the canonical one. You can do the same for HTTP/HTTPS and www/non-www versions of your website.
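In practice these redirects usually live in your web server or CMS configuration, but the decision logic can be sketched in Python. Everything named here is an assumption for illustration: the redirect_for function, the DUPLICATE_PATHS table, and the choice of the HTTPS, non-www host as the preferred version. The sketch also assumes every incoming URL belongs to the same site.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "exampleshop.com"  # assumed preferred host (non-www)

# Hypothetical map of duplicate paths to their canonical path.
DUPLICATE_PATHS = {
    "/long-sleeves/red-long-sleeve/": "/shirts/red-long-sleeve/",
}


def redirect_for(url: str) -> Optional[tuple[int, str]]:
    """Return (301, target) if the URL should redirect, otherwise None."""
    parts = urlsplit(url)
    path = DUPLICATE_PATHS.get(parts.path, parts.path)
    # Force HTTPS and the preferred host; keep query and fragment as they are.
    target = urlunsplit(("https", CANONICAL_HOST, path, parts.query, parts.fragment))
    if target == url:
        return None  # already the canonical URL, serve it normally
    return (301, target)


print(redirect_for("http://www.exampleshop.com/shirts/red-long-sleeve/"))
print(redirect_for("https://exampleshop.com/long-sleeves/red-long-sleeve/"))
print(redirect_for("https://exampleshop.com/shirts/red-long-sleeve/"))
```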
Internal Linking
The links within a website that point from one page to another are also a canonicalization signal, so link internally to your canonical URLs rather than to their duplicates.
Canonicals are really quite simple to implement, but there are 5 golden rules that you should always follow.
Use Absolute URLs
According to Google’s John Mueller, it’s best not to use relative paths in the rel="canonical" link element.
In other words, you should use:
<link rel="canonical" href="https://example.com/sample-page/" />
Instead of:
<link rel="canonical" href="/sample-page/" />
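If your templates only know the relative path, it can be resolved against the page URL before the tag is written. This is a small Python sketch using the standard library; the function names is_absolute and absolute_canonical are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href: str) -> bool:
    """True if the href carries both a scheme and a host."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)


def absolute_canonical(page_url: str, href: str) -> str:
    """Resolve a possibly relative canonical href against the page URL."""
    return urljoin(page_url, href)


print(is_absolute("/sample-page/"))
print(absolute_canonical("https://example.com/some/other-page/", "/sample-page/"))
```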
Always Use Lowercase
Google sometimes sees uppercase and lowercase URLs as two different URLs. So, make sure to force lowercase URLs on your server, and then only use lowercase URLs in canonical tags.
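As a rough illustration of forcing lowercase, the Python sketch below lowercases the scheme, host and path of a URL. It deliberately leaves the query string alone, on the assumption that parameter values may be case-sensitive on your site; lowercase_url is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_url(url: str) -> str:
    """Lowercase the scheme, host and path; leave query and fragment as-is."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.lower(),
        parts.query,
        parts.fragment,
    ))


print(lowercase_url("https://Example.com/Sample-Page/?Ref=ABC"))
```

The same function can be used to build the value for your canonical tag, so the tag and the served URL always agree.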
Use The Correct Domain Version
If you’ve switched to SSL, you need to make sure not to declare any non-SSL URLs in your canonical tags. If you aren’t using HTTPS, then the opposite applies.
Always Use Self Referential Canonical Tags
While it isn’t mandatory, it’s recommended that you use self-referential canonical tags to make it absolutely clear which page is your preferred URL. John Mueller says “I recommend using a self-referential canonical because it really makes it clear to us which page you want to have indexed, or what the URL should be when it is indexed.”
One Canonical Tag Per Page
Google will simply ignore canonical tags if there is more than one on a page, so stick to one.
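A quick way to check a page for stray duplicates is to count its canonical tags. The following Python sketch uses the standard library HTML parser; the class and function names are hypothetical, and it inspects whatever HTML string you hand it rather than fetching a live page.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.hrefs.append(attr_map.get("href"))


def find_canonicals(html: str) -> list:
    collector = CanonicalCollector()
    collector.feed(html)
    return collector.hrefs


page = """
<head>
  <link rel="canonical" href="https://example.com/sample-page/" />
  <link rel="canonical" href="https://example.com/other-page/" />
</head>
"""
found = find_canonicals(page)
if len(found) != 1:
    print(f"Expected one canonical tag, found {len(found)}: {found}")
```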
Now that you understand what a canonical URL is and how to implement one, you should be on your way to optimising your site for better rankings. Canonical URLs are a simple but essential tool in an SEO’s toolbox, and they shouldn’t be neglected.