Many websites fail to optimize their duplicate content in an SEO-friendly way. In fact, according to Tomek Rudzki’s research, statuses related to duplicate content are the second most common issue in Google Search Console for websites of all sizes.
A prevailing duplicate content SEO issue is when Google doesn’t agree with users on which page version is the main one. In this case, you might see a “Duplicate, Google chose different canonical than user” status in Google Search Console.
Here’s what Google documentation says about “Duplicate, Google chose different canonical than user”:
This page is marked as canonical for a set of pages, but Google thinks another URL makes a better canonical. Google has indexed the page that we consider canonical rather than this one. We recommend that you explicitly mark this page as a duplicate of the canonical URL. This page was discovered without an explicit crawl request. Inspecting this URL should show the Google-selected canonical URL.
Source: Google
The consequences of Google indexing different content than you intended vary depending on individual cases. The most severe one is discouraging users from visiting or staying on your page by showing them results that, e.g., are missing essential pieces of information which are present on your preferred version.
This article shows the possible causes and solutions for the “Duplicate, Google chose different canonical than user” status.
Where can you find the “Duplicate, Google chose different canonical than user” status?
You can check the status of your page in the Index Coverage report in Google Search Console.
The Index Coverage report includes four groups of issues:
- Error,
- Valid with Warnings,
- Valid,
- Excluded.
“Duplicate, Google chose different canonical than user” belongs to the Excluded category. Excluded URLs are not indexed, and Google doesn’t think it’s a mistake.
You can see a list of URLs reporting “Duplicate, Google chose different canonical than user” after clicking on the status in the Details section.
The list is available for export, but there is a limit of 1,000 URLs. However, if you have more than one sitemap, you can download the report for each sitemap separately, which increases the total number of URLs you can export.
How to check which page Google chose as the canonical one?
The “Duplicate, Google chose different canonical than user” status doesn’t show you which page Google chose. All you can see is that it’s a different one than the page you wanted to be indexed.
To see which page Google chose, you need to navigate to the URL Inspection tool.
After entering the URL you want to check, you will see many different pieces of information, including the Coverage status. You can expand this option to see the Google-selected canonical and the User-declared canonical.
Thanks to the URL Inspection API, you can now bulk check up to 2000 URLs per day using the URL Inspection tool and get the information about the Google-selected canonical in a JSON file.
The added API access is very helpful for anyone who struggles with Google choosing a different canonical than the user-selected one. Without the API, it’s extremely time-consuming to check the Google-selected canonical on a large sample of URLs.
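As a rough sketch of what you can do with that JSON, the Python snippet below compares the two canonical values for a batch of responses. It assumes the responses were already fetched and parsed into dictionaries, and that each one nests `userCanonical` and `googleCanonical` under `inspectionResult.indexStatusResult`, which is how I understand the API's response format. The helper name and the sample URLs are made up for illustration.

```python
def find_canonical_mismatches(responses):
    """Return (inspected_url, user_canonical, google_canonical) for every mismatch.

    `responses` maps each inspected URL to its parsed URL Inspection API response.
    """
    mismatches = []
    for url, response in responses.items():
        status = response.get("inspectionResult", {}).get("indexStatusResult", {})
        user = status.get("userCanonical")
        google = status.get("googleCanonical")
        # Only report URLs where both values are known and they differ.
        if user and google and user != google:
            mismatches.append((url, user, google))
    return mismatches


# Hypothetical sample data standing in for real API responses.
sample = {
    "https://example.com/page?ref=newsletter": {
        "inspectionResult": {
            "indexStatusResult": {
                "userCanonical": "https://example.com/page?ref=newsletter",
                "googleCanonical": "https://example.com/page",
            }
        }
    },
    "https://example.com/about": {
        "inspectionResult": {
            "indexStatusResult": {
                "userCanonical": "https://example.com/about",
                "googleCanonical": "https://example.com/about",
            }
        }
    },
}

for url, user, google in find_canonical_mismatches(sample):
    print(f"{url}: you declared {user}, Google selected {google}")
```

Running a check like this over your daily quota of inspected URLs gives you a short list of pages worth investigating by hand.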
How does Google choose the canonical page?
Before I jump into the methods Google uses to choose the canonical page, let me explain why it’s essential for Google to determine which pages are the original ones:
Firstly, Google’s guidelines state that the search engine “tries hard to index and show pages with distinct information.” That’s why after encountering duplicate content, it chooses the canonical one that it identifies as the most useful to its users. Otherwise, the users would see many different results leading to identical content.
Secondly, according to Google’s documentation, “duplicates are crawled less frequently” than the canonical pages. This allows Google to save its resources for crawling more important pages and reduce the crawling load on your server.
Now, let’s see how Google chooses the canonical page.
We try to pick the canonical URL by following two general guidelines: First, which URL does it look like the site want us to use; so, what’s the site’s preference? And secondly, which URL would be more useful for the user?
Some of the signals Google looks at when determining the canonical version include:
- Canonical tags,
- Sitemaps,
- Internal linking structure,
- HTTPS over HTTP protocol,
- Better-looking URL,
- Redirects.
These factors are hints you can use to help Google understand which page you want to be indexed. However, the search engine is not obligated to respect them.
Canonical tags
<link rel="canonical" href="https://example.com/original-page">
A canonical tag is a piece of HTML code placed in the <head> section. Its href attribute includes a link to the canonical version of a page. If the page in question is a duplicate, non-canonical version of your content, you should place a link to the canonical version in the href attribute.
But you can also add a self-referencing canonical tag. A self-referencing page contains a canonical tag with the href attribute pointing to itself. During Google’s SEO Office Hours, John Mueller recommended using the self-referencing canonical tags, even if there’s only one version of the page.
I recommend doing this self-referential canonical because it really makes it clear to us which page you want to have indexed, or what the URL should be when it is indexed.
Even if you have one page, sometimes there are different variations of the URL that can pull that page up. For example, with parameters in the end, perhaps with upper lower case or www and non-www, and all of these things can be kind of cleaned up with a rel canonical tag.
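To make the idea concrete, here is a minimal Python sketch that reads the canonical tag out of a page's HTML and reports whether it is self-referencing. It uses only the standard library's `html.parser`. The class name, function name, and status labels are my own and purely illustrative, and the comparison is a plain string match, so it deliberately treats URL variations (parameters, letter case, www vs. non-www) as different URLs.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def canonical_status(page_url, html):
    """Describe the canonical tag found in `html` relative to `page_url`."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return "missing"
    if len(finder.canonicals) > 1:
        return "multiple"
    if finder.canonicals[0] == page_url:
        return "self-referencing"
    return "points elsewhere"


html = '<html><head><link rel="canonical" href="https://example.com/original-page"></head></html>'
print(canonical_status("https://example.com/original-page", html))
print(canonical_status("https://example.com/original-page?ref=1", html))
```

The second call shows the case John Mueller describes: the parameterized URL serves the same page, and the canonical tag points back to the clean version.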
Sitemaps
Sitemaps are simple text files listing URLs that you, as a site owner, want to be indexed. They serve as a roadmap for search engine bots, allowing them to find valuable URLs quickly, without crawling the whole website first.
Sitemaps should only include canonical URLs. Putting duplicate pages inside a sitemap might waste your crawl budget (the number of URLs Google can and wants to crawl on your website) and confuse search engines.
However, putting a URL inside a sitemap doesn’t guarantee that search engines will index that URL. It’s just a hint helping them understand which pages you care about the most. In our Ultimate Guide to XML Sitemaps, you can learn more about creating and optimizing your sitemap.
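If you want to check a sitemap against your canonical tags, a small script can do the comparison. The sketch below is a simplified illustration in Python: it assumes a standard XML sitemap using the sitemaps.org namespace and a dictionary, prepared elsewhere, that maps each URL to the canonical URL declared on that page. The function name and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def non_canonical_sitemap_urls(sitemap_xml, canonical_of):
    """List sitemap URLs whose declared canonical is a different URL.

    `canonical_of` maps a URL to the canonical URL declared on that page.
    URLs missing from the mapping are treated as canonical.
    """
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if canonical_of.get(url, url) != url:
            flagged.append(url)
    return flagged


# Hypothetical sitemap and canonical declarations.
sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/bed</loc></url>"
    "<url><loc>https://example.com/bed?color=white</loc></url>"
    "</urlset>"
)
declared = {
    "https://example.com/bed": "https://example.com/bed",
    "https://example.com/bed?color=white": "https://example.com/bed",
}

print(non_canonical_sitemap_urls(sitemap, declared))
```

Any URL the script flags is a candidate for removal from the sitemap.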
Internal linking structure
The way pages are linked together helps search engines find all valuable pages and determine their importance.
The more valuable the page is, the more links should point to it.
Let’s imagine there are two equally valuable pages. One of them is only linked from the sitemap. The other one is easily found in the navigation and has links pointing to it from other pages on the website. In this case, Google assumes that the page with links is more valuable than the one only found in the sitemap.
Internal linking structure is a part of a more complex issue called website architecture. If you want to learn more about it, I recommend you read our extensive guide on site architecture, which explains in detail what it is and how to design a perfect one for your website.
HTTPS over HTTP
HTTP is a protocol defining data transfer between a server and a client. HTTPS is the encrypted version of the protocol. Thanks to the added layer of security, data transmission is safer, and the risk of data manipulation is smaller.
If you have a page accessible in both HTTP and HTTPS versions, Google will generally prefer to index the HTTPS version.
Better-looking URL
URLs help both users and search engines see what a page contains. As a website owner, you have control over what your URLs look like. As John Mueller said, if more than one URL leads to the same page, Google might pick “the nicer-looking ones.”
What exactly does a nicer-looking URL mean? Google says that “A site’s URL structure should be as simple as possible.”
Let’s look at the examples of two URLs:
The second URL is definitely “nicer-looking.” This is because it’s shorter and clearly indicates what this page contains. If you’re interested in learning more about URL structure, I recommend reading our article on How To Create an SEO-Friendly URL.
Redirects
A 301 redirect is one of the ways to consolidate duplicate content on your site. When a user or a search engine bot requests a redirected URL, they are automatically sent to the new one.
You can use it when you want only one version of your page to stay available on your website. For example, if you have a www and a non-www version, you can use a 301 redirect to specify which one should stay available and be indexed.
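Before setting up redirects, it can help to work out which URL each variant should point to. The Python sketch below is one possible way to do that, assuming you want HTTPS everywhere and a single choice between www and non-www. The function name and the `use_www` parameter are illustrative, and the sketch ignores ports and credentials in the URL for simplicity.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url, use_www=False):
    """Map an http/https, www/non-www variant to the single preferred URL."""
    parts = urlsplit(url)
    # hostname is already lowercased by urlsplit; ports are dropped in this sketch.
    host = (parts.hostname or "").lower()
    bare_host = host[4:] if host.startswith("www.") else host
    host = "www." + bare_host if use_www else bare_host
    return urlunsplit(("https", host, parts.path or "/", parts.query, parts.fragment))


variants = [
    "http://example.com/page",
    "http://www.example.com/page",
    "https://www.example.com/page",
    "https://example.com/page",
]

# Every variant that differs from its preferred form needs a 301 redirect.
redirects = {v: preferred_url(v) for v in variants if preferred_url(v) != v}
for source, target in redirects.items():
    print(f"301: {source} -> {target}")
```

The resulting mapping is only a plan: the redirects themselves still have to be configured on your server or CDN.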
Causes and solutions for the “Duplicate, Google chose different canonical than user” status
In some cases, choosing a different canonical URL than the user might not bring consequences. If two pages are identical, the one Google chose might rank just as well as the one you chose.
But chances are, you chose a canonical page for a reason. If the pages are not identical, the one Google chose might be missing some essential details, which can discourage users from visiting your website.
So let’s look at possible causes why Google might disagree with you on the canonical version and ways of fixing the problem.
Google might choose a different canonical page than the user for various reasons, including:
- Inconsistent signals,
- Self-referencing canonical tag with no unique content,
- Rendering issues,
- Targeting different countries with the same/similar language.
Inconsistent signals
As mentioned in the “How does Google choose the canonical page?” chapter, there are multiple signals you can use to indicate which page is the original one. However, if you use them inconsistently, it might confuse Google and cause it to pick the wrong URL to index.
Let’s imagine a situation when you have three duplicate pages:
- All of the pages have canonical tags pointing to page A,
- Page B is in the sitemap,
- Page C has the most internal links pointing to it.
In case of conflicting signals, Google needs to guess which one of the pages is the real canonical one.
The clearer you make your signals, the easier they are to trust :). For example, if internal links, sitemaps, hreflang, rel-canonical, etc all align, there's not much to guess. Often it's quite inconsistent & harder to pick.
— 🦝 John (personal) 🦝 (@JohnMu) February 28, 2018
There is one solution to this cause of the “Duplicate, Google chose different canonical than user” status: be consistent!
Here are a few tips to keep in mind while setting up the canonical signals:
- Avoid putting non-canonical pages or pages with redirects in your sitemap,
- Ensure your internal links are consistent and every link points to the canonical version,
- Point canonical tags to the final version of a page, not to a URL that redirects to a different page,
- Avoid canonical loops (page A has a canonical tag pointing to page B, and page B has a canonical tag pointing to page A), and canonical chains (page A has a canonical tag pointing to page B, and page B has a canonical tag pointing to page C).
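Canonical loops and chains are easy to miss by eye but straightforward to detect once you have the canonical tag of each page. The following Python sketch assumes you have already collected a dictionary mapping each URL to the URL in its canonical tag. The function name, the labels it returns, and the sample URLs are hypothetical.

```python
def classify_canonical(url, canonical_of):
    """Follow canonical tags starting at `url` and describe what is found.

    `canonical_of` maps each URL to the URL in its canonical tag.
    Returns "ok", "chain", or "loop".
    """
    seen = [url]
    current = url
    while True:
        target = canonical_of.get(current, current)
        if target == current:
            break  # reached a self-referencing (or untagged) page
        if target in seen:
            return "loop"
        seen.append(target)
        current = target
    # More than one hop before reaching the final page means a chain.
    hops = len(seen) - 1
    return "ok" if hops <= 1 else "chain"


loop_tags = {
    "https://example.com/a": "https://example.com/b",
    "https://example.com/b": "https://example.com/a",
}
chain_tags = {
    "https://example.com/a": "https://example.com/b",
    "https://example.com/b": "https://example.com/c",
    "https://example.com/c": "https://example.com/c",
}

print(classify_canonical("https://example.com/a", loop_tags))
print(classify_canonical("https://example.com/a", chain_tags))
print(classify_canonical("https://example.com/b", chain_tags))
```

In the chain example, page A is flagged because its canonical tag should point straight to page C, while page B on its own is fine.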
Self-referencing canonical tag with no unique content
If you have multiple pages with self-referencing canonical tags, but Google decides they contain no unique value, it might pick only one page to index.
It usually happens on eCommerce sites when multiple products have the same description.
If you’re selling the same bed model in different sizes, you might want all of the pages with different sizes to be indexed so users can easily find what they are looking for. After all, if they are looking for a king-size bed, and they see only small beds meant for children in search results, they might ignore your page and visit your competition’s website instead.
If someone is searching for a piece of text that is within this duplicated description on your pages, then we would recognize that this piece of text is found on a bunch of pages on your website, and we would try to pick maybe one or two pages from your website to show.
Add unique content to your pages.
Don’t rely only on the self-referencing canonical tags. Instead, ensure that each page has a unique value.
John Mueller addressed the problem of duplicate descriptions during Google’s SEO Office Hours. He stated that you should at least have some additional text information indicating that the products are different.
[…]if you don’t have anything in the textual content at all that covers the visual element of your products, then it makes it very hard for us to actually show these properly in the search results.[…]
So that’s the angle I would take here is it’s fine to have parts of the description duplicated. But I would definitely make sure that you at least have something in there that really has text about the visual elements that are unique to those individual products that you’re selling.
Rendering issues
Rendering is essential for Google and other search engines to see and understand our website’s content and layout. Without rendering, your content doesn’t exist online. We are way past the times when you could see your content by simply looking into the website’s HTML code.
Google might think some pages are duplicates because it cannot render the content that makes them unique.
You can check how Google renders your page in the URL Inspection tool in Google Search Console. The tool provides screenshots of your rendered page that allow you to gain insights into how Google sees your page. If your content is missing on the screenshots, it indicates that there might be some problems with rendering.
If your resources are accessible to Google, you’ll need to evaluate the scripts. You should consider aspects like the size of your script and if you need it all to generate the page.
The topic of Rendering SEO is extensive, and if you don’t have coding experience, you might need the help of your developers to resolve some of the more complex issues. For more information, visit our Rendering SEO manifesto, where we explained the topic in detail.
Targeting different countries with the same/similar language
If you have pages targeting specific countries that speak the same or a similar language (e.g., the USA and the UK), Google might choose only one of them to index.
If a self-referencing canonical tag is the only signal you use to indicate that you’re targeting different countries with the same language, Google might not understand the purpose and treat these pages as duplicates. As a result, it will choose only one of them to index, and your users might see pages dedicated to a different country in their search results.
This can be an especially big problem for eCommerce sites because customers who land on the wrong country’s page may be unable to make a purchase.
You should always make sure that you have hreflang tags in place.
A hreflang tag is a piece of HTML code that helps you specify the language and country the page targets.
<link rel="alternate" hreflang="en-gb" href="https://en-gb.example.com/item">
<link rel="alternate" hreflang="en-us" href="https://en-us.example.com/item">
The hreflang tag allows you to specify not only the language (en – English) but also the country (gb – Great Britain, us – United States).
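If you maintain many country versions, generating these tags from one list helps keep them consistent. The Python sketch below is a minimal illustration: the function name is hypothetical, and it assumes you keep a mapping from each hreflang value to that version's URL. The same full set of tags is meant to go on every version, so each page lists itself along with all of its alternates.

```python
from html import escape


def hreflang_links(alternates):
    """Build the <link rel="alternate"> tags for a set of page versions.

    `alternates` maps an hreflang value (e.g. "en-gb") to that version's URL.
    """
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in sorted(alternates.items())
    ]


versions = {
    "en-us": "https://en-us.example.com/item",
    "en-gb": "https://en-gb.example.com/item",
}

for tag in hreflang_links(versions):
    print(tag)
```

Sorting the entries keeps the output identical on every page, which makes differences between versions easier to spot.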
Another thing you can do is to make sure your content is not only translated but also localized. Even if the language is the same, different countries have different cultures. Make sure to adjust your pages for users from a specific country. Not only does this practice provide a better user experience for your customers, but it might also convince Google these pages are unique.
“Duplicate, Google chose different canonical than user” vs. “Duplicate, submitted URL not selected as canonical” vs. “Duplicate without user-selected canonical”
“Duplicate, Google chose different canonical than user” might be easily confused with two other statuses in the Index Coverage report:
- “Duplicate, submitted URL not selected as canonical,” and
- “Duplicate without user-selected canonical.”
These statuses indicate the same thing: the page is not indexed because Google thinks it’s not canonical.
The difference lies in how Google found out about the page and if the user declared a canonical tag or not.
The main difference between them is that pages reporting “Duplicate, Google chose different canonical than user” have a user-declared canonical tag that Google didn’t respect. In contrast, pages reporting the two other statuses don’t have any canonical tag defined by the user.
Additionally, you explicitly asked for the URL reporting “Duplicate, submitted URL not selected as canonical” to be indexed by submitting it in your sitemap.
If you see a “Duplicate, Google chose different canonical than user” status and you think Google didn’t choose the right page to index, there are a few things you can do to give your preferred page the best chances of being indexed:
- Be consistent in sending canonical signals: ensure only the canonical page is in your sitemap and internal links are pointing to it,
- Ensure each page has a unique value. If your product pages have the same description, add textual content indicating that the products are different,
- Ensure that your content renders correctly in the URL Inspection tool,
- Don’t only translate the content for different languages but also localize it for the specific country you target,
- Always remember to add hreflang tags for content targeting multiple countries.