SEO Office Hours – September 17th, 2021

This is a summary of the most interesting questions and answers from the Google SEO Office Hours with John Mueller on September 17, 2021. 

 

Word limits

 

09:05 “What is the limit for the words that we have to use on [category] pages?”

 

John stated that there are no such limits and added that “[…] especially with category pages, you need to have some information on the page so that we understand what the topic is. But that’s generally very little information. And in many cases, we understand that from the products that you have listed anyway if the names of the products are clear enough to us to understand. […] But sometimes product names are a little bit hard to understand, and then it might make sense to add some context there. But it’s usually additional context in the size of maybe one or two or three sentences.”

No changes in Index Coverage metrics

10:57 “I’m working on a large website with a couple of million pages. We did a redirection between two languages. […] So the migration is going well, the traffic is moving from the second path. Just to clarify, it’s a redirection between two subfolders, and everything is within the same domain. The thing I want to raise here is that we are not seeing any change in the Coverage metrics after almost three weeks of having done that redirection. […] No drop in valid pages, no increase in excluded pages because of redirections or anything like that. Should we be concerned about that?”

 

“I don’t think there’s a fixed timeline for those kinds of things to be visible. […] We crawl pages at a different speed across a website. And some pages we will recrawl every day and other pages we will recrawl maybe once a month, or every few months. So if the content in these folders that you’re merging is content that is very rarely crawled, then it’s going to take a long time for that to be reflected. Whereas if it’s content that is very actively being crawled, then you should see changes there within, usually, I don’t know, a week or so.”

The Index Coverage report’s updates 

 

13:25 When merging two subfolders, “[…] the traffic is going from one part of the site to the other. Everything’s going well, and we are also seeing that in the log files. […] But we are not seeing this in the Coverage metrics. We are concerned that this might be either a reporting bug or that we should just keep waiting?”

 

John replied: “[…] The Index Coverage report is updated maybe twice a week. […] A lot of the reports in Search Console are once or twice a week, so that might be something where it’s just a delay. But if you’re seeing the traffic going to the right pages, if you’re looking at the Performance Report and you’re seeing that kind of a shift, then I think that’s perfectly fine. I don’t think you need to watch out for the Index Coverage report for something like this. Because what usually happens when you merge pages like this is our systems first have to figure out that we have to find a new canonical for these pages. So you take two pages, you fold them into one. Then we see this is a set of pages, we have to pick a canonical, and that process takes a little bit of time. And then it takes a bit of time for that to be reflected in the reporting. Whereas if you do something like a clean site move, then we can just transfer everything. We don’t have this canonicalization process that we have to figure out. So I could imagine that it just takes longer to be visible there.”

 

Understanding the quality of the website

 

15:22 “My question is around […] specifically the algorithms with new pages […] when Google comes along and crawls it and looks to understand it. Does it then compare those pages to older legacy pages on the site and say, okay, well, these pages are great, but these much older pages are actually rubbish, so that it will then affect the quality of the newer pages and the category pages? Is that something that the algorithms do […] to really understand the quality of the website?”

 

John said that “[…] When we try to understand the quality of a website overall, it’s just a process that takes a lot of time. And it has, I don’t know, a fairly long lead time there. So if you add five pages to a website that has 10,000 pages already, then we’re going to probably focus on most of the site first, and then over time, we’ll see how that settles down with the new content there as well.”

 

Passing link equity

 

17:06 “If you get a backlink, [Google]’s not just going to assume that is a good quality backlink and then therefore just pass link equity to the site blindly. So does Google, when it’s crawling these backlinks, either look at the referral traffic and play that into the algorithm? Or if it doesn’t see that information, does it try to, I guess, assess whether there’s a high propensity to click on that link? And, so if there is a high propensity to click on that link, then they will pass the link equity? If there isn’t, then say you know, […] you could literally create a blog and link now. In that case, Google says, well, actually, there’s no traffic; there’s not really a lot going on here, so why should we pass any form of link equity? […] Does that kind of feed into whether or not link equity does get passed on to a site?”

 

“I don’t think so. We don’t use things like traffic through a link when trying to evaluate how a link should be valued. As far as I know, we also don’t look at things like the probability that someone will click on a link with regards to how we should value it. Because sometimes links are essentially just references, and it’s not so much that we expect people to click on every link on a page. But if someone is referring to your site and saying, I’m doing this because this expert here said to do that. Then people are not going to click on that link and always […] look at your site and confirm whatever is written there. But they’ll see it as almost like a reference. […] If they needed to find out more information, they could go there. But they don’t need to. And from that point of view, I don’t think we would be taking that into account when it comes to evaluating the value of the link.”

Domain site migration

 

20:53 “We did a site domain migration from one domain to a new one and followed all migration requirements and recommendations. We updated redirects, canonical tags, and we tested in a dev environment beforehand. We added the new property and verified the new domain in Google Search Console. When doing the Change of Address, we get a validation failed error saying that there’s a 301-redirect from the homepage and it couldn’t fetch the old domain. How do we pass the validation for the Change of Address tool?”

 

“First of all, the most important thing to keep in mind is that the Change of Address tool is just one extra signal that we use with regards to migrations. It’s not a requirement. So if, for whatever reason, you can’t get the Change of Address tool to work for your website, if you have the redirect set up properly, all of those things, then you should be set. It’s not something that you absolutely need to do. I imagine most sites […] don’t actually use this tool. It’s something more like those who know about Search Console and those who know about all of these kinds of fancy things, they might be using it. With regards to why it might be failing, it’s really hard to say without knowing your site’s name or the URLs that you’re testing there.

One thing that I have seen is if you have a www version and a non-www version of your website and you redirect step by step through that. So, for example, you redirect to the non-www version, and then you redirect to the new domain. Then it can happen that that throws us off if you submit the Change of Address on the version of the site that is not the primary version. So that might be one thing to double-check with regards to the Change of Address tool: are you submitting the version that is, or that was, currently indexed, or are you maybe submitting the alternate version in Search Console.”
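To double-check which hostname variant actually redirects where, one can trace the redirect chain hop by hop. Below is a minimal Python sketch; the domains are placeholders, and the fake fetcher stands in for real HTTP requests (which you could make with `urllib` with automatic redirects disabled):

```python
# Sketch: trace a redirect chain hop by hop to see which host variant
# actually redirects where. Domains below are placeholders.

def trace_redirects(url, fetch, max_hops=10):
    """Follow Location headers one hop at a time.

    `fetch` returns (status_code, location_or_None) for a URL, so the
    chain logic can be exercised without touching the network.
    """
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in (301, 302, 307, 308) or not location:
            break
        url = location
        chain.append(url)
    return chain

# A fake fetcher standing in for real HTTP requests:
hops = {
    "http://www.old-site.example/": (301, "http://old-site.example/"),
    "http://old-site.example/":     (301, "https://new-site.example/"),
    "https://new-site.example/":    (200, None),
}
chain = trace_redirects("http://www.old-site.example/", lambda u: hops[u])
# The www variant redirects twice here; per John's advice, the Change of
# Address should be submitted on the variant that was actually indexed.
print(" -> ".join(chain))
```

If the chain shows an intermediate hop (www to non-www to new domain), that intermediate property may be the one the tool expects.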

 

Adding multiple Schema types

23:36 “Can I add multiple Schema types to one page? If yes, what would be the best way to combine both FAQ schema and recipe schema on one page?”

 

“You can put as much structured data on your page as you want. But for most cases, when it comes to the rich results that we show in search results, we tend to pick just one kind of structured data or one rich result type and just focus on that. So if you have multiple types of structured data on your page, then there’s a very high chance that we just pick one of these types, and we show that. So if you want any particular type to be shown in the search results and you see that there are no combined uses when you look at the search results otherwise, then I would try to focus on the type that you want, and not just combine it with other things. So ideally, pick one that you really want and focus on that.”
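For reference, two types can technically live on one page in a single JSON-LD block using schema.org's `@graph`. The sketch below builds such a block in Python; the recipe and FAQ values are invented, and, as John notes, Google will likely surface only one rich result type:

```python
import json

# Sketch: two structured-data types on one page, combined in a single
# JSON-LD @graph. Names and values are illustrative only.
page_schema = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Recipe",
            "name": "Basic Sourdough Bread",
            "recipeIngredient": ["500 g flour", "350 g water", "10 g salt"],
        },
        {
            "@type": "FAQPage",
            "mainEntity": [
                {
                    "@type": "Question",
                    "name": "How long does the dough need to rise?",
                    "acceptedAnswer": {
                        "@type": "Answer",
                        "text": "Roughly 4 to 12 hours, depending on temperature.",
                    },
                }
            ],
        },
    ],
}

# This JSON string would go inside a <script type="application/ld+json"> tag.
print(json.dumps(page_schema, indent=2))
```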

 

404s vs. crawlability and indexability

 

24:39 “Our GSC Crawl Stats report is showing a steady increase of 404 pages that are NOT part of our site (they don’t exist in our sitemap, nor are they generated by internal search). They appear to be Google searches that are being appended to our URLs, and Google is trying to crawl them. Under the crawl response breakdown, these 404s make up over 40% of the crawl response. How do we make sure this does not negatively impact our crawlability and indexability?”

 

“First of all, we don’t make up URLs, so it’s not that we would take Google searches and then make up URLs on your website. I guess these are just random links that we found on the web. […] So that’s something that happens all the time. And we find these links, and then we crawl them. We see that they return 404, and then we start ignoring them. So in practice, this is not something that you have to take care of. […] Usually, what happens with these kinds of links is, we try to figure out overall for your website which URLs we need to be crawling and which URLs we need to be crawling at which frequency. And then we take into account after we’ve worked out what we absolutely need to do, what we can do additionally. And in that additional bucket, which is also like a very, I think, graded set of URLs, essentially that would also include […] random links from scraper sites, for example. So if you’re seeing that we’re crawling a lot of URLs on your site that come from these random links, essentially, you can assume that we’ve already finished with the crawling of the things that we care about that we think your site is important for. We just have time and capacity on your server, and we’re just going to try other things as well. So from that point of view, it’s not that these 404s would be causing issues with the crawling of your website. It’s almost more a sign that, well, we have enough capacity for your website. And if you happen to have more content than you actually linked within your website, we would probably crawl and index that too. So essentially, it’s almost like a good sign, and you definitely don’t need to block these by robots.txt, it’s not something that you need to suppress […]”
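To confirm that such 404s really are stray external links rather than something your own templates generate, a quick pass over the server logs helps. Below is a minimal sketch that tallies Googlebot responses by HTTP status, assuming an access log in the common combined log format; the sample lines are made up:

```python
import re
from collections import Counter

# Sketch: tally Googlebot responses by HTTP status from an access log in
# combined log format. The log lines below are invented for illustration.
LOG_LINES = [
    '66.249.66.1 - - [17/Sep/2021:10:00:00 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [17/Sep/2021:10:00:05 +0000] "GET /random-scraped-url HTTP/1.1" 404 310 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [17/Sep/2021:10:00:09 +0000] "GET /another-junk-path HTTP/1.1" 404 310 "-" "Googlebot/2.1"',
]

# Status code sits right after the closing quote of the request line.
status_re = re.compile(r'" (\d{3}) ')

counts = Counter()
for line in LOG_LINES:
    if "Googlebot" in line:
        match = status_re.search(line)
        if match:
            counts[match.group(1)] += 1

total = sum(counts.values())
print({status: f"{n / total:.0%}" for status, n in counts.items()})
```

Checking the requested paths on the 404 lines against your sitemap and internal links shows quickly whether they originate on your own site.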

Blocking traffic from other countries

 

27:34 “We have a service website operating in France only. And we’re having a lot of traffic coming from other countries which have really bad bandwidth, which causes our CWV scores to go down. […] Since we’re not operating outside of France, we don’t have any use for traffic outside of it. Is it recommended to block traffic from other countries?”

 

Here’s what John said: “I would try to avoid blocking traffic from other countries. I think it’s something where ultimately it’s up to you. It’s your website; you can choose what you want to do. […] In this case, one of the things to keep in mind is that we crawl almost all websites from the U.S. So if you’re located in France, and you block all other countries, you would be blocking Googlebot crawling as well. And then, essentially, we would not be able to index any of your content. From that point of view, if you want to block other countries, make sure that, at a minimum, you’re not blocking the country where Googlebot is crawling from. At least, if you care about search.”
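If you do geo-block, Google's documented way to avoid catching Googlebot is reverse-DNS verification: the PTR hostname must end in googlebot.com or google.com, and a forward lookup of that hostname must return the original IP. A Python sketch of that check follows; the request-handler flow at the end is illustrative pseudocode only:

```python
import socket

# Sketch: before geo-blocking a request, allow verified Googlebot through.
# Verification per Google's documentation: reverse DNS, hostname suffix
# check, then a forward lookup that must return the original IP.

def hostname_is_google(hostname):
    """Pure suffix check on the reverse-DNS hostname (testable offline)."""
    return hostname.endswith((".googlebot.com", ".google.com"))

def verify_googlebot(ip):
    """Reverse lookup, suffix check, then forward-confirm the IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
        if not hostname_is_google(hostname):
            return False
        _, _, addresses = socket.gethostbyname_ex(hostname)
        return ip in addresses
    except OSError:
        return False

# Illustrative request-handler flow (pseudocode):
#   if country != "FR" and not verify_googlebot(client_ip):
#       return 403
```

In production you would cache verification results per IP, since DNS lookups on every request are expensive.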

 

Regrouping of pages

 

35:38 “Google has recently re-grouped together >30,000 pages on our site that are noticeably different pages for CWV scores. […] This has brought the average LCP for these pages up to 3.4s, despite the fact that our product pages were averaging 2.5s before the re-grouping. We were working to get the pages below the 2.5s threshold, but our tactics now seem too insignificant to get us to the score we need to hit. Is the grouping set and then a score average taken, or is the score taken and then the grouping set? (This will help us establish whether getting those product pages under 2.5s will help solve the problem or not.)”

 

“[…] We don’t have any clear or exact definition of how we do grouping because that’s something that has to evolve over time a little bit depending on the amount of data that we have for a website. So if we have a lot of data for a lot of different kinds of pages on a website, it’s a lot easier for our systems to say we will do grouping slightly more fine-grained versus as rough as before. Whereas if we don’t have a lot of data, we end up maybe even going to a situation where we take the whole website as one group. So that’s the one thing. The other thing is the data that we collect is based on field data. You see that in Search Console as well, which means it’s not so much that we would take the average of individual pages and just average them by the number of pages. But rather, what would happen in practice is that it’s more of a traffic weighted average in the sense that some pages will have a lot more traffic, and we’ll have more data there. And other pages will have less traffic, and we won’t have as much data there. So that might be something where you’re seeing these kinds of differences. If a lot of people are going to your home page and not so many to individual products, then it might be that the home page is weighted a little bit higher just because we have more data there. So that’s the direction I would go there, and in practice, that means instead of focusing so much on individual pages, I would tend to look at things like your Google Analytics or other analytics that you have to figure out which pages or which page types are getting a lot of traffic. And then, by optimizing those pages, you’re essentially trying to improve the User Experience […], and that’s something that we would try to pick up for the Core Web Vital scoring there. So, essentially, less of averaging across the number of pages and more averaging across the traffic of what people actually see when they come to your website.”
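The difference between a per-page average and a traffic-weighted average is easy to see in a few lines of Python; the pages and numbers below are invented to mirror John's example of a slow, heavily visited home page:

```python
# Sketch: simple vs traffic-weighted average LCP for a page group.
# All figures are invented for illustration.
pages = [
    # (page, LCP seconds, pageviews)
    ("/", 3.8, 50_000),           # slow but heavily visited home page
    ("/product-a", 2.4, 5_000),
    ("/product-b", 2.5, 5_000),
    ("/product-c", 2.3, 5_000),
]

# Average over pages, each page counted once:
simple_avg = sum(lcp for _, lcp, _ in pages) / len(pages)

# Average weighted by how much traffic (field data) each page gets:
weighted_avg = (
    sum(lcp * views for _, lcp, views in pages)
    / sum(views for _, _, views in pages)
)

print(f"simple average LCP:   {simple_avg:.2f}s")
print(f"weighted average LCP: {weighted_avg:.2f}s")
```

With these invented numbers, the simple average stays at 2.75s while the traffic-weighted average is pulled up near 3.5s by the home page, which is the kind of gap the question describes.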

The MUM algorithm

 

42:44 “With the advent of the MUM algorithm, will search results be a response to multiple sources? I mean, that is, when the user searches for a topic, the answers are selected from multiple sources and provided to them as a package? We think that competition in the future will become a kind of interaction between competitors. Together they can meet the needs of a searcher. Sites focus on services that they can provide better. Several competitors can meet the needs of searchers. The user needs portfolio is completed by several sites that compete for the best service, where they are better known.”

 

John replied, “I don’t know; maybe that will happen at some point. We do have a notion of trying to provide a diverse set of options in the search results where if we can tell that there are maybe things like very strong or different opinions on a specific topic. And it’s a question of what opinions are there out there. Then it can make sense to provide something like a diverse set of results that cover different angles for that topic. I don’t think that applies to most queries, but sometimes that’s something that we do try to take into account.” 

 

Site migration 

 

47:11 “What if, when doing a site migration, on the day we pull the trigger, we: block both domains with robots.txt […], do 302 temporary redirects (and in a few days or weeks switch to 301s after Devs are sure nothing broke), and serve a 503 HTTP status sitewide for a day or a few hours as well while Devs check for anything broken?”

 

According to John, “[…] These are all separate situations. And it’s not the case that we would say, well, this is a site move with this variation. But rather if you’re blocking things, if things are broken, then we would, first of all, see that as something that is broken. And if at a later stage we see they’re actually redirects, then we would say, well, now the site is redirecting. And we would treat those as separate states. So if on the day you want to do the site move, something breaks and everything is broken on your server, then we will just see well, everything is broken on the server. We wouldn’t know that your intent was to do a site move because we will just see that everything is broken. So from that point of view, I would treat these as separate states. And of course, try to fix any broken state as quickly as you can, and essentially move to the migration as quickly as you can after that.”

 

Removing old content from a news site

 

49:42 “Is it worth looking at removing/noindexing/disallowing old news on a news website? News like 10+ years old? Does it do something to the quality of the website in general, and does it optimize the crawl budget? It’s a site with 3+ million pages on it. Does it do something to that?”

 

“I don’t think you would get a lot of value out of removing just old news. It’s also not something I would recommend to news websites because sometimes all the information is still useful. From that point of view, I wouldn’t do this for SEO reasons. If the reason you want to remove content or put it into an archive section on your website is for usability or maintenance reasons […], that’s something that you can definitely do. But I wouldn’t just blindly remove old content because it’s old.”

 

English Google SEO office hours from September 17, 2021