Rendering SEO: How Google Digests Your Content


This is a transcript of a conversation between Bartosz Góralewicz, Martin Splitt from Google, and Jason Barnard. They hosted a webinar to discuss Rendering SEO in practical terms. You can watch the webinar recording here, but since it’s packed with so much information, we hope you’ll also find this transcript to be helpful!

Bartosz: Today, we’re gonna look at rendering from Google’s perspective, which is a little different from what we’re seeing in Chrome, and hence Martin is here to navigate us through those murky waters.

And just briefly, in our research – and again, this is not coming from Martin – we started seeing the first mentions of rendering and layout in Google patents around 2011. And my personal theory is that this is why Google Panda, the content quality updates, and all those wonderful things started to happen around that date.

And there are a lot of new findings, this is a fairly new field of how Google looks at both layout and rendering, and this is something we hope to make a little bit more friendly here with Martin. So this is the goal today – to make this as simple and usable, and practical as possible. 

As I mentioned, most of the Google patents that we got to around this topic, and that’s how this actually started, they focus on layout.

The layout seems to be quite important, and we probably know that, like, text appearing above the fold is more important, and there are a lot of patents that will tell you that some elements of the page have a little different role than, e.g., ads or text that’s below the fold. 

So this is one, and for years we were focusing on JavaScript SEO as Jason mentioned, and when going deeper, we realized JavaScript SEO was mostly, “can Google see your content properly, can you change that JavaScript to HTML” and things like that, but when we dove a little bit deeper we saw that this is just the tip of the iceberg.

A lot of the aspects of how Google renders the content and how they see the layout is gonna affect most of the SEO that’s happening after this initial phase of crawling, rendering, and indexing.

So we as an agency left the field of JavaScript SEO a little bit, and we dove into Rendering SEO, which is way more complex, way more exciting. Even so, we are going to try and keep it simple today. Jason, you have to be the guard of that.

There are a lot of things that are quite exciting, we probably won’t get precise answers from Martin, he hates this slide, I’m sorry, Martin.

There are things Google mentioned like they will interrupt scripts, a lot of funky little bits and pieces that are interesting, but just to tell you for all the people who are not as advanced in technical SEO – why layout and rendering SEO matters. Sometimes Google won’t pick up the whole page you posted.

So if you’re seeing your URL is indexed it doesn’t really mean that Google indexed the whole thing. This could be due to rendering, quality, technology so that’s where it gets quite colorful and exciting.

Something I wanna mention today is that there are four shades of your website. Without knowing it, a lot of us cloak our content, because your content now looks different on mobile with JavaScript and without JavaScript. This goes for most websites nowadays – not only those React- or Angular-powered websites, but also WordPress, Wix, maybe Duda as well, and most of those simpler frameworks. We have the same problem on desktop. So there are many different ways you can interact with content, and it’s mostly because of rendering – how the code is going to be rendered on an end device.

Without further ado, let’s begin. We have Martin here so I won’t take too much of your time.

I have the first question just to start it up somehow. So Martin, can rendering SEO help me rank better? I’m assuming this is the first question that’s in everyone’s head.

Is it practical, is it something that will get us traffic, leads, all those funny and cool things? 

Martin: I mean, I usually don’t answer ranking questions, I’ll make an exception here.

Generally speaking – no. But specifically speaking, if there is a problem where something breaks your render and the content doesn’t show up, then Googlebot does not see the content, or does not see it properly, and that may actually hurt you – in the sense that we don’t see the content.

So we might not index the page. Or we might index the page but not rank it for the content that you care about. So yes, in the end, it can make a difference and have an impact – yeah, of course.

Rendering issues on your website may contribute to your URLs getting indexed without content.

Read the article on how to fix the “Page indexed without content” status in your Google Search Console.

Bartosz: One thing I want to do before we dive into the questions from the audience is, where does rendering sit in the whole scenario?

We can quickly get into a chicken-or-egg scenario, but my understanding was always that Google creates a queue and then crawls, renders, and, optionally, indexes the page – would that be an oversimplification?

Martin: That is slightly oversimplifying, but it is fundamentally true. So we get lots of URLs – so many URLs that we can’t crawl them all at the same time, for the obvious reasons. Ok, I shouldn’t say “for the obvious reasons”: we can’t crawl all the URLs all the time, at the same time, for reasons of bandwidth – there’s only so much internet bandwidth that we can use.

If you have an online shop, and you come online tomorrow with a new online shop website and you have a million product URLs, your server might crash if we crawl all these URLs at the same time, so we have to spread this out through time, so there’s a queue in between us discovering the URL and the URL actually being crawled. 

This queue is relatively transparent, in the sense that Search Console shows you when the URL was last crawled, and it’s also transparent in the sense that your server tells you: you can check when the last request to this URL was made from a Googlebot user agent and IP. So that’s a very transparent queue.
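
The server-side check Martin describes – looking for the last Googlebot request in your access logs – can be sketched like this. The log lines and format below are hypothetical (Apache/nginx combined format), and a real check should also verify the requesting IP against Google’s published Googlebot ranges, since anyone can fake the User-Agent string:

```javascript
// Sketch: find the most recent Googlebot request for a URL in
// combined-format access-log lines. The log lines are hypothetical;
// real checks should also verify the IP, not just the User-Agent.
const logLines = [
  '66.249.66.1 - - [10/May/2021:06:25:24 +0000] "GET /products/bike-42 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
  '203.0.113.7 - - [10/May/2021:07:01:02 +0000] "GET /products/bike-42 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
];

function lastGooglebotHit(lines, path) {
  const hits = lines.filter(
    (l) => l.includes('Googlebot') && l.includes(`GET ${path} `)
  );
  if (hits.length === 0) return null;
  // Extract the timestamp between the first pair of square brackets.
  const match = hits[hits.length - 1].match(/\[([^\]]+)\]/);
  return match ? match[1] : null;
}

console.log(lastGooglebotHit(logLines, '/products/bike-42'));
// → 10/May/2021:06:25:24 +0000
```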

What happens then is once we have it crawled, we can look into the HTML that we received, we can look into the HTTP status. If it’s a 404 status then pretty much the processing ends here. If there’s a robots meta tag that says noindex then our work ends here as well. 
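
Those early exits can be sketched as a tiny decision function. This is an illustration only, not Google’s actual logic – real processing also honors things like the X-Robots-Tag HTTP header and other status codes, and the function name is hypothetical:

```javascript
// Illustrative sketch of the early-exit checks described above:
// a 404 status or a robots "noindex" meta tag ends processing
// before the page is ever queued for rendering.
function shouldQueueForRendering(httpStatus, html) {
  if (httpStatus === 404) return false;          // processing ends here
  const noindex = /<meta[^>]+name=["']robots["'][^>]+content=["'][^"']*noindex/i;
  if (noindex.test(html)) return false;          // explicitly excluded
  return true;                                   // continue the pipeline
}

console.log(shouldQueueForRendering(200, '<html><body>Hi</body></html>'));          // → true
console.log(shouldQueueForRendering(200, '<meta name="robots" content="noindex">')); // → false
```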

But if we get a bunch of HTML content and we can go forward with processing it and the rest of the pipeline, we then also queue the page for JavaScript execution, which is what we would call “rendering.” The second queue is very opaque, in the sense that you don’t really see how long it takes us to render, if we render at all, or when we render. You don’t know, and that’s not accidental, because, theoretically, we see this all as a transactional process: the input is the URL that we discovered, and the output is an indexed document or a non-indexed document. That’s pretty much what can happen here.

And there is not that much that you can do about rendering, really, in terms of changing the queue position, or in terms of figuring out what is rendered or what should be rendered. You see that in Search Console as well – you see what comes out of rendering if you look at View Crawled Page; you’ll see what we would see there. So there isn’t an additional queue that you’ve skipped over, and there are a few more complications where the simplified model might not necessarily apply. But you can assume the flow normally is: discovering, crawling, queuing, rendering, indexing, and then potentially ranking later.

Jason: That’s really clear.

Bartosz: Just want to clarify one thing, before this becomes a topic online: you mentioned that you render the page, and you mentioned JavaScript as well, but what I got from our previous conversations is that rendering is not only about JavaScript.

Martin: Oh, yeah, that’s true.

Bartosz: So even if you’ve got a non-JavaScript website – there is not a single line of JavaScript and no reference to external scripts – you should also be concerned with rendering.

Jason: I was about to ask: how many of us are actually concerned by rendering? Because once we drop that conception – the idea that it’s JavaScript only – we’re all concerned…?

Martin: Yeah. All of the websites are being rendered and all of you are to some degree concerned, yes, that’s true, that’s correct.

Jason: Basically, if you don’t have any JavaScript, do you have to worry about this at all?

Martin: You don’t necessarily have to worry about it, but you’re still affected by it. There are still potential implications from rendering. 

Jason: Ok, brilliant.

Martin: That ties back to what Bartosz said earlier, like the text above the fold or where does Google think your main content is and stuff like that.

Jason: Right, yeah. Which is brilliant. I mean, it basically says, a part of rendering is basically understanding what role each chunk of the page plays. Bartosz showed us a screenshot where some of it was not indexed, some of it was ads, some of it was a header, some of it was the footer. Rendering is the point in which Google decides what role each part of the page plays, therefore it can make this decision on whether to index it and whether or not to prioritize it, in terms of what Bartosz was saying – is this the main content?

Bartosz: That’s all correct.

Jason: But basically, how does it decide?

Martin: So maybe we need to talk a little bit about what rendering really is? Because I’m not sure if everybody knows what that means, right? Should we do that?

Bartosz: Sorry, Martin, this is a very good moment for all the people watching to ask all of the questions about every single item you didn’t understand right now. 

Martin: Yeah!

Bartosz: Martin and I have no idea at which point we’re going to lose the audience, but – to be fully transparent about what we’re trying to address – the feedback from our previous conversation was that sometimes we got a little too geeky, too nerdy, so we want to really make this simple.

Jason: Brilliantly said. Define rendering, what is it, what’s the process? 

Martin: Right. Think about HTML as a recipe. You have all the ingredients listed there – a bunch of text, images, stuff – but you don’t really have them in the recipe; the recipe is a piece of paper with all the instructions on how to make the thing.

The resources of the website are the ingredients: the CSS, the JavaScript files, the images, the videos – all of that stuff that you load to actually make the page look the way it looks afterward.

The website you know and see in your browser – that’s the final dish.

Rendering is pretty much the cooking, the preparation process.

Crawling fundamentally just goes into a big book of recipes, takes out a page with the recipe, and puts it within our reach. Basically, we are standing here at a kitchen table waiting for the cooking to begin, and crawling hands us the recipe.

And then rendering is the process where rendering goes, “AHA, interesting! Crawler, over there, can you get me these 10 ingredients?”, and the crawler conveniently goes, “yes I got you these 10 ingredients that you need”, and we start cooking, that’s what rendering is.

So rendering parses the HTML. When HTML comes from crawling, it is fundamentally just a bunch of text – conveniently formatted, but text.

In order to make that into a visual representation which is the website really, we need to render it which means we need to fetch all the resources, we need to fundamentally understand what the text tells us.

E.g., oh, so there’s a header here; ok, then there’s an image there; next to the image there is a bunch of text; and then under the image there is another heading – a smaller, lower-level heading, basically indented in the structure of the content; then there is a video; below the video there’s more text; and in the text there are 3 links going here, here, and here.

And all this assembly process – understanding what everything is and assembling it into a visual representation that you can interact with in your browser window – that is rendering.

And as part of the rendering, at the very first stage – we execute the JavaScript.

Because JavaScript happens to be basically a recipe within the recipe, JavaScript can be like, “now chop those onions”. So you have the raw ingredients which are the onions, but you don’t put the onions as a whole into your dish, you cut them up, and that is what the JavaScript is needed for.

So if I’m not executing the JS and just fetching the resources, I might actually not get the step where I actually need to chop the onions, or crack open an egg and whisk it into something, who knows.

Can you tell that I just made dinner and I’m really hungry right now? I’m so very sorry!

JavaScript execution is just one part of rendering. When the JavaScript execution is finished – or if there’s no JavaScript execution, that is fine too – we assemble these bits and pieces and figure out how they fit on the page. That leads to something called the layout tree, and the layout tree tells us how big things are, where on the page they are, whether they’re visible or not visible, and whether one thing is behind another thing.

This information is important for us as well just as much as executing the JavaScript, because the JavaScript might change, delete, or add content that wasn’t in the initial HTML as it has been delivered by the server.

So, that’s rendering in a nutshell: from “we have some HTML” to “we have potentially a bunch of pixels on the screen.” That’s rendering.

Bartosz: I want to add a few things that Martin won’t say, because he works at Google. Looking at this from a technical SEO side of things, rendering is quite costly – how exactly that affects Google is a great mystery, and this is something we keep researching; we even launched a separate company and a tool to do that. Long story short, rendering is quite expensive – for Google and for mobile devices alike. If you have, say, an Alcatel 1X – an older, cheaper phone – it’s gonna struggle with websites like BBC or The Guardian, which ship a lot of JavaScript. And that is also something Google is gonna struggle with. But long story short, rendering is expensive. And in our experience, some scripts and some websites don’t render optimally for Google, and they end up not being picked up properly for a number of reasons.

And this is where Rendering SEO comes in. This is where you talk to your dev team or tech SEO team with experience in this topic. And you’re telling them, maybe my pages are not being picked up as quickly as I’d like.

What we would quite often see as a sign of issues with both rendering and then indexing is, for example: you have a very dynamic website with a ton of JavaScript, and you see that your content is picked up very slowly. For example, you publish a new listing on a real estate website and you see that Google picks it up after 3 weeks, while your competitors are picked up after a day. That is a scenario where we would sit down and look into what’s happening here – why is Google struggling with this aspect?

What’s your take on this Martin?

Martin: I like the question because it creates an interesting moment here.

Because if you think about it, Google Search has the exact same struggle as a real-world user in this case. Because for a real-world user, even if you are on a modern phone, and a really fast and fantastic and expensive phone as well, more execution also always means more power consumption.

There have been people who called JavaScript the CO2 of the internet. I don’t think that’s completely wrong. That’s a very nice and apt comparison.

Here is Google Search, and here are the users actually using your website – we are in the same boat. The more expensive you make it, the worse the experience is for us. Google Search doesn’t really care; we just invest the resources that we need, and we do a lot of optimizations to make sure we are wasting as little time and energy as possible. But obviously, if you’re optimizing for that, a nice side effect is that your users will probably also be happier, because they need less battery, their old phone will still work fine with what you put out there, and they will be able to consume your web content – and maybe not your competitors’, because your competitors just don’t care and produce something that is less convenient to use on their phones. So this is not something where you pit Google vs UX; it’s the same problem, the same challenge, and we’re all facing it, including Google Search. So that’s a nice one.

Jason: From my perspective, you’re saying: optimize, make it easier for Google – what is the reward that Google gives us? Is it simply that it’s faster for you, therefore, you can do more at the same time?

Martin: Not really, because rendering happens outside – rendering is asynchronous – so it’s not that we wait until rendering happens and then we process; we basically try to do as much in parallel as we can. For instance, if structured data is there in the initial HTML, we may pick it up right after crawling, while rendering is happening, so you get marginal advantages there. You might actually get bigger advantages there.

If you have something where content changes constantly, then making it available as early as possible in the process – and the same goes for canonicals, titles, meta descriptions, and structured data: as early as possible – is good for you in terms of us picking it up more frequently or quicker.
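
One way to sanity-check this – whether the server-delivered HTML already contains those elements before any JavaScript runs – is sketched below. These are naive substring checks for illustration only; a real audit would parse the HTML properly, and the function name and sample HTML are hypothetical:

```javascript
// Quick check: does the raw server-delivered HTML (before any
// JavaScript runs) already contain the metadata listed above?
// Naive regex checks - for illustration, not a production audit.
function earlyMetadata(rawHtml) {
  return {
    title: /<title>/i.test(rawHtml),
    canonical: /rel=["']canonical["']/i.test(rawHtml),
    description: /name=["']description["']/i.test(rawHtml),
    structuredData: /application\/ld\+json/i.test(rawHtml),
  };
}

const rawHtml = `
  <head>
    <title>Dog food</title>
    <link rel="canonical" href="https://example.com/dog-food">
  </head>`;

console.log(earlyMetadata(rawHtml));
// → { title: true, canonical: true, description: false, structuredData: false }
```

If any of these come back false on the raw HTML but true after rendering, that metadata is only available late in the pipeline.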

Jason: Yeah, the earlier it is, the more likely that you will pick it up, the less likely it is that you drop off before you get there.

And you mentioned schema markup and you were talking about the layout tree, and I’m incredibly intrigued because you get this layout tree coming in, schema markup is potentially part of that layout tree, and you basically say this is schema, this is the ads, this is the header, is that correct? Is schema part of that?

Martin: Schema is not part of your layout tree because it is not a visual element. So everything that is visually or potentially visible is part of the layout tree. Schema markup would never be visible.

There is a bunch of trees involved and I don’t want to get everyone confused. Fundamentally what happens is that the first moment we have the text, we break that down into individual elements and create a Document Object Model – that’s the DOM that people refer to. The DOM is basically just the browser’s way of saying: So I have this title, that has text inside, that’s another node in this tree, and then I have this text block here, and inside the text block there’s a link which has text inside the link, then over here is an image and you know, that’s everything that is on the page, and that includes schema markup as well as all the meta tags and the title and everything.

Anything that is invisible – meta information, schema, JavaScript, CSS – is not part of the layout tree. The CSS is parsed into the tree indirectly, by its effects. For good measure, there’s also a CSSOM, which is the exact same thing as the DOM is for HTML, but for CSS. It maps onto the DOM to create the appearance that you have styled in CSS on the DOM elements, on HTML elements. But by itself it would not show up in the layout tree.
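
The DOM-versus-layout-tree distinction can be sketched with a toy model: every element becomes a DOM node, but invisible ones never reach the layout tree. This is a heavy simplification of what a real browser does, and the node structure here is invented purely for illustration:

```javascript
// Toy model: every element is a DOM node, but invisible ones
// (script, style, meta, and display:none elements) are filtered
// out when building the layout tree. Simplified illustration only.
const INVISIBLE_TAGS = new Set(['script', 'style', 'meta', 'title', 'link']);

function toLayoutTree(node) {
  if (INVISIBLE_TAGS.has(node.tag)) return null;
  if (node.style && node.style.display === 'none') return null;
  const children = (node.children || [])
    .map(toLayoutTree)
    .filter((c) => c !== null);
  return { tag: node.tag, children };
}

const dom = {
  tag: 'body',
  children: [
    { tag: 'script' },                          // e.g. JSON-LD schema: in the DOM...
    { tag: 'h1', children: [] },                // ...but only visible elements
    { tag: 'div', style: { display: 'none' } }, // land in the layout tree
    { tag: 'p', children: [] },
  ],
};

console.log(JSON.stringify(toLayoutTree(dom)));
// → {"tag":"body","children":[{"tag":"h1","children":[]},{"tag":"p","children":[]}]}
```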

Jason: Bartosz, you were showing elements that are not indexed. And that’s presumably the layout tree, you’re making a decision before you get to the indexing stage, “We don’t want it, it’s not interesting or too long or too light.” Is that how you understand it Bartosz?

Bartosz: There are a number of reasons why part of the content may not be indexed. It’s not always rendering’s fault – that’s a key thing to keep in mind.

Sometimes it’s partially rendering’s fault. If you look at content that’s visible on desktop but not visible on mobile, it’s because of rendering, but the problem isn’t that Google rendered that in a wrong way, the problem is that you’re not showing the same content for desktop and for mobile, which happens quite often, for example for eCommerce stores.

So there may be a number of reasons. Sometimes – we assume, and this is something Martin would have to either confirm or deny – Google skips an element of the page that somehow is not really important or relevant to the content. So if you have content about dogs – I think this is what we talked about last time, Martin – and below it you have a bunch of bike advertisements as “Similar items,” then quite often those won’t be picked up by Google. And why this happens is probably something Martin has a bit more insight into.

Martin: That’s not rendering, that’s just us analyzing the content. I don’t know what we have publicly said about this but I think I brought it up in one of the podcast episodes – we have a thing called the centerpiece annotation for instance, and there are a few other annotations that we have where we look at the semantic context as well as potentially the layout tree.

But fundamentally we can read that from the content structure in the HTML already and figure it out: “From all the Natural Language Processing we did on this entire text content, it looks like it is primarily about topic A – dog food – and then there is this other thing here which seems to be links to related products, but it’s not really part of the centerpiece, it’s not really the main content here, it seems to be additional stuff; then there is a bunch of boilerplate.” So we figure out that the menu looks pretty much the same on all these pages, and this looks like the menu that we have on all the other pages of this domain, or we’ve seen this before.

We don’t even actually go by domain or, “Oh, this looks like a menu”. We figure out what looks like boilerplate and that gets weighted differently as well.

So if you happen to have content on a page that is not related to the main topic of the rest of the content, we might not give it as much consideration as you think. We still use that information for link discovery and figuring out your site structure and all of that but if your page has 10000 words on dog food and like 2000-3000 words on bikes, then probably this is not good content for bikes. 

Jason: Does Semantic HTML5 help you in any way?

Martin: It does help us but it’s not the only thing that we look for, yes.

Jason: Right, okay. It is a slight help, but it isn’t something major we should rebuild our entire site for, because you can work it out anyway.

Martin: Yes.

Bartosz: So just to close this topic and move forward: sometimes you will also see that part of that content – like related items – quite often relies on JavaScript. Very often on a lot of JavaScript.

So you know, in our experience again, sometimes an element is very heavy and doesn’t really add to the value of the page – like, again, a carousel of similar items. Sometimes it’s technology-driven, and sometimes we assume it’s because it’s a ton of rendering to do with no value in the end.

Do you weigh that sometimes? Do you have – and this is gonna be a very difficult question to answer – an algorithm that’s gonna say: ok, this costs a ton to render, but it doesn’t bring that much value, so we should skip it? Is it something that you can talk about? I’m not sure if this is a question we should avoid.

Martin: It’s not a question we should avoid, it’s an interesting question.

As far as I can tell, no. We are not like, “oh, this is heavy to render, we’ll skip it because it doesn’t add much value” – that would be an interesting heuristic to consider, but I don’t think we do that as of now. What we do, however, is that eventually – and I say “eventually” very deliberately, and I’m being vague on purpose here, because there is no very clear cut-off moment, and if there were (I mean, there is some), it is subject to change, and I don’t want people to focus on something that doesn’t make sense.

There is a moment when we say, “ok, this has been rendering for long enough, I think we’re good.” There are again a bunch of heuristics in place but basically, if a real user would consider, “this is too long”, then we would consider it too long as well.

So yeah.

Bartosz: So I think that, for the audience, one thing to stress quite a bit – this is something we talked about before the webinar – is that you mentioned resources quite a bit. What resources should we worry about to make Google’s life easier?

Martin: Mostly I would worry about JavaScript files to be honest.

Bartosz: No, resources like, server resources. Last time you explained it quite well and I think this would bring value to the audience.

Martin: What did I say back then?

Bartosz: You mentioned that you don’t care that much about RAM, but that you worry about CPU, for example. I don’t remember what you said about storage, but that was also somewhere in the conversation. But from my assumption, CPU is the key element to watch out for right now – so if your website requires a lot of CPU time to render, this is something to look into, something to optimize. Would that be correct?

Martin: Yeah, so now I know where we’re going with the question, thank you very much for helping me out there, Bartosz.

Yes, so as I said originally, in the browser, and in Google Search, rendering is roughly taking the text to pixels on the screen.

The last part is not true for Google Search. The Google Web Rendering Service, which is the part of Google Search that actually renders, does not care about pixels – we are not painting the actual pixels. So if you have something that is very paint-expensive – I’ll be happy to explain that if it causes confusion – you don’t have to worry about it, because we are not using actual GPUs to paint any pixels, so we don’t care about anything paint-related.

Expensive layouts are tricky – and by layouts I specifically mean the layout work that happens on the main thread – because that costs CPU time, and CPU time is what is most precious to us. Storage and memory are not that critical; as far as I can tell, today’s websites are mostly expensive because they hog CPU resources.

Jason: And one question here as well, moving forward with that is: The more you see a specific type of site, does it get easier to render? Or is there no consequence of using a platform like Duda or WordPress? Does that help or is it completely outside of the box?

Martin: It might or might not help. The reason is that, in theory, the nice thing about platforms is that whenever they optimize the actual platform, you get that optimization for free. You don’t have to do anything about it, so that’s nice. If you build your own thing, then you have to do the optimization work yourself, and never ever does some optimization magically fall into your lap so that things just get better. That’s true for pretty much any premade, open-source, or shared, publicly used CMS or platform. On the other hand, because platforms are so broad and general-purpose, they oftentimes carry a lot of dead weight around, just because some websites might be using a certain functionality – and if your website doesn’t, it still carries that weight, and that might not be great.

Jason: Is it important for me as a webmaster or developer to remove the dead weight, even if it’s not doing anything just to get rid of it so it doesn’t get in the way?

Martin: Sorry, can you run that past me again?

Jason: You were talking about the dead weight, if I can remove it is that a phenomenal win for me?

Martin: Yes, but the thing with platforms is that you normally can’t…

Bartosz: So, I think we can also take a tiny step back to the crawl budget, which was mentioned. One thing with rendering that we as SEOs often miss when looking at crawl budget is that it’s not only about the main HTML file: if you have a JavaScript file that’s gonna be processed and is gonna request, say, 10 more files, these, as I understand it, also count against the crawl budget.

So there’s quite a lot of value, as I see it, in reducing the number of requests. Obviously, this is a little more complex than just reducing the number of requests by creating one huge file, but there’s a lot of value in simplifying the code, the number of requests, and the weight – basically, the cost of rendering.

As you mentioned, Martin, some of the websites are real guzzlers. You need a ton of resources to go through those. And as I see it, it’s not only about rendering – it’s also, you know, hundreds of requests, one file changing the other. This must be difficult to render at scale as well.

Martin: The good thing is that the web rendering service uses the crawling infrastructure to make resource fetches so that means that it uses infrastructure specifically built to crawl the web at scale.

So the network part is not the worst. What is tricky, is that in order, so, ok…

There comes another problem, or, another challenge here, which is the challenge of timing and orders of magnitude.

If you are writing a computer program, programs tend to do either one of two things: they are either CPU-bound or they’re IO-bound. What does that mean?

Say I have a program that just does number-crunching: I give it two numbers and it does something with them. Let’s say we’re trying to calculate pi as precisely as we can – we want to get to the trillionth digit of pi.

That is an operation that will just require number-crunching in the CPU. The CPU is built for number-crunching, so it’s going to be really fast at doing that but our program will be what’s called CPU-bound.

This means how fast the program runs and how long the program takes to accomplish its job is bound by how much CPU resources you can throw at it, which is generally more predictable than something that is IO-bound.

What does IO-bound mean?

IO-bound means, if I say “Write me a program that lists all the files on my hard drive or on this CD-ROM or this USB drive.”

This program no longer just flips numbers through the CPU. Now my program has to ask an external piece of hardware – and by external I mean outside of the CPU.

The CPU has to ask the hard drive, CD-ROM drive, pen drive, doesn’t matter, to get the data and the data needs to be put in a place where the CPU can access it then the CPU reads it and makes a decision based on the data it finds.

Basically, it reads the first directory and it finds the first file in the directory, reads it, the file comes back, now it needs to read the second file… And that’s what IO – input-output – bound is. 

The choking factor – the thing that determines how fast something can be done here – is no longer the CPU; it’s the input-output: how long does this take?

The fastest is… if you talk to what is called a CPU cache, the memory that’s usually inside what we would call the CPU, the central processing unit. The second fastest is if you read from local memory, from RAM. The next fastest would be a local SSD drive. And so on and so forth.

Network, unfortunately, is thousands and thousands of times slower than any of these local file accesses and these are a thousand times slower than memory access, and that’s thousands of times slower than the access of the cache in the CPU. 

And be careful, when I say cache, I mean a very specific tiny type of memory chip.

There is another cache, which we are using because we are IO-bound, as you can imagine, we have to fetch the resources. So it’s not about the CPU or executing the JavaScript. 

What takes a lot of time is going to the network and fetching the JavaScript from your server and getting that back, that is always going to be thousands of times slower than having it stored somewhere on a, let’s say, hard drive, on an SSD drive somewhere in our data center.

So, we have a completely different cache, not the cache I mentioned earlier, that’s a CPU internal bit and piece and we don’t care about that.

We have a cache where basically, whenever our crawler infrastructure fetches a resource, we store it on a drive inside our data center. So we are cutting those network fetch times down to a small fraction. And the problem is that if you split up your application across lots of files – let’s say a dozen files, a dozen JavaScript files specifically – then we have to fetch all of these. But we probably only have to fetch them once, and we can cache them.

But what if these files constantly change? Then we can’t cache them. What’s worse, you may think, “Oh, I’ll just put all of it in one big file – that’s better to cache!” No!

Because if you split it up the right amount, let’s say instead of dozens of files, you now have 4 files, and from these 4 files only 2 change on a daily basis or on a weekly basis, then we can use the cache for at least the other files that never change. Good job.

But if you have all of this in one big file that constantly changes, we can never make use of our cache. So we always have to go through the much slower network – that’s a consideration that’s very tricky to navigate.

And I can’t give you hard and fast rules or numbers for it either.

It’s a case-by-case basis, you wanna figure out how much of the stuff really has to change a lot, what other stuff is kind of stable and doesn’t change as much, you wanna separate these two things.
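The trade-off Martin describes can be simulated with a toy cache (the file names and versions below are made up for illustration): fetches are keyed by URL plus content version, so after a deploy, only the files that actually changed pay the slow network cost:

```javascript
// Sketch: a cache keyed by URL + content version.
// Stable files hit the cache; changed files force a slow network fetch.
const cache = new Map();
let networkFetches = 0;

function fetchScript(url, version) {
  const key = `${url}@${version}`;
  if (cache.has(key)) return cache.get(key); // fast: served from cache
  networkFetches++;                          // slow: full network round trip
  const body = `contents of ${key}`;
  cache.set(key, body);
  return body;
}

// First render: everything is a cold fetch.
fetchScript('/vendor.js', 'v1'); // stable framework code
fetchScript('/app.js', 'v1');    // frequently changing app code

// Next render, after a deploy that only touched the app code:
fetchScript('/vendor.js', 'v1'); // cache hit, no network
fetchScript('/app.js', 'v2');    // only this one goes to the network
```

With everything in one big file, every deploy would invalidate the single cached entry, and every render would go back to the network.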

Jason: Does that also mean the network – the slowness of the network – matters if you’re pulling in files from different sites? That also counts?

Martin: Yes.

Jason: Which brings us to a question from Simon Cox, which is aimed at you, Martin: what can we do to help render Google’s own scripts, analytics, and fonts?

Martin: You don’t have to worry about analytics, because as far as I’m aware we are skipping analytics, with fonts, I think we are skipping them as well.

Jason: So we don’t have to worry about them from the rendering perspective.

Martin: No, not from the rendering perspective, let’s stick with that.

Bartosz: Just to add to that: one thing, at least to my knowledge, that you don’t have to worry about is image fetches. That’s why it’s worth having image dimensions in the code – because WRS, the Web Rendering Service, doesn’t request images either.

If you think about that, if you have a lean website of just HTML, CSS, like a lean JavaScript file, this does the trick. In our experience, this can improve rendering, indexing, crawling quite a bit just by cleaning up, a little bit of code-housekeeping, let’s call it that. 

Jason: What you said about images is very interesting, because from a rendering and layout-tree perspective, setting or specifying the size of the image is phenomenally important, and most lazy developers like myself don’t bother doing it.
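Declaring the dimensions in the markup is a small change; a minimal illustration with a placeholder file name and sizes:

```html
<!-- With width and height declared, the space the image occupies in the
     layout can be computed without downloading the image file itself. -->
<img src="/hero.jpg" width="1200" height="630" alt="Product hero shot">
```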

Bartosz: So, if you do that, according to my knowledge, Google doesn’t have to request those files. So that’s quite cool because you also cut the time of those requests, and you make Google’s job a bit easier. Let’s go to other questions.

Jason: Johann had a question: Is the rendering stage mandatory for a page to get indexed or could it be considered for indexing without the rendering stage?

Martin: It could hypothetically happen, but in practice it normally doesn’t.

Jason: So all pages need rendering, with one or two exceptions, but none of us are likely to be affected by those exceptions.

Bartosz: So I’m assuming this comes from those two waves of indexing. So Martin, what’s the status on that right now? It’s complicated as far as I can remember.

Martin: So I joined Google in April 2018, I remember that. Once I finished my onboarding – all of that lovely, cuddly, nice time that you get when you’re onboarded at Google – I sat down at my desk and talked to John, Maria, and Tom. Eventually they were like, “We’re prepping for Google I/O,” and I saw the slide deck, looked at the two waves of indexing, and said to John, “Are you sure we wanna go with that?” And he’s like, “Yes, I think that makes it clearer what happens,” and I’m like, “Yes, okay, I can see that it’s a nice and easy explanation.”

And I know that John wasn’t surprised, but I, at least, was surprised by the conclusions people drew from that.

And I was like, oh God, okay, lesson learned here – that was an oversimplification. It made what happens behind the scenes too simple and implied a bunch of stuff that was not meant to be implied, for instance: “Yeah, things get indexed without the rendering happening, and then the rendering happens afterwards and changes the indexing.”

Jason: But that isn’t the case?

Martin: It can be in certain cases, but it’s very rare. It’s also my fault because I looked at it and I said, okay, that’s good.

Bartosz: To be fair, we saw, and everyone in the community dealing with JavaScript saw, quite often JavaScript would change metadata. Why you’d do that is beyond me.

Anyhow, sometimes you have one title, or one meta description with JavaScript disabled, without rendering, and you have a different one when Google renders.

And we saw websites where different pages were indexed differently. And that’s why many people in the SEO community were like, okay, this page is waiting for the second wave of indexing.

Anyhow, Jason, let’s go with another question.

Jason: Right – the thing you were describing there, people rewriting titles, for example. People do it with Google Tag Manager because they can’t get their hands on their titles, because they don’t have access to them.

That is a phenomenal problem for you guys. I mean, what you’re saying is, from what I understand, sometimes you pick up the original one, sometimes you pick up the one inserted by Google Tag Manager.

Martin: Yeah, yeah.

Bartosz: And this goes beyond just the meta title and description. You will see JavaScript rewriting the links, rewriting the whole structure, breadcrumbs – and with this happening, it must be really difficult to index and crawl that page quickly and efficiently.

Jason: So maybe we can turn that into a more general question: when a page changes over time with JavaScript, even as it’s loading – which is the case with the meta title example I just gave – how much of a problem is that for Google?

Martin: Sorry, can you run that past me again?

Jason: The fact that as the page loads, things change before the user even interacts with it, because you coded things in to override them – because you’re not very good at your job. How much of a problem does that cause?

Martin: Unless you have proof that it really is a problem for you, I wouldn’t worry about it.

Normally it shouldn’t be that much of a problem, because normally the clearly better content and information should definitely be there after rendering, and might or might not be there prior to rendering.

What is tricky is, one, when that’s not the case, and the other thing, coming back to what I said earlier: the earlier you can get us data, the better. And I’ll add to that.

Because when you think about it, if we were having a conversation, and I tell you oh, by the way, the restaurant to the left is terrible and the restaurant on the right is really, really good, but then 10 seconds later I’m like, no, actually, the restaurant on the left is amazing, the restaurant on the right I wouldn’t bother with.

What do you do with that? That’s kind of the same, if I have a title and a canonical at the beginning of the process and then that changes, then which one is the right one? How do we find out?

Bartosz: One thing to remember is, if you’re leaving that decision to crawlers – and I’m not only talking about Google, because that also goes for Twitter, Facebook, Bing, all the other creatures of the web – you create a layer of chaos in your structure.

Because you don’t know which pages are gonna be picked up which way, and even just some of them not being picked up properly – which again, sorry Martin, probably I’m not helping – but we’re seeing a lot of cases where the signals differ on your end: you have one version with HTML and one version with JavaScript – oversimplifying – and then we’re seeing different artifacts when this website is being crawled and rendered and indexed.

And I think this is really a development issue, something we need to address – that’s why we try to talk about these topics with Martin quite a bit, because this is something in the gray area between SEOs and devs.

Because, you know, it’s a very difficult topic right now, is this something that technical SEOs should focus on? Not all companies have the luxury of technical SEOs in-house. Or is it something that the dev team should worry about? I guess the main topic is to look into that.

We have a tool – What Would JavaScript Do? – and if you google the tool, you can see, okay, which elements of the page are being changed with JavaScript. So this is very simple, just do that, look into those elements, just match those two and you’re good. Even if you have to depend on JavaScript.

Martin: To play the devil’s advocate for the developers: if all you have is client-side rendering – and there are situations where you might, for some reason, have to do that – then it’s not super easy to provide something server-side. But putting something in the initial HTML first, and then updating something that is missing or very generic into something more specific and higher-quality with JavaScript, is still better than not doing it at all.

I’m not saying it’s good, I’m not saying it’s optimal and I absolutely 100% agree with Bartosz that you should make it match, but if you really can’t, it is a way of doing things, it’s just more shaky and error-prone than if you can avoid that.

Jason: One question many people are asking is: do authoritative sites get more “rendering” resources from Google, or does it not matter?

Martin: No. You have to have this one meta tag, meta cheese=”” and then your favorite cheese, if it’s the right kind of cheese and it changes weekly, then you get more rendering resources from John personally.

Bartosz: To support Martin’s statement, being 100% serious right now: we’re seeing websites, like home pages of newspapers or of huge eCommerce stores, where the main content isn’t being picked up, and we’re seeing small websites with a similar technology stack indexed properly. So I can confirm we’re also not seeing huge or heavily linked websites getting some kind of benefit.

(…)

Bartosz: In general, something that we talked about with Martin. Rendering is so crucial with all the websites pushing so much JavaScript right now. And I guess, Martin, what would be…

Is there any way we can make rendering more interesting, sexier in a way, as an SEO community?

I think this is something we’ve talked about so many times. We both know that rendering is so important, and for the first time Google is not that black box anymore – we have so much data, Martin is available with all the answers, just… Not too many people seem to care about it.

We launched the Rendering SEO manifesto around June last year. I thought this would change the industry, that it was gonna be this explosion within the industry, but it’s not being picked up. And Martin, is there anything we as SEOs can do to push the envelope on this one?

Martin: That’s again a tricky one because technically, I spoke to the rendering team about this, and they’re like, “We like that rendering is not sexy”, and I’m like “Yeah, but there are people very worried about it and there is a bunch of stuff where you can miss out.”

I would just love people to experiment more. There are a few people in the community that are experimenting a lot, Giacomo Zecchini being one of them, I know that Dave Smart is experimenting a lot. And it’s just really, really cool to see people experimenting and telling me what they’re observing and checking in why what they’re observing is what they’re observing.

I’ll give you a very simple example. Adam Gent was the first one to very publicly point out that they were seeing features supported by Googlebot in JavaScript rendering that weren’t supported before, so he was the one who publicly caught us rolling out the evergreen Googlebot, and that made me very happy. Because a bunch of people just asked when it was coming, and I can’t really say, because we obviously can’t pre-announce something – the launch might shift, or there might be a problem with it. But if someone says, “Hey Martin, we’re seeing this thing. What is going on here?”, then I can say: see, this is what’s happening right now, we’re ramping up the percentage of renders that are using the evergreen Googlebot. And this page you just had there was one that actually saw the new evergreen Googlebot.

I think it was Giacomo Zecchini who caught me on weird behavior with web workers, which is really interesting because I have been talking to the rendering team about this for a long time, and they’re like “No one is using it, get over it!” and now people are starting to use it and I’m like, we need to look into it.

There’s like a lot of interesting, tiny little surprises that normally don’t matter, they don’t have a big impact in terms of ranking or indexing or whatever, but it’s interesting to observe them.

And I would just love more of the geeks out there to join us and just, you know, play around and explore.

Jason: That goes for all aspects of SEO. The more we experiment and explore and share, the more our community learns, but also the more you learn about how we’re approaching this, and how you can help us to help ourselves.

Struggling with rendering issues?

Contact us for Rendering SEO services to address any bottlenecks that affect your search visibility.

Hi! I’m Bartosz, founder and Head of Innovation @ Onely. Thank you for trusting us with your valuable time, and I hope that you found the answers to your questions in this blog post.

In case you are still wondering exactly how to move forward with your organic growth – check out our services page and schedule a free discovery call where we will do all the heavy lifting for you.

Hope to talk to you soon!