How to Understand the “Thin Content” Update

As I wrote in my “How to diagnose Penguin 3.0” post, the recent Penguin update confused many webmasters. Google not only rolled out a Penguin refresh, but also rolled out Panda, Pirate, and Thin Content updates at nearly the same time. You cannot make it more complicated than that!

In this article, I would like to focus on the “second bump” visible on Algoroo.com: the Thin Content update of October 25th, 2014.

Before I continue, I just want to mention one more important thing: I decided to write this post mostly because I’ve already seen too many misdiagnosed Penguin 3.0 cases. Focusing only on links, without fixing the issues that caused the “Thin Content” algorithm to hit your website, is a terrible solution that can tank your business.

Dejan Petrovic from DejanSEO asked an interesting question on Google Plus, which confirmed my findings.

At that time, only one customer with a thin content issue had contacted me (it was only 4 – 5 days after webmasters noticed the drop). I assumed (as you can see above) that it was either a Panda update or a Page Layout update. Back then, I wasn’t aware of the scale of this update or whether it almost purely targeted thin content. It was hard to draw conclusions from a small part of one website… until this post about the thin content update written by Dan Petrovic, inspired by Martin Reed’s findings, shed some light on the problem.

To summarize the findings from Martin’s post mentioned above: he found an interesting case of thin content pages showing up in Google Webmaster Tools as soft 404s. Unfortunately, this only happens in some cases. I haven’t seen it yet, even though I have customers where 99% of the website’s content could be classified by Google as thin content.

Diagnosing a “Thin Content” update hit on your website

Apart from looking at the October 25th drop, it is also worth checking a few other factors:

  • Text/HTML ratio
  • Duplicate content
  • Technical issues
  • Crawling issues
  • Redirect loops
  • Invalid sitemaps
  • Many “no content” pages
  • Index bloat
  • No ALT text or descriptive file names for images

Now, the list above may sound a little generic, so let me show you a few examples. In the past few days, I’ve had many clients contact me to ask for help regarding the October 25th drop.

Thin Content update case study

I’ve got a few interesting case studies, but I’d like to share one of the most interesting cases so far.

Authority eCommerce website’s case

Small eCommerce store from the USA, also offering expert-level services and really well known in the community. It has a totally clean link profile with ONLY natural links from the community. Many (10+) years on the market, never dropped in rankings before. Tons of brand searches (driving even more traffic than top keywords).

Main issues:

  • Very thin and technical product descriptions. Only a few different words between product pages.

I believe that Google will only look at the content unique for the page and ignore the “frame”. So if we’ve got duplicated content between product pages, Google will only consider the content that is unique per page.

We use a similar approach to diagnose eCommerce problems during SEO audits.

Let me show you an example of the product page in this store:

All the content on the left is duplicated across all product pages within one category. Pages differ from each other only in the header (full name), the price, and the content on the right-hand side (mostly numbers or technical values).

This is enough for users who know this brand really well to make a purchase. However, for Google, it is almost a definition of a thin content page.
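To make the “frame versus unique content” idea concrete, here is a minimal Python sketch. It is only an illustration of the assumption described above, not a description of how Google actually works. It assumes you have already extracted the visible text blocks for each product page; the function names and the sample URLs are hypothetical.

```python
from collections import Counter


def split_frame_and_unique(pages):
    """Split text blocks into a shared "frame" and per-page unique blocks.

    `pages` maps a URL to the list of text blocks found on that page.
    Assumption for this sketch: a block that appears on every page is frame.
    Compare at least two pages, otherwise everything counts as frame.
    """
    block_counts = Counter()
    for blocks in pages.values():
        # Count each block once per page, even if it repeats on that page.
        block_counts.update(set(blocks))
    frame = {block for block, n in block_counts.items() if n == len(pages)}
    unique = {
        url: [block for block in blocks if block not in frame]
        for url, blocks in pages.items()
    }
    return frame, unique


def unique_word_counts(pages):
    """Return the number of words per page that are not part of the frame."""
    _, unique = split_frame_and_unique(pages)
    return {
        url: sum(len(block.split()) for block in blocks)
        for url, blocks in unique.items()
    }


# Hypothetical sample: two product pages from the same category.
sample_pages = {
    "/product-a": ["Free shipping on all orders", "Widget A", "$10"],
    "/product-b": ["Free shipping on all orders", "Widget B", "$12"],
}
print(unique_word_counts(sample_pages))
```

Pages that end up with only a handful of unique words, as in this sample, are the ones worth rewriting first.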

  • No friendly names/ALT tags for images

Images make up one of the biggest values of this website. Thousands of handmade and WYSIWYG images. Unfortunately, there was no way for Google to know anything about the content of those images.

With no ALT attributes and file names in the DSC_1234.jpg format, it was impossible for Google to recognize what was in each image. Each image was therefore just another number on a page with almost no content.
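A quick way to find such images is to scan the HTML of a page for `<img>` elements that have no ALT text or a camera-style file name. The sketch below uses only the Python standard library. The `GENERIC_NAME` pattern and the `ImageAudit` class are my own illustrative choices, so adjust the pattern to the naming scheme of your own site. Note that an empty ALT is legitimate for purely decorative images, so treat the output as a list to review rather than a list of errors.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed pattern for camera-style names such as DSC_1234 or IMG-0042.
GENERIC_NAME = re.compile(r"^(dsc|img|image|photo|pict)[-_ ]?\d+$", re.IGNORECASE)


class ImageAudit(HTMLParser):
    """Collect <img> elements with missing ALT text or generic file names."""

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (src, [issues]) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        alt = (attributes.get("alt") or "").strip()
        # File name without directory and extension, e.g. "DSC_1234".
        stem = posixpath.splitext(posixpath.basename(urlparse(src).path))[0]
        issues = []
        if not alt:
            issues.append("missing alt")
        if GENERIC_NAME.match(stem):
            issues.append("generic file name")
        if issues:
            self.problems.append((src, issues))


def audit_images(html):
    """Return the list of problematic images found in an HTML string."""
    parser = ImageAudit()
    parser.feed(html)
    parser.close()
    return parser.problems


sample_html = (
    '<img src="/img/DSC_1234.jpg">'
    '<img src="/img/oak-dining-table.jpg" alt="Handmade oak dining table">'
)
print(audit_images(sample_html))
```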

  • Low Text/HTML ratio

There is no magical ratio here, and it is quite difficult for some websites (like mine) to achieve a result higher than ~15%. Still – the higher the %, the better.

In the case of this particular store, the ratio was extremely low (3 – 5%). Therefore, it was clearly a huge issue right there, working like a magnifying glass for Google to find a thin content issue.
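If you want to check the ratio yourself, the sketch below shows one possible way to calculate it in Python: the number of visible text characters divided by the total number of characters in the HTML source. This definition is an assumption on my part. Different tools calculate the ratio in slightly different ways, so expect the numbers to differ a bit from what your crawler reports.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the content of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_html_ratio(html):
    """Return visible text length divided by total HTML length (0.0 to 1.0)."""
    if not html:
        return 0.0
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not inflate the text size.
    text = " ".join(" ".join(parser.chunks).split())
    return len(text) / len(html)


sample_html = "<html><body><script>var x = 1;</script><p>Hello</p></body></html>"
print(f"{text_html_ratio(sample_html):.1%}")
```

Run it on the raw HTML of a few typical product pages. Pages far below the rest of the site are good candidates for a closer look.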

Deep Crawl actually did a great job finding thin content issues.

Other interesting cases

1. Small company website

With only 101 unique pages generating thousands of duplicates and thin content pages, it got hit on October 25th, 2014.

Interestingly, there are no 404 errors in Google Webmaster Tools and no index bloat. Google simply didn’t index the duplicated pages, but the website still got negatively impacted by the thin content update.
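To see how many URLs on a website like this serve the same content, you can group pages by a hash of their normalized text. The Python sketch below is a simple illustration that only catches exact duplicates after lowercasing and whitespace cleanup, not near-duplicates. It assumes you already have the extracted text for each crawled URL; the function name and sample URLs are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_duplicates(page_texts):
    """Group URLs whose normalized text is identical.

    `page_texts` maps a URL to the visible text of that page.
    Returns a list of URL groups, each containing two or more URLs.
    """
    groups = defaultdict(list)
    for url, text in page_texts.items():
        # Normalize case and whitespace so trivial differences are ignored.
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


sample_texts = {
    "/services?id=1": "Our  Services",
    "/services?id=1&sort=asc": "our services",
    "/about": "About us",
}
print(group_duplicates(sample_texts))
```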

2. eCommerce store got hit on October 25th, 2014

I don’t think I need to comment on the Deep Crawl screenshot below. It is just an overview of the main issues to fix. The website tanked totally on October 25th, 2014.

Summary

The Thin Content update seems to be about cleaning up the leftovers after Panda. Fortunately, it is much easier to diagnose and fix, as the metrics targeted by the Thin Content update are easier to measure than user experience.

When can your website recover after a Thin Content update?

The answer is simple: we don’t know exactly. Google never confirmed this update in any way, let alone informed webmasters about a possible recovery.

I can make an educated guess, though I may well be wrong. My guess is that it will work just like all the other on-page and off-page Google algorithms. You need to fix all the issues, make sure that Google re-crawls your whole domain, and then you can recover during the next Thin Content update.

How to fix the issues?

I think it should be quite clear now for most SEOs how to approach this problem and how to prioritize the changes necessary to recover the website.

I can strongly recommend Deep Crawl, as it is really good at simply highlighting thin content pages. Unfortunately, we cannot rely on Deep Crawl alone to recover the website. It also requires an in-depth Google core update review. Still, I believe that this issue is fairly easy to fix compared to, e.g., a Page Layout or Google Panda update.

If you have had any interesting cases, please share and comment below. I would love to hear your experiences with the Thin Content update!