Not Found (404) vs. Soft 404: Differences and How To Fix Them


“Not found (404)” and “Soft 404” are Google Search Console statuses that may describe some of your unindexed pages. They are named very similarly, and both can profoundly hurt your SEO but have entirely different causes:

  • In the case of “Not found (404)”, Google cannot index your page because the server responds with a 404 HTTP status code, meaning the content doesn’t exist,
  • In the case of “Soft 404”, Google is confused by your page. Even though your server says the page is available, its content seems missing.

The way you approach “Not found (404)” and “Soft 404” pages may be essential for your user experience, crawl budget optimization, and indexing strategy.

Let’s explore their differences further and learn how to address them.

What does the “Not found (404)” status actually mean?

If you see “Not found (404)” in the Page indexing (Index Coverage) report in Google Search Console, it means that:

  • Googlebot communicated with your server to retrieve a given page,
  • The server couldn’t find the requested URL, so it responded with the 404 HTTP response code.

Servers communicate with crawlers and browsers through status codes. Whenever you’re able to view a page without any problems, the server most likely responds to your browser’s request with the 200 status code.

An illustration of how a server communicates with clients.

There are also many status codes referring to possible errors, because of which the server cannot grant you access to a page. The 404 status code is one of them. It means that the page is not available because the server couldn’t find it.

An illustration of how a server communicates with clients.

Google doesn’t index 404 pages because they present no value to users.
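To make the two status codes discussed above more concrete, here is a minimal Python sketch that uses the standard library's `HTTPStatus` enum to print their official reason phrases and to tell apart success responses from client errors. The helper names `describe` and `is_client_error` are my own, chosen only for illustration.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return the status code together with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


def is_client_error(code: int) -> bool:
    """4xx codes signal that the request could not be fulfilled as sent."""
    return 400 <= code < 500


for code in (200, 404):
    print(describe(code), "| client error:", is_client_error(code))
```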

What Causes “Not found (404)” in Google Search Console?

The server may respond with the 404 status code for several reasons:

 

Removing a page

When managing a website, you may accidentally remove a page. If it’s a crucial page with many links pointing to it, removing it may cause a significant traffic loss for your website.

But you may also want to remove your content on purpose.

Here are a couple of reasons why you may want to do that:

  • Optimizing duplicate content that has no value for your business and users, and you don’t want to modify it.
  • Having orphan pages that don’t drive traffic to your website, but you can’t link to them or redirect them.
  • Addressing the out-of-stock product pages that no longer have search demand or backlinks and aren’t returning to your website.
  • Hiding content you unintentionally published on the production site, e.g., during a website migration.

There’s nothing wrong with removing a page that doesn’t bring business value to your website or may harm its SEO.

And if you can’t address your issues in any other way (e.g., by modifying or redirecting your content), feel free to set up the 404 status code.

Changing the URL structure

Your website is constantly changing, so it’s normal for some URL addresses to change with time. 

But remember that if the link pointing to a page is incorrect, the server won’t provide users with the requested content because it can’t find it.

Another case is a typo in a URL, made either when adding links manually or when typing an address directly into the browser.

Such mistakes may concern, e.g.,

  • words with alternate spellings (optimisation vs. optimization) or
  • adding spaces to a URL as they will be replaced by the %20 string (example.com/red-%20car)

The change may seem insignificant from your perspective. However, for search engine bots, even a minor difference in the URL address is interpreted as a different URL.
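As a small illustration of the point above, this Python sketch shows how a stray space and an alternate spelling each produce a URL path that differs from the intended one. The paths are made-up examples, and `urllib.parse.quote` stands in for the percent-encoding a browser or crawler would apply.

```python
from urllib.parse import quote

intended_path = "/red-car"
typed_path = "/red- car"  # a stray space slipped in after the hyphen

# Percent-encoding turns the space into %20, so the path no longer matches.
encoded_path = quote(typed_path)
print(encoded_path)
print(encoded_path == intended_path)

# Alternate spellings are simply different strings, hence different URLs.
print("/seo-optimisation" == "/seo-optimization")
```

Unless the server has a page (or a redirect) for each variant, every mismatched path ends in a 404 response.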

How do 404 errors impact SEO?

Even though having some “Not found (404)” pages is inevitable, leaving them unoptimized may contribute to further issues on your website.

Negative user experience

Most likely, no matter how users entered your “Not found (404)” URL, they weren’t looking for a blank page.

Seeing no content on a target page may create a negative user experience. And how users feel about your website directly influences your conversion rates.

Therefore, you need to ensure your visitors don’t feel lost when encountering 404 pages on your website.

A good practice is to create a custom 404 page that is not only visually attractive but, most of all, informs users:

  • Why they see the “Not found (404)” page, and
  • What further actions they may take on your website, e.g., read your top articles.

By creating a sound 404 page, you can encourage users to stay on your website even though they can’t explore the exact page they want to.

Learn how to create a custom 404 page for your website by reading my colleague’s article.

Wasting your crawl budget

Google doesn’t have infinite resources to crawl everything on the Web.

If bots can freely crawl your “Not found (404)” pages, they may never get to the more valuable pages on your site before your crawl budget is wasted.

If you think that might be your case, go for crawl budget optimization services to unlock your website’s full crawling potential.

Diminishing your traffic potential and ranking signals

If you have many internal and external links pointing to your 404 pages, the accumulated PageRank goes to waste. 

How to troubleshoot “Not found (404)”

First, browse the list of affected pages in the Page indexing (Index Coverage) report to check if they are the consequence of your deliberate decision.

How to find the "Not found (404)" status in Google Search Console.

Also, if you manage a large website, navigating your 404 pages is easier with an SEO crawler like Screaming Frog or WebSite Auditor.

Also make sure your XML sitemap doesn’t include any “Not found (404)” pages. You can filter the affected URLs to ‘All submitted pages’ in the upper left corner of the status page.


Ideally, your sitemap file should only include pages responding with the 200 status code, so you shouldn’t find any URLs on the ‘All submitted pages’ list (in the past, these appeared under the “Submitted URL Not found (404)” status).

Otherwise, it may mean the following things:

  • You don’t want the page to be indexed any longer – you removed a submitted page but didn’t update the sitemap file, or you updated your sitemap, but it contains the error page anyway.

Ensure you update your sitemaps every time you make changes on your side.

And remember that even after you implement your changes, they won’t be picked up immediately. Check your ‘All submitted pages’ report again once Google recrawls your sitemap.

  • You want the page to be indexed – you added your page to the sitemap but then removed the URL by mistake.
  • Your sitemap contains URLs you don’t need indexed. In this case, follow the best practices for creating XML sitemaps for SEO, as listing such URLs may waste your crawl budget.
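If you'd like to check a sitemap yourself rather than wait for the report, the sketch below shows one possible approach in Python: parse the sitemap XML, look up the status of each listed URL, and keep those that don't return 200. The function names, the sample sitemap, and the `example.com` URLs are assumptions made for illustration. The demo uses a fake status table so it runs offline, while `fetch_status` shows how a real lookup might be done with the standard library.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Extract every <loc> value from a sitemap <urlset>."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


def fetch_status(url: str) -> int:
    """Return the HTTP status for a URL. Redirects are followed, so a
    redirected URL reports the status of its final destination."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses arrive as exceptions


def non_200_entries(xml_text: str, get_status: Callable[[str], int] = fetch_status):
    """List (url, status) pairs for sitemap URLs that do not return 200."""
    problems = []
    for url in sitemap_urls(xml_text):
        status = get_status(url)
        if status != 200:
            problems.append((url, status))
    return problems


# Offline demo: a hypothetical sitemap and a fake status table.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/removed-page</loc></url>
</urlset>"""
FAKE_STATUSES = {"https://example.com/": 200, "https://example.com/removed-page": 404}

print(non_200_entries(SAMPLE_SITEMAP, FAKE_STATUSES.get))
```

Some servers don't answer HEAD requests the same way as GET requests, so treat the output as a list of URLs to review rather than a final verdict.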

If you confirm that your “Not found (404)” pages shouldn’t exist and they don’t contribute to other issues, you can ignore the “Not found (404)” status.

However, if that’s not the case for you or you aren’t sure how the “Not found (404)” URLs may affect your website, read on for further steps.

Set up 301 redirects

Consider redirecting your “Not found (404)” page when you:

  • Moved your content to another page that is semantically related,
  • Removed your page, but you have another page on your website that is related, and you want your users to head there,
  • Removed the page that used to deliver traffic or still has search demand for keywords that it targets, and
  • Have many internal and external links pointing to your “Not found (404)” so you can pass a given page’s authority.

In the perfect scenario, after the proper redirect (and after Google recrawls the URL), the “Not found (404)” page will change its status to “Page with redirect” in Google Search Console.

However, remember that you shouldn’t rush to redirect your “Not found (404)” pages to contextually unrelated pages just for the sake of redirecting. Otherwise, it may contribute to other issues on your website, like the “Soft 404” error we’re going to discuss below.

Are you about to redirect your 404 page? Dive into our ultimate guide to redirects to explore the topic further and avoid possible redirect errors.

Monitor internal and external linking

When you’re sure a given page shouldn’t exist and it correctly returns the 404 HTTP status code, ensure it isn’t extensively linked throughout your website or from external resources.

You can replace your internal linking to 404 pages with links to the related pages that respond with the 200 status code.

When it comes to external linking, you may contact the websites linking to you and ask them to update the outdated link. However, I understand that’s not always possible, especially if there are thousands of backlinks pointing to your page.

In this case, make a 301 redirect to an existing page (or consider creating new related content you can redirect to), or set up the 410 HTTP status code.
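The choice between a 301 redirect, a 410, and a plain 404 can be sketched as a simple lookup. This is only an illustration in Python, not how any particular server or CMS implements it: the paths, the `REDIRECTS`, `GONE`, and `LIVE` collections, and the `choose_response` function are all hypothetical.

```python
from typing import Optional

# Hypothetical site data, for illustration only.
LIVE = {"/", "/new-guide"}                # pages that exist and return 200
REDIRECTS = {"/old-guide": "/new-guide"}  # removed pages with a related replacement
GONE = {"/discontinued-product"}          # pages removed permanently, on purpose


def choose_response(path: str) -> tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a requested path."""
    if path in LIVE:
        return 200, None
    target = REDIRECTS.get(path)
    if target in LIVE:  # only redirect to a page that actually exists
        return 301, target
    if path in GONE:
        return 410, None
    return 404, None


for path in ("/new-guide", "/old-guide", "/discontinued-product", "/typo"):
    print(path, "->", choose_response(path))
```

Checking that the redirect target is itself a live page helps avoid chains that end in yet another error page.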

What does the “Soft 404” status actually mean?

Soft 404 doesn’t occur when the server responds with the 404 error. Google labels pages as soft 404s when they meet two conditions:

  • Their content seems to be missing but
  • The server still responds with a 200 status code. 

In other words, Google thinks that a given URL should return the 404 status code despite providing a 200 response. On this basis, it concludes that the page should not be indexed.
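Google doesn't publish how it decides that content "seems to be missing", so the following Python sketch is only a rough stand-in for the two conditions above: a 200 response combined with very little visible text. The 50-word threshold and all names are my assumptions, not Google's rules.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_word_count(html: str) -> int:
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


def looks_like_soft_404(status: int, html: str, min_words: int = 50) -> bool:
    """Heuristic: the server says 200, but the page is nearly empty."""
    return status == 200 and visible_word_count(html) < min_words


empty_category = "<html><body><h1>Shoes</h1><script>var a = 1;</script></body></html>"
print(looks_like_soft_404(200, empty_category))
print(looks_like_soft_404(404, empty_category))
```

A check like this can help you shortlist suspicious URLs in a crawl, but only the Page indexing report tells you what Google actually concluded.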

How Do I Fix “Soft 404” errors in Google Search Console?

You can find your pages affected by the “Soft 404” status in the Page Indexing report. It’s easy to access from the left navigation bar in your Google Search Console.

A screenshot showing where to find soft 404s in Google Search Console.

You can gain more information about those pages by clicking on the status name. It’ll show a graph presenting how the number of affected pages has changed over time and a list of URLs. You can export the list using the button located in the upper right corner.

Screenshots showing how to navigate the report on soft 404s.

According to what John Mueller said on SEO Office Hours in July 2021, Google Search Console reports only those soft 404 pages that are considered as such on mobile. If some desktop pages are labeled as soft 404, but their mobile versions aren’t affected by the problem, you may not be able to see them in GSC. 

To detect desktop soft 404s invisible in the GSC report, your website needs a technical SEO audit.

Let me guide you through the possible causes of the “Soft 404” status and ways to fix them. 

Ensure that non-existing pages return a 404 status code

Many websites provide custom 404 pages that, instead of only reporting the error, help users navigate to the information they need and encourage them to explore the domain. Left unattended, these custom pages sometimes end up returning a 200 HTTP status code instead of 404.

It’s bad for your SEO because empty 200 pages make Google waste its crawl budget. The solution to this problem is to configure your server to return the correct status code for pages that don’t exist (even if they are customized) ‒ 404 Not Found.
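How you configure this depends on your server or framework, so take the following as a minimal sketch rather than a recipe. It uses Python's built-in `http.server` module to show the key idea: the custom error page is sent as the response body, while the status line still says 404. The `PAGES` dictionary, the page markup, and the `build_response` helper are assumptions for illustration.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical content store: path -> HTML body.
PAGES = {"/": b"<h1>Home</h1>"}

# A friendly custom error page that still helps the visitor move on.
NOT_FOUND_PAGE = b"<h1>Page not found</h1><p><a href='/'>Back to the home page</a></p>"


def build_response(path: str) -> tuple[int, bytes]:
    """Pick the status code and body for a requested path."""
    body = PAGES.get(path)
    if body is None:
        return 404, NOT_FOUND_PAGE  # custom page, correct status code
    return 200, body


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


print(build_response("/missing")[0])
```

To try it locally, you could start the server with `HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()` (importing `HTTPServer` from `http.server`) and request a non-existent path. The soft 404 mistake is the equivalent of returning `200, NOT_FOUND_PAGE` in the branch above.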

Crawl budget waste is a key SEO challenge for all large websites. Onely’s crawl budget optimization services can help you understand and eradicate the problem.

Avoid redirecting to irrelevant pages

When faced with a lot of outdated or empty pages, you might be tempted to redirect them all to one universal place, such as your home page. However, this solution is not useful from the perspective of your website visitors. 

When encountering this type of redirect, Google may label it a soft 404. To solve this problem, adhere to stricter rules while creating redirects:

  • Keep your redirects thematically relevant,
  • When you can’t find another page corresponding to the user’s intent, set up a 404 page instead of redirecting.
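One way to spot catch-all redirects before Google does is to count how many old URLs point at the same destination. The Python sketch below assumes you can export your redirects as a source-to-target mapping. The sample data, the function name, and the threshold of three are hypothetical, and a high count is only a prompt for manual review, not proof that the redirects are irrelevant.

```python
from collections import Counter


def catch_all_targets(redirects: dict[str, str], max_sources: int = 3) -> dict[str, int]:
    """Return redirect targets that receive more than `max_sources` old URLs."""
    counts = Counter(redirects.values())
    return {target: n for target, n in counts.items() if n > max_sources}


# Hypothetical redirect map: old path -> new path.
redirect_map = {
    "/blue-shoes": "/",
    "/green-hats": "/",
    "/summer-sale-2019": "/",
    "/old-contact-form": "/",
    "/seo-guide-v1": "/seo-guide",
}

print(catch_all_targets(redirect_map))
```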

Avoid pages with little or no content

A good example of a page with little or no content is an empty directory page on an eCommerce website with products frequently going in and out of stock. Google is likely to classify it as a soft 404.

Thin content pages aren’t helpful for your users and pose threats to your SEO, such as:

  • Wasting your crawl budget,
  • Convincing Google that your whole website lacks quality, which may discourage Google from crawling your website as often,
  • Lowering your rankings after a thin content manual action.

It’s best to prevent the indexing of pages with little or no content by using the noindex meta tag. It’s also a good idea to review your site architecture and consider which product categories don’t fulfill their purpose and aren’t needed.

Be careful of 404-like words

Google’s algorithms aren’t perfect and may misidentify a page if it contains words that usually appear on a typical 404 page. It might happen on, e.g., eCommerce websites when a product page uses terms like:

  • “out of stock,”
  • “product unavailable,” 
  • “we don’t deliver to your location.”

You can try to troubleshoot the “Soft 404” status by deleting those words or using neutral synonyms. 
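To find pages that use such wording, you could scan your page text for the phrases listed above. The Python sketch below does exactly that and nothing more: the phrase list comes from this article, not from Google, whose actual signals aren't public, and the function name is my own.

```python
# Phrases taken from the examples in this article.
PHRASES_404_LIKE = (
    "out of stock",
    "product unavailable",
    "we don't deliver to your location",
)


def find_404_like_phrases(page_text: str) -> list[str]:
    """Return the listed phrases that occur in the page text (case-insensitive)."""
    # Normalize typographic apostrophes so "don’t" and "don't" both match.
    normalized = page_text.lower().replace("\u2019", "'")
    return [phrase for phrase in PHRASES_404_LIKE if phrase in normalized]


print(find_404_like_phrases("Sorry, this item is currently OUT OF STOCK."))
print(find_404_like_phrases("In stock. Ships within 24 hours."))
```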

Fix your rendering issues

Some content may not be visible to Google because it wasn’t able to render it. Such problems frequently occur when your robots.txt file blocks crawlers from accessing CSS or JavaScript files.

You can find out if Google renders your pages correctly by checking them in the URL Inspection Tool. All you have to do is click on the magnifying glass icon next to the selected URL from the “Soft 404” list.

A screenshot showing how to inspect a "Page with redirect" URL in Google Search Console.

To fix the problem, ensure Google has access to the resources necessary for rendering. Review your robots.txt file and make sure crawling of CSS and JavaScript is allowed.
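Before opening the URL Inspection Tool for every page, you can run a quick local check of your robots.txt rules. This Python sketch uses the standard library's `urllib.robotparser` to list which resource URLs a given user agent may not fetch. The robots.txt content and the `example.com` URLs are made up, and note that Python's parser does simple prefix matching without Google's wildcard support, so its verdict can differ from Googlebot's on more complex rules.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt: str, urls: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given robots.txt disallows for the user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that accidentally blocks a JavaScript directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /assets/js/
"""

resources = [
    "https://example.com/assets/js/app.js",
    "https://example.com/assets/css/site.css",
]

print(blocked_resources(ROBOTS_TXT, resources))
```

Any script or stylesheet that shows up in the result is a candidate for an `Allow` rule or for removing the `Disallow` line that catches it.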

Rendering errors can be more complex than just robots.txt chaos. Onely’s rendering SEO services will enable you to understand the source of the problem. Let’s get rid of your rendering troubles once and for all!

Explore other 4xx client error statuses in Google Search Console by reading our articles on:

“Blocked due to access forbidden (403),”

“Blocked due to unauthorized request (401),”  and

“Blocked due to other 4xx issue.”

Key takeaways

  1. No matter the reason behind it, when the server responds with the 404 status code, it means two things: it can’t find your page, and Google won’t be able to index it. 
  2. Meanwhile, the “Soft 404” page returns a 200 status code, but Google is convinced that the 404 error would be more suitable for it.  
  3. If you’re sure that a given page shouldn’t exist, consider setting up a 301 redirect to keep the traffic flow to another page and transfer the accumulated page authority.
  4. Create a custom 404 page to minimize the negative user experience and keep the visitors on your website.
  5. To troubleshoot the “Soft 404” status in Google Search Console, try:
    • Checking if your non-existent pages correctly return the 404 status code,
    • Fixing your irrelevant redirects,
    • Marking your thin content pages with the noindex tag,
    • Deleting words that may be misleading for Google,
    • Getting your rendering SEO in check.

These are effective solutions to the 404 error and the soft 404 issues, but none of them guarantee lasting results. To get rid of problems with indexing and crawl budget, contact Onely for a discovery call.