
How to Identify Thin or Poor-Quality Content for Google Panda


An SEO’s work is never done… just when you thought you were finished with Penguin audits, backlink audits and endless rounds of testing, it’s time to catch up with Panda and make sure no thin or poor-quality content has appeared on your sites.

In October 2015, Google’s Gary Illyes confirmed that Panda 4.2 was still rolling out after two and a half months, and that it was expected to continue rolling out slowly for some time yet.


Why pander to the Panda?

In essence, thin content (or ‘shallow’ content as it’s sometimes called) is lacking in substance and will discourage human engagement with your site.

From Google’s point of view, thin content could mean duplicate or similar content (internally or externally) or pages with a high proportion of navigation/image/dynamic elements and not enough copy.

Panda is also designed to crack down on sites with too many blank pages, ad-stuffing and technical glitches that hinder a user’s experience.

If you suspect that your own or your clients’ sites have indulged in this type of content, you’ll need to find it and get it off Google’s radar sharpish. And, by sharpish, we mean right now. As Glenn Gabe mentioned in his excellent Search Engine Watch Panda audit post, just because you might have recovered recently doesn’t mean that you won’t get hit again in the next update if your site carries on as before.

But, in order to remove it, first you’ll need to find it. As it happens, we know just the tool for the job…


Five steps to Panda perfection with DeepCrawl

Here’s how to use DeepCrawl to optimize any Panda audit:

1. Run a Universal Crawl for the full site

A Universal Crawl will crawl the site, XML Sitemaps and organic landing pages in a single crawl, and import Google Analytics data, to identify gaps in the site and find every URL.
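
To make the idea of "gaps" concrete, here is a minimal Python sketch of comparing the three URL sources with set operations. It is only an illustration, not how DeepCrawl works internally: the function name, the result keys and the example URLs are all hypothetical.

```python
def find_url_gaps(crawled, sitemap, landing_pages):
    """Compare three URL sources and report where they disagree.

    All names here are hypothetical; this is not DeepCrawl's implementation.
    """
    crawled, sitemap, landing_pages = set(crawled), set(sitemap), set(landing_pages)
    return {
        # Every URL known from any source
        "all_urls": crawled | sitemap | landing_pages,
        # Listed in a sitemap or receiving visits, but not reached by the crawl
        "not_found_by_crawl": (sitemap | landing_pages) - crawled,
        # Known from the crawl or from visits, but absent from the sitemap
        "missing_from_sitemap": (crawled | landing_pages) - sitemap,
        # Crawled or listed in the sitemap, but never an organic landing page
        "no_organic_landings": (crawled | sitemap) - landing_pages,
    }


# Hypothetical example data
crawled = {"https://example.com/", "https://example.com/blog/"}
sitemap = {"https://example.com/", "https://example.com/old-offer/"}
landing_pages = {
    "https://example.com/",
    "https://example.com/blog/",
    "https://example.com/promo/",
}

for name, urls in find_url_gaps(crawled, sitemap, landing_pages).items():
    print(name, sorted(urls))
```

URLs that show up in only one source are usually the ones worth a closer look during an audit.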

Make sure that you have Google Analytics integrated for additional engagement data to measure the quality of your pages.


2. Find low-quality sections of the site using Site Explorer

Open the Site Explorer report and set the dropdown to Analytics mode.


The Average Bounce Rate, Time on Site and Page Views per Visit metrics are a great way to identify any low-quality sections of your site.

Any sections which don’t drive any organic visits aren’t adding any SEO value, so consider removing them from Google’s index altogether by noindexing or canonicalizing where appropriate.
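
If you export section-level engagement data, the same checks can be sketched in a few lines of Python. This is a rough illustration only: the `SectionStats` fields, the function name and every threshold below are assumptions chosen for the example, not values recommended by DeepCrawl or Google.

```python
from dataclasses import dataclass


@dataclass
class SectionStats:
    """Engagement data for one site section (hypothetical export format)."""
    path: str
    organic_visits: int
    avg_bounce_rate: float   # fraction between 0.0 and 1.0
    avg_time_on_site: float  # seconds
    pages_per_visit: float


def flag_low_quality(sections, max_bounce=0.8, min_time=20.0, min_pages=1.2):
    """Return (path, reasons) for sections that miss the assumed thresholds."""
    flagged = []
    for s in sections:
        reasons = []
        if s.organic_visits == 0:
            # No organic traffic at all: engagement metrics are meaningless
            reasons.append("no organic visits")
        else:
            if s.avg_bounce_rate > max_bounce:
                reasons.append("high bounce rate")
            if s.avg_time_on_site < min_time:
                reasons.append("low time on site")
            if s.pages_per_visit < min_pages:
                reasons.append("low page views per visit")
        if reasons:
            flagged.append((s.path, reasons))
    return flagged


# Hypothetical example data
sections = [
    SectionStats("/blog/", 1200, 0.45, 95.0, 2.3),
    SectionStats("/tags/", 0, 0.0, 0.0, 0.0),
    SectionStats("/archive/", 40, 0.92, 8.0, 1.05),
]

for path, reasons in flag_low_quality(sections):
    print(path, "->", ", ".join(reasons))
```

Sensible thresholds vary a lot by site and page type, so treat the defaults as placeholders to tune against your own data.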


3. Find thin pages using Site Speed mode

The Content Size option will show you all content that is regarded as ‘thin’ and that could cause a Panda penalty.
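
For a rough feel of what a content-size check involves, the Python sketch below estimates the visible text in a page and flags it when it falls under a threshold. It is an approximation built on the standard library's `html.parser`, not DeepCrawl's actual calculation, and the 512-byte default is an arbitrary assumption for the example.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text outside <script>, <style> and <noscript> elements."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self.chunks.append(text)


def content_size_bytes(html):
    """Approximate the size of a page's visible text in UTF-8 bytes."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).encode("utf-8"))


def is_thin(html, min_bytes=512):
    """Flag a page as thin; min_bytes is an assumed, tunable threshold."""
    return content_size_bytes(html) < min_bytes


page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><nav>Home</nav><p>Hi</p></body></html>"
)
print(content_size_bytes(page), is_thin(page))
```

Note that navigation text still counts towards the total here, so a page that is mostly menus can look healthier than it is. A stricter version would also skip elements such as `<nav>` and `<footer>`.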



4. Find non-indexable pages with the Architecture mode

This will show you the sections that are already non-indexable and won’t be causing you any issues.


5. Make thin content non-indexable on your site

Remove thin or low-quality content from Google’s index to prevent search engine users from being able to land on it from a search result.

Add a noindex directive and/or a canonical tag where appropriate (if you’re unsure which option to choose, see our guide to noindex, disallow and nofollow).

You can then run another crawl to check that the changes you’ve made have affected the site as you intended.
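
For spot-checking a handful of pages, a short Python sketch like the one below can read the two signals from a page's HTML. It is a simplified illustration with hypothetical names: it only looks at the meta robots tag and the canonical link element, ignores `X-Robots-Tag` HTTP headers and robots.txt, and compares the canonical URL to the page URL as an exact string.

```python
from html.parser import HTMLParser


class IndexabilityParser(HTMLParser):
    """Pick up the meta robots noindex directive and the canonical link."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            directives = {d.strip().lower() for d in attr.get("content", "").split(",")}
            # "none" is shorthand for "noindex, nofollow"
            if "noindex" in directives or "none" in directives:
                self.noindex = True
        elif tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href") or None


def check_indexability(html, page_url):
    """Summarise whether a page looks indexable from its HTML alone."""
    parser = IndexabilityParser()
    parser.feed(html)
    parser.close()
    # Naive exact-match comparison; real URLs may need normalising first
    canonicalized_away = parser.canonical is not None and parser.canonical != page_url
    return {
        "noindex": parser.noindex,
        "canonical": parser.canonical,
        "indexable": not parser.noindex and not canonicalized_away,
    }


html = (
    '<html><head><meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://example.com/thin-page"></head></html>'
)
print(check_indexability(html, "https://example.com/thin-page"))
```

A full re-crawl remains the more reliable check, because it also picks up directives sent in headers and pages you did not think to test.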


Ongoing checks: useful DeepCrawl reports

Use these reports to help identify areas of improvement for user experience:

  • Fix broken links with the Validation > Internal Broken Links and Validation > External Broken Links reports.

  • Fix slow pages with the Validation > Max Load Time report.
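
If you work with exported crawl data, the two checks above can be approximated in Python as follows. The record format (`url`, `status`, `load_seconds`), the function name and the three-second threshold are all assumptions made for this sketch, not DeepCrawl's export schema or defaults.

```python
def triage_pages(records, max_load_seconds=3.0):
    """Split crawl records into broken and slow URLs (hypothetical format)."""
    # Treat any 4xx or 5xx response as broken
    broken = [r["url"] for r in records if r["status"] >= 400]
    # Only pages that actually loaded can be judged on speed
    slow = [
        r["url"]
        for r in records
        if r["status"] < 400 and r["load_seconds"] > max_load_seconds
    ]
    return broken, slow


# Hypothetical example data
records = [
    {"url": "https://example.com/", "status": 200, "load_seconds": 0.8},
    {"url": "https://example.com/gone", "status": 404, "load_seconds": 0.2},
    {"url": "https://example.com/heavy", "status": 200, "load_seconds": 5.4},
]

broken, slow = triage_pages(records)
print("Broken:", broken)
print("Slow:", slow)
```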


Tristan Pirouz

Marketing Strategist

Tristan is an SEO enthusiast, strategist, and the former Head of Marketing at Lumar.

