A crawl error occurs when a search engine crawler is unable to reach a page on a website, which prevents the page from appearing in search results. These errors can be site-wide or specific to individual URLs, and they can arise for several reasons. Our SEO Office Hours Notes below cover how Google Search deals with crawl errors, along with Google's best practice guidance for handling them.
Server Security Plugins Can Cause Unauthorised Errors
If Google Search Console is reporting a large number of unauthorised errors, it might be caused by a server configuration which is blocking Googlebot.
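One way to check whether the server is turning Googlebot away is to look for 401 and 403 responses in the access logs. The sketch below is a minimal illustration, not a definitive diagnostic: it assumes logs in the common "combined" format, and the function name `unauthorised_googlebot_hits` is hypothetical. It also matches on the user-agent string only, which can be spoofed.

```python
import re
from collections import Counter

# Assumption: each line follows the "combined" access log format.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def unauthorised_googlebot_hits(log_lines):
    """Count 401/403 responses per path for requests claiming to be Googlebot."""
    hits = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        if "Googlebot" in match["agent"] and match["status"] in {"401", "403"}:
            hits[match["path"]] += 1
    return hits
```

If the same paths return 200 to ordinary browsers, a security plugin or firewall rule that targets crawlers is a likely place to look.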
Crawl Errors Priority Metric Includes a Mixture of Signals
The priority metric for crawl errors in Search Console is based on a mixture of signals: whether the page is returned in search results, whether it is included in Sitemaps, and whether it has internal links. The higher-priority errors are the ones on URLs that Google thinks might have content it wants to index.
404s Are Recrawled Periodically
Google will remember your 404 URLs for a long time and periodically recrawl them to see if they still return a 404. These will be reported in Search Console, but they are perfectly fine.
Redirect Chains Slow Crawling
Redirect chains add latency, which can slow down crawling, particularly when a chain has more than 5 steps, in which case it is rescheduled to be crawled later.
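To find chains worth flattening, you can walk a redirect map and count the steps. The following is a minimal sketch, assuming you already have a mapping of source URL to redirect target (for example, exported from a crawl). The names `follow_redirects` and `needs_flattening` are hypothetical, and the threshold of 5 simply mirrors the figure mentioned above.

```python
from typing import Dict, List


def follow_redirects(start_url: str, redirects: Dict[str, str],
                     hard_limit: int = 20) -> List[str]:
    """Return every URL visited from start_url, following the redirect map.

    Stops at a URL with no redirect, when a loop is detected,
    or after hard_limit steps.
    """
    chain = [start_url]
    seen = {start_url}
    while chain[-1] in redirects and len(chain) - 1 < hard_limit:
        target = redirects[chain[-1]]
        chain.append(target)
        if target in seen:  # redirect loop: stop rather than spin
            break
        seen.add(target)
    return chain


def needs_flattening(chain: List[str], max_steps: int = 5) -> bool:
    """True when the chain has more redirect steps than max_steps."""
    return len(chain) - 1 > max_steps
```

Where a long chain is found, pointing the first URL directly at the final destination removes the intermediate steps.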
Only Disallowed Scripts Which Affect Content Are an Issue
Disallowed scripts that are flagged as errors are only an issue if they affect the display of content you want indexed. Otherwise, it’s OK to leave them disallowed.
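To see which scripts are disallowed for Googlebot, you can test their URLs against your robots.txt rules. This is a rough sketch using Python's standard `urllib.robotparser`. The function name `blocked_resources` and the example URLs are hypothetical, and the standard-library parser does simple prefix matching, so it may not reproduce Google's handling of wildcards exactly.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt: str, resource_urls: Iterable[str],
                      user_agent: str = "Googlebot") -> List[str]:
    """Return the resource URLs that the given robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls
            if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and script URLs, for illustration only.
ROBOTS = """User-agent: *
Disallow: /assets/js/
"""
SCRIPTS = [
    "https://example.com/assets/js/render.js",
    "https://example.com/static/app.js",
]
BLOCKED = blocked_resources(ROBOTS, SCRIPTS)
```

Whether a blocked script matters still needs a manual check: only the ones that affect how indexable content is displayed need to be unblocked.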
Google Queues Large Volumes of New URLs
If Google discovers a part of your site with a large number of new URLs, it may queue the URLs and generate a Search Console error, but it will continue to crawl the queued URLs over an extended period.
Crawl Errors Are Not a Quality Metric
Google considers crawl errors to be a technical response, not a quality metric that would impact rankings.
URL Issues Create Duplicate Pages
Duplicate URLs caused by inconsistent ordering, case inconsistency, and session IDs can be fixed with canonical tags if the issue is minor, but they still create crawling issues if there are many instances.
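To gauge how widespread the problem is, you can normalise a list of crawled URLs and group the ones that collapse to the same form. The sketch below rests on several assumptions: "ordering" is taken to mean query parameter order, the site is assumed to treat paths case-insensitively, and the session parameter names in `SESSION_PARAMS` are hypothetical examples. The function names are illustrative too.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the session parameters in use; adjust for your site.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def normalise_url(url: str) -> str:
    """Lowercase the URL, sort query parameters, and drop session IDs."""
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key.lower() not in SESSION_PARAMS]
    query.sort()
    # Assumption: paths are case-insensitive on this site.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.lower(), urlencode(query), ""))


def group_duplicates(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Map each normalised URL to its variants, keeping only real duplicates."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[normalise_url(url)].append(url)
    return {key: variants for key, variants in groups.items()
            if len(variants) > 1}
```

A handful of groups can reasonably be handled with canonical tags. A large number suggests fixing how the URLs are generated in the first place.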