Indexing
In order for web pages to be included within search results, they must be in Google’s index. Search engine indexing is a complex topic and is dependent on a number of different factors. Our SEO Office Hours Notes on indexing cover a range of best practices and compile indexability advice Google has released in their Office Hours sessions to help ensure your website’s important pages are indexed by search engines.
Use Info: Query to See If a URL is Indexed
Use the info: search operator with a URL to check whether a specific page has been indexed.
Content in Iframes May be Indexed on the Embedding Page
Pages embedded within an iframe on another page may be indexed as content on the embedding page, because the iframe content is visible when the page is rendered. To prevent this, you can use the X-Frame-Options header, which stops browsers from embedding a page and which Google will respect.
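As a rough illustration of the X-Frame-Options approach, the sketch below adds the header to a set of response headers. The function name and the plain dict of headers are assumptions made for the example; in practice the header would be set in your web server or framework configuration.

```python
def with_frame_protection(headers: dict[str, str],
                          allow_same_origin: bool = False) -> dict[str, str]:
    """Return a copy of the response headers with X-Frame-Options set.

    DENY forbids all framing; SAMEORIGIN still allows framing by pages
    on the same origin. Both are standard X-Frame-Options values.
    """
    protected = dict(headers)  # copy so the caller's dict is not mutated
    protected["X-Frame-Options"] = "SAMEORIGIN" if allow_same_origin else "DENY"
    return protected


# Hypothetical response headers for a page that should not be embedded.
response_headers = with_frame_protection(
    {"Content-Type": "text/html; charset=utf-8"}
)
```

Use SAMEORIGIN rather than DENY if your own site needs to embed the page.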
Internal Search Pages Should Not Be Indexable
Google recommends that you block internal search pages from being indexed, as they are likely to increase the number of pages indexed for the site and can make crawling and indexing inefficient.
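One possible way to keep internal search pages out of the index is to send a noindex directive in the X-Robots-Tag response header for those URLs. The sketch below assumes internal search lives under a /search path, which is purely a placeholder, and the helper names are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: internal search results are served under these path prefixes.
SEARCH_PATH_PREFIXES = ("/search",)


def is_internal_search(url: str) -> bool:
    """True if the URL's path is an internal search path or sits below one."""
    path = urlsplit(url).path
    return any(
        path == prefix or path.startswith(prefix + "/")
        for prefix in SEARCH_PATH_PREFIXES
    )


def robots_headers(url: str) -> dict[str, str]:
    """Extra response headers to send for this URL."""
    if is_internal_search(url):
        # noindex asks search engines not to include the page in their index.
        return {"X-Robots-Tag": "noindex"}
    return {}
```

For example, robots_headers("https://example.com/search?q=shoes") returns the noindex header, while a normal content URL gets no extra headers.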
Index Status in Search Console is Updated a Few Times a Week
The Index Status data in Search Console is updated 2-3 times a week.
Split up Sitemaps to Identify Pages Indexed by Google
There is no way to get a list of exactly which URLs Google has indexed. To narrow it down, you can split the sitemap up into smaller parts and see how many URLs from each part are indexed. However, you shouldn’t focus on getting high numbers of URLs indexed, but more on the relevance of indexed pages and content.
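A minimal sketch of splitting one URL list into several smaller sitemap files is shown below. The chunk size, file names and example URLs are all assumptions; you might prefer to group URLs by site section or template instead of by fixed-size chunks.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls: list[str], size: int):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls: list[str]) -> str:
    """Render a list of URLs as a sitemap XML document."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n{entries}</urlset>\n'
    )


def split_sitemaps(urls: list[str], size: int) -> dict[str, str]:
    """Map hypothetical file names (sitemap-1.xml, ...) to sitemap XML."""
    return {
        f"sitemap-{i}.xml": build_sitemap(part)
        for i, part in enumerate(chunk(urls, size), start=1)
    }


# Hypothetical URLs, split into parts of two URLs each.
example_urls = [f"https://example.com/page/{n}" for n in range(5)]
files = split_sitemaps(example_urls, size=2)
```

Each resulting file can then be submitted separately, so that indexing can be reviewed per part rather than for the site as a whole.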
Quality Algorithms are Used to Influence Crawling and Indexing Speed
Quality algorithms are used to influence other algorithms such as those which control crawling and indexing speed.
The unavailable_after Meta Tag Tells Google when to Drop URLs from the Index
If you know when a page will expire, you can use the unavailable_after meta tag to tell Google when to remove the URL from the index, without the page having to be recrawled.
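To illustrate, the sketch below generates an unavailable_after robots meta tag for a known expiry time. The helper name is hypothetical, and the example assumes an ISO 8601 timestamp is an acceptable date format; check Google's current documentation for the formats it supports.

```python
from datetime import datetime, timezone


def unavailable_after_tag(expires: datetime) -> str:
    """Build a robots meta tag carrying an unavailable_after directive."""
    if expires.tzinfo is None:
        # A naive datetime would make the expiry moment ambiguous.
        raise ValueError("expires must be timezone-aware")
    # Assumption: the date is written as an ISO 8601 timestamp.
    return f'<meta name="robots" content="unavailable_after: {expires.isoformat()}">'


# Hypothetical expiry: a listing that ends at noon UTC on 31 January 2030.
tag = unavailable_after_tag(datetime(2030, 1, 31, 12, 0, tzinfo=timezone.utc))
```

The resulting tag would be placed in the head of the page that is due to expire.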
The URL Removal Tool Blocks Pages Appearing in SERPs but Doesn’t Prevent Indexing
The URL removal tool in Search Console hides individual pages from appearing in search results but doesn’t stop them being indexed. You can also use the tool to remove all URLs under a shared path. You shouldn’t use the tool for general maintenance, only for something critical that you want removed as quickly as possible.
Soft 404 Pages May Be Indexed then Later Dropped Out of the Index
Google may initially index pages that are later classified as soft 404 pages, and then drop them from the index once it has processed the content.
Sitemap Errors don’t Impact Rankings but can Slow Down Indexing
Sitemaps help Google improve crawling and indexing of sites. If a sitemap can’t be processed properly, Google may take longer to index pages, as it has to rely on normal crawling and indexing to find those pages.
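A basic local check can catch some sitemap problems before submission. The sketch below only tests that the XML is well-formed and contains at least one loc entry; it is an assumption about what a useful pre-check looks like, not a reproduction of how Google processes sitemaps.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol, in ElementTree's {uri}tag form.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locs(xml_text: str) -> list[str]:
    """Return all <loc> values; raises ET.ParseError on malformed XML."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def check_sitemap(xml_text: str) -> tuple[bool, str]:
    """Return (ok, message) for a sitemap document given as text."""
    try:
        locs = sitemap_locs(xml_text)
    except ET.ParseError as exc:
        return False, f"not well-formed XML: {exc}"
    if not locs:
        return False, "no <loc> entries found"
    return True, f"{len(locs)} URLs"
```

A sitemap that fails this check is unlikely to be processed correctly, though passing it does not guarantee that every URL in the file is valid.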