
How to make AJAX applications crawlable

If you want to crawl a website with an AJAX application, you will need to use the AJAX crawling feature to allow Lumar to access the links and content on the site.

Note: Google stopped using the AJAX crawling scheme at the end of Q2 2018. If you are relying on the AJAX crawling scheme for Google to crawl and index dynamic content, this is no longer supported. For more information on getting JavaScript websites crawled and indexed, please read the Google Developers documentation.

In this guide we will be focusing on:

  1. What is AJAX?
  2. How does the AJAX crawling scheme work?
  3. How do I configure Lumar to crawl AJAX websites?

 

What is AJAX?

The term AJAX stands for Asynchronous JavaScript and XML. It is a technique developers use to create interactive websites from XML, HTML, CSS, and JavaScript.

AJAX allows developers to update content when an event is triggered, without making the user reload the page. Although an AJAX website can create an excellent experience for users, it can cause serious issues for search engines.
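For example, a page might fetch new content when a link is clicked and inject it into the existing document. A minimal sketch in TypeScript (the endpoint and element ID are illustrative, not from any real site):

    // Fetch new content when an event fires and inject it into the page,
    // without a full reload. The endpoint and element ID are illustrative.
    async function loadResource(resourceId: string): Promise<void> {
      const response = await fetch(`/api/resources/${resourceId}`);
      const html = await response.text();

      // Only part of the page changes; the browser never navigates,
      // so the URL gains at most a # fragment.
      document.querySelector("#content")!.innerHTML = html;
      window.location.hash = `#!${resourceId}`;
    }

A crawler that does not execute JavaScript never sees the fetched content, nor a unique URL pointing to it.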

The main problems with AJAX websites are that:

  • They do not create unique URLs for each page; instead, a page is identified by a # fragment.
  • The on-page content is generated dynamically after the page has loaded.

Both problems stop search engines from crawling and indexing dynamic content on a website. To address these problems, Google came up with the AJAX crawling scheme, which allows search engines to crawl and index these types of websites.
 

How does the AJAX crawling scheme work?

A website that implements the AJAX crawling scheme provides search engine crawlers with an HTML snapshot of a dynamic page. The search engine is served an “ugly URL”, while the user is served the dynamic “clean URL” of a web page.

For example:

  • Clean URL in the browser: https://www.ajaxexample.com/#!hello
  • Ugly URL for crawler: https://www.ajaxexample.com/?_escaped_fragment_=hello
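The mapping between the two forms is mechanical: the #! is swapped for the ?_escaped_fragment_= parameter. As a sketch (ignoring URL-encoding of the fragment value, which the scheme also requires):

    // Derive the ugly URL a crawler requests from a clean hashbang URL.
    // Per the scheme, the fragment value should also be URL-encoded;
    // that step is omitted here for brevity.
    function cleanToUgly(cleanUrl: string): string {
      return cleanUrl.replace("#!", "?_escaped_fragment_=");
    }

    // cleanToUgly("https://www.ajaxexample.com/#!hello")
    //   -> "https://www.ajaxexample.com/?_escaped_fragment_=hello"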

For an overview of how to create HTML snapshots and how the AJAX crawling scheme works, read the official guide by Google.
 

How to configure AJAX crawling in Lumar

When configuring Lumar it is important to make sure that:

  1. The AJAX website supports the AJAX crawling scheme.
  2. The advanced settings in the Lumar project are updated.

 

Supporting the AJAX crawling scheme

Our team recommends following the AJAX crawling scheme instructions to implement it on an AJAX website. It is important to note that there are two ways to indicate the scheme: AJAX websites with a hashbang (#!) in the URL, and AJAX websites without a hashbang (#!) in the URL.

Our team has provided further details below for each setup and how it impacts Lumar.

AJAX websites using hashbang URLs (#!)

For Lumar to crawl an AJAX website with a hashbang in the URL, the following requirements must be met:

  1. The AJAX crawling scheme is indicated on clean URLs using the hashbang (#!).
  2. The site’s server is set up to handle requests for ugly URLs.
  3. The ugly URL returns the HTML snapshot of the page.

If these requirements are not met, then Lumar will be unable to crawl an AJAX website.
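To illustrate requirements 2 and 3, a server might route ugly URLs like the sketch below, written with Express; renderSnapshot is a hypothetical stand-in for however the site produces its HTML snapshots:

    import express from "express";

    const app = express();

    // Hypothetical helper standing in for however the site produces its
    // HTML snapshots (headless browser, server-side templates, etc.).
    async function renderSnapshot(fragment: string): Promise<string> {
      return `<html><body><h1>Snapshot for ${fragment}</h1></body></html>`;
    }

    app.get("/", async (req, res) => {
      const fragment = req.query._escaped_fragment_;
      if (typeof fragment === "string") {
        // A crawler requesting the ugly URL: serve the HTML snapshot.
        res.send(await renderSnapshot(fragment));
      } else {
        // A regular user: serve the normal AJAX application shell.
        res.sendFile("index.html", { root: "./public" });
      }
    });

    app.listen(3000);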

AJAX websites without hashbang URLs (#!)

For Lumar to crawl an AJAX website without a hashbang in the URL, the following requirements must be met:

  1. The AJAX crawling scheme is indicated on clean URLs using the meta fragment tag (<meta name="fragment" content="!">).
  2. The _escaped_fragment_ parameter is appended to the end of clean URLs.
  3. The ugly URL returns the HTML snapshot of the page.

The meta fragment tag and _escaped_fragment_ parameter only need to be included on pages that use AJAX. They do not need to be added to every page of a website, unless all pages use AJAX to load content.
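When a scheme-aware crawler finds the meta fragment tag in a page’s <head>, it re-requests the page with the parameter appended, using ? or & depending on whether the URL already has parameters. A simplified sketch of that crawler-side logic (the tag detection here is naive string matching; real parsers tolerate differences in attribute order and quoting):

    // Build the ugly URL for a clean URL that carries
    // <meta name="fragment" content="!"> in its <head>.
    function uglyUrlFor(cleanUrl: string): string {
      const separator = cleanUrl.includes("?") ? "&" : "?";
      return `${cleanUrl}${separator}_escaped_fragment_=`;
    }

    // Fetch a page; if it opts into the scheme, fetch the snapshot instead.
    async function fetchCrawlableHtml(cleanUrl: string): Promise<string> {
      const html = await (await fetch(cleanUrl)).text();
      if (html.includes('<meta name="fragment" content="!">')) {
        return (await fetch(uglyUrlFor(cleanUrl))).text();
      }
      return html;
    }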
 

Updating Lumar settings to crawl AJAX websites

Once the website is updated to properly indicate the AJAX crawling scheme to search engines, you will need to update the ‘URL Rewriting’ settings in ‘Advanced Settings’.

1. First, uncheck the ‘Strip # Fragments from all URLs’ box. This ensures Lumar keeps the # fragments so that hashbang URLs can be rewritten and crawled.


2. Then create the following three rules in the URL rewriting settings.

Match From                              Match To                    Case Options
#!                                      ?_escaped_fragment_=        No Change
^(?!.*?_escaped_fragment_=)(.*\?.*)     $1&_escaped_fragment_=      No Change
^(?!.*?_escaped_fragment_=)(.*)         $1?_escaped_fragment_=      No Change


3. Hit “Save” at the bottom of the project’s advanced settings to apply the rewrite rules.


4. Run a test crawl to see if the new project settings are working.
 

How does the URL rewrite rule work?

The first rewrite rule will replace #! with ?_escaped_fragment_= in all URLs, for example:

Pretty URL: www.example.com/document#!resource_1
Rewritten URL: www.example.com/document?_escaped_fragment_=resource_1

This will allow our crawler to access the HTML snapshot of the page.

The second rewrite rule will append the escaped fragment onto the end of a URL that contains parameters, for example:

Pretty URL: www.example.com/document/?resource_1
Rewritten URL: www.example.com/document/?resource_1&_escaped_fragment_=

The third rewrite rule will append the escaped fragment onto the end of a URL that does not contain parameters, for example:

Pretty URL: www.example.com/document/resource_1
Rewritten URL: www.example.com/document/resource_1?_escaped_fragment_=

The negative lookahead in the second and third rules also allows for links on the website that already contain ‘?_escaped_fragment_=’: in that case, the parameter is not appended again.
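Taken together, the three rules behave like the sketch below, which tries the same regular expressions in order and applies the first one that matches (a simplification of Lumar’s actual rewrite engine):

    // The three URL rewriting rules from the table above, tried in order.
    // The negative lookahead keeps rules 2 and 3 from touching URLs that
    // already carry the _escaped_fragment_ parameter.
    const rules: Array<[RegExp, string]> = [
      [/#!/, "?_escaped_fragment_="],                                     // rule 1: hashbang URLs
      [/^(?!.*?_escaped_fragment_=)(.*\?.*)$/, "$1&_escaped_fragment_="], // rule 2: URLs with parameters
      [/^(?!.*?_escaped_fragment_=)(.*)$/, "$1?_escaped_fragment_="],     // rule 3: all remaining URLs
    ];

    function rewrite(url: string): string {
      for (const [pattern, replacement] of rules) {
        if (pattern.test(url)) {
          return url.replace(pattern, replacement);
        }
      }
      return url;
    }

    // rewrite("www.example.com/document#!resource_1")
    //   -> "www.example.com/document?_escaped_fragment_=resource_1"
    // rewrite("www.example.com/document/?resource_1")
    //   -> "www.example.com/document/?resource_1&_escaped_fragment_="
    // rewrite("www.example.com/document/resource_1")
    //   -> "www.example.com/document/resource_1?_escaped_fragment_="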
 

Frequently Asked Questions

Why does Google no longer support the AJAX crawling scheme?

Google officially stopped crawling #! URLs in the summer of 2018. The scheme is no longer supported because Googlebot can now render AJAX websites itself using the web rendering service (WRS).

Do any other search engines support the AJAX crawling scheme?

At the time of writing this guide, Bing and Yandex still support the AJAX crawling scheme. Neither of these search engines has announced plans to stop supporting it.

How long will Lumar support the AJAX crawling scheme?

The Lumar team does not have any plans to deprecate this feature. For any updates please follow our blog.

What should I do if Lumar is not crawling our AJAX website?

If Lumar will not crawl your website, even with AJAX crawling enabled, then we recommend checking the following:

  • Ugly URLs are not blocked in the robots.txt
  • Ugly URLs are producing a 200 HTTP status code
  • Ugly URLs include links to other pages on the website and are navigable

We’d also recommend reading our guides on debugging blocked crawls and crawling issues.
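A quick way to spot-check the first two points is a short script that requests each ugly URL directly and reports its status code (the URL list below is illustrative; substitute your own site’s ugly URLs):

    // Spot-check that ugly URLs are reachable and return HTTP 200.
    // Replace the list with your own site's ugly URLs.
    const uglyUrls: string[] = [
      "https://www.ajaxexample.com/?_escaped_fragment_=hello",
    ];

    async function checkUglyUrls(): Promise<void> {
      for (const url of uglyUrls) {
        const response = await fetch(url);
        // Expect 200; a 3xx, 4xx, or 5xx here usually explains a blocked crawl.
        console.log(`${url} -> ${response.status}`);
      }
    }

    checkUglyUrls();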
 

AJAX crawling feedback

If you have any further questions about AJAX crawling then please get in touch.

Adam Gent

Product Manager & SEO Professional

Search Engine Optimization (SEO) professional with over 8 years’ experience in the search marketing industry. I have worked with a range of client campaigns over the years, from small and medium-sized enterprises to FTSE 100 global high-street brands.
