Deepcrawl is now Lumar.

Ask the Expert: Michelle Robbins Answers Your Click Stream Data Questions


For our last DeepCrawl webinar, we were joined by Michelle Robbins, Vice President of Product & Innovation at AimClear, who sat down with DeepCrawl’s CAB Chairman Jon Myers to explore how to use your analytics and crawl data in order to understand user behaviour and optimise your content for increased conversions.

Our audience submitted so many brilliant questions during the webinar that we have collected Michelle’s answers to them in this post. Read on to find out what she had to say.


What is sizeable enough data for sequence prediction models?

This really depends on traffic. If you’re working with a site that isn’t getting a lot of traffic, it’s going to be difficult to get anything other than anecdotal information when you apply that traffic analysis to a prediction model. It’s also important to note that you’re going to need a lot of data spread over a period of time.

There isn’t really a specific number, but if you’re only getting around 500 unique visits a month, for example, the data is incredibly static and gives a model very little to learn from. You are also going to want to factor in how the data is segmented, which will depend on the type of vertical you are in, as well as seasonality, in order to collect the most accurate representation of how and when people have interacted with your site.
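To make the point about data volume concrete, here is a minimal sketch of one very simple kind of sequence prediction: a first-order model that guesses the next page from the current one. This is an illustration, not the approach Michelle described in the webinar. The session data, the function names and the `min_observations` threshold of 30 are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_transition_counts(sessions):
    """Count how often each page is directly followed by each other page."""
    transitions = defaultdict(Counter)
    for pages in sessions:
        for current_page, next_page in zip(pages, pages[1:]):
            transitions[current_page][next_page] += 1
    return transitions


def predict_next_page(transitions, current_page, min_observations=30):
    """Return (page, probability) for the most likely next page.

    Returns None when the current page has been seen fewer than
    min_observations times, because a prediction built on a handful
    of visits is only anecdotal. The default of 30 is an arbitrary
    placeholder, not a recommended value.
    """
    counts = transitions.get(current_page)
    if not counts:
        return None
    total = sum(counts.values())
    if total < min_observations:
        return None
    page, hits = counts.most_common(1)[0]
    return page, hits / total


# Hypothetical sessions, each an ordered list of page paths.
sessions = [
    ["/", "/pricing", "/signup"],
    ["/", "/pricing", "/signup"],
    ["/", "/blog", "/pricing"],
    ["/blog", "/pricing", "/contact"],
]
transitions = build_transition_counts(sessions)
```

With only four sessions, `predict_next_page(transitions, "/")` returns `None` under the default threshold, which is the low-traffic problem in miniature. Lowering the threshold produces an answer, but not one you should trust.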


Do you recommend segmenting efforts for a site with over 10 million pages?

You would have to, as you would want to understand how people are arriving on the site and where they’re going within specific categories. This will help you to get meaningful data, because users are going to take different paths depending on what they are looking for. It will also depend on your site’s architecture.
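One simple way to segment, assuming the site’s architecture is reflected in its URL structure, is to group sessions by the top-level directory of the landing page. The sketch below shows that idea only. The function names and example URLs are hypothetical, and a site whose categories are not visible in the URL would need a different mapping.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def site_section(url):
    """Return the first path segment of a URL, or '/' for the home page."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return "/" + segments[0] if segments else "/"


def group_sessions_by_landing_section(sessions):
    """Group sessions by the section of their first (landing) page."""
    groups = defaultdict(list)
    for pages in sessions:
        if pages:  # skip empty sessions
            groups[site_section(pages[0])].append(pages)
    return dict(groups)


# Hypothetical sessions, each an ordered list of URLs.
sessions = [
    ["https://example.com/shoes/running", "https://example.com/basket"],
    ["https://example.com/shoes/trail", "https://example.com/shoes/running"],
    ["https://example.com/blog/fit-guide", "https://example.com/shoes/running"],
]
segments = group_sessions_by_landing_section(sessions)
```

Each segment can then be analysed separately, so paths taken by visitors who land in one category are not averaged together with paths from another.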


Can you share some top tips for using Python for SEO?

Scale is the first thing you have to address – do you have a scale problem where you’re working with sites that are so large that your biggest challenge is scaling your efforts? Because you’re going to want to utilize any form of automation, whether it’s Python or another tool, to fix the scale problem. However, this is not going to fix the marketing problem.

Another use case would be a site that has 3,000 images without alt text. That’s not a problem anyone should tackle manually, so using Python to classify and tag those images is a perfect example.
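As a rough sketch of the first half of that task, the snippet below uses Python’s standard `html.parser` module to list images whose `alt` attribute is missing or blank. It does not classify the images or write the alt text, which would need an image model that is outside the scope of this example. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> with a missing or blank alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src") or "")


def find_images_missing_alt(html):
    """Return the src values of images that need alt text reviewed."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


# Hypothetical page fragment.
page = '<p><img src="/a.png" alt="A red shoe"><img src="/b.png"><img src="/c.png" alt="  "/></p>'
to_review = find_images_missing_alt(page)
```

An empty `alt` can be a deliberate choice for purely decorative images, so the output is best treated as a review list instead of a list of definite errors.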

However, it can be more difficult on the content side. If you haven’t got a great dataset or a well-trained model, you could produce some really bad content with it.

Ultimately Python can help you to execute tasks faster. It really is addressing problems of scale, so if you don’t have scale problems you won’t need Python as a solution and you’re probably better off spending time fixing broken marketing and analytical issues.


What tools would you recommend for performing these tasks?

Talking specifically about getting into Python, I would recommend PythonAnywhere, which is a great tool to get started with Python notebooks.

Having a good data set and data model is more important than any particular tool. When dealing with the amount and types of data that you’re going to be doing the experiments with, you’re going to need to be working outside of Google Analytics directly, and BigQuery can help here.

BigQuery, SQL and TextPad are the tools that I use most often, but it’s really going to depend on what your use case is. Most importantly though, you’re going to want to get the data right first, with a data model set up in a way that allows you to start manipulating it and running the kinds of queries you need against it. An out-of-the-box tool is not necessarily going to do this for you.
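To illustrate what a query-friendly data model might look like, here is a small sketch that stores one row per page view and uses SQL to count page-to-page transitions. SQLite (via Python’s standard `sqlite3` module) stands in for BigQuery so the example can run anywhere. The `hits` table, its column names and the sample rows are assumptions, not the schema of any Google Analytics export.

```python
import sqlite3


def page_transitions(conn):
    """Count how often each page is directly followed by another in a session."""
    query = """
        SELECT a.page_path AS from_page,
               b.page_path AS to_page,
               COUNT(*)    AS transitions
        FROM hits AS a
        JOIN hits AS b
          ON b.session_id = a.session_id
         AND b.hit_number = a.hit_number + 1
        GROUP BY a.page_path, b.page_path
        ORDER BY transitions DESC, from_page, to_page
    """
    return conn.execute(query).fetchall()


# Hypothetical data model: one row per page view, ordered within a session.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (session_id TEXT, hit_number INTEGER, page_path TEXT)"
)
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [
        ("s1", 1, "/"), ("s1", 2, "/pricing"), ("s1", 3, "/signup"),
        ("s2", 1, "/"), ("s2", 2, "/pricing"),
        ("s3", 1, "/blog"), ("s3", 2, "/pricing"),
    ],
)
rows = page_transitions(conn)
```

The point of the sketch is the shape of the data: once every page view carries a session identifier and a position within the session, questions about paths become simple joins.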


Can you share more on what has improved in Google Analytics and what functions people should focus on?

You’re really going to want to dig in and start playing with it. The biggest improvements are being able to drag and drop segments to apply them to specific metrics, and being able to look at specific events on pages within those segments.

The improvements provide the ability to more finely slice and dice what you’re looking at. Then, even more powerful than that, is the ability to connect to a database where you can run queries against it.


How do you take the path to conversion analysis data and start to relate that to work within SEO?

Being able to analyse the clickstream data is going to give you an insight into user behaviour and enable you to better optimize sites for conversion. One example is looking at how users travel from page to page, which may tell you that they’re looking for something and are not able to find it. So, how can we help them to find what they’re looking for more quickly?

Once you have run these experiments, you will be able to go back and look at your architecture, linking structure and the content itself and think about how a typical user converts. From here, you can think about what you could set up programmatically to help users complete the desired goal. It’s basically an if-then; if they came from this path, then send them here, or change this link to that link.
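The if-then idea can be sketched as a small rule table: each rule pairs a sequence of recently viewed pages with the link to show next. This is only an illustration of the logic. The function name, the rules and the page paths are hypothetical, and a real implementation would live wherever your site decides which links to render.

```python
def recommend_link(path, rules, default=None):
    """Return the target of the first rule whose pages match the end of the path.

    path  -- ordered list of pages the visitor has viewed so far
    rules -- list of (trigger_pages, target) pairs, checked in order
    """
    for trigger, target in rules:
        if not trigger or len(trigger) > len(path):
            continue
        if tuple(path[-len(trigger):]) == tuple(trigger):
            return target
    return default


# Hypothetical rules: more specific paths are listed first.
rules = [
    (("/blog", "/pricing"), "/signup"),
    (("/pricing",), "/contact"),
]

link_for_blog_reader = recommend_link(["/", "/blog", "/pricing"], rules)
link_for_direct_visitor = recommend_link(["/", "/pricing"], rules)
```

Because the rules are checked in order, the visitor who arrived at pricing from the blog gets the first rule’s target, while the visitor who came straight from the home page falls through to the second.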


Where do you start?

Instead of trying to optimize every path, just pick one to optimize for. Which one is going to depend on the vertical and the type of content on the site. For example, if you have multiple products or services, pick one place to start and one conversion point to optimize for.

Before doing anything, in respect to diving into clickstream analysis and prediction models, make sure that you’ve got your fundamentals right. Ensure you’ve done a really comprehensive crawl, you’ve fixed all of your technical debt and that the site is well optimized from a platform perspective.

Start by taking a look at the data and what it is telling you about the problems users are facing, and look at how users are interacting with different elements across the site. Review the site search to see if a number of searches are being made for specific terms that you could create more content for. After all of this has been fixed and optimized, run a new analysis and take another look at what users are telling you still needs to change.
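Reviewing site search can start with something as simple as counting repeated queries. The sketch below normalises case and whitespace and returns the terms searched at least `min_count` times. The function name, the threshold and the sample queries are assumptions, and how you export the queries will depend on your analytics setup.

```python
from collections import Counter


def frequent_search_terms(queries, min_count=2):
    """Return (term, count) pairs for queries seen at least min_count times.

    Queries are lower-cased and runs of whitespace are collapsed so that
    'Return policy' and 'return  policy' are counted together.
    """
    normalised = (" ".join(q.lower().split()) for q in queries if q.strip())
    counts = Counter(normalised)
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


# Hypothetical site search log.
queries = [
    "Return policy",
    "return  policy",
    "size guide",
    "RETURN POLICY ",
    "size guide",
    "gift card",
]
content_candidates = frequent_search_terms(queries)
```

Terms near the top of the list are candidates for new or more prominent content, since visitors are repeatedly searching for them.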


How often should you be crawling and are there times where you definitely need to?

At a minimum, I would recommend crawling monthly. A lot of it is going to depend on how frequently your site is changing, because when there are a lot of different stakeholders managing different areas of a site, changes can be really tricky to keep track of. If you’re working on a CMS where people are constantly changing things and may not be aware of how these changes can break other things, you’re going to want to crawl more frequently.

If you perform a crawl and find a large number of problems, then you’re going to have to stage fixing those, as you can’t fix everything at once. Therefore, after every fix you should also re-crawl, even if it’s just crawling the areas where the changes have occurred. This will help you to become more informed about the technical health of your website.
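One way to check a staged fix after a re-crawl is to compare the issues found before and after. The sketch below assumes each crawl has been exported as a set of `(url, issue)` pairs. That format, the function name and the sample issues are all hypothetical and not tied to any particular crawler’s export.

```python
def compare_crawls(before, after):
    """Compare two sets of (url, issue) pairs from consecutive crawls."""
    return {
        "fixed": sorted(before - after),      # present before, gone now
        "remaining": sorted(before & after),  # still present
        "new": sorted(after - before),        # introduced since last crawl
    }


# Hypothetical crawl exports.
crawl_before = {
    ("/shoes", "missing title"),
    ("/shoes/running", "broken link"),
    ("/blog", "missing alt text"),
}
crawl_after = {
    ("/blog", "missing alt text"),
    ("/basket", "broken link"),
}
report = compare_crawls(crawl_before, crawl_after)
```

The "new" list is the one to watch on a frequently edited site, because it shows where a change has broken something else.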

We’d like to say another big thank you to Michelle for taking the time to answer all of these questions and providing her expert insights on this topic.


Get started with DeepCrawl

If you’re interested in learning about how DeepCrawl can help you to identify issues on your site which are impacting both search engines and users, while also assisting with your optimisation efforts, then why not get started with DeepCrawl?

