Welcome to the tenth episode of Open Dialog, the podcast for collaborative SEOs and digital marketers. In each and every episode, we’ll be speaking with the best and brightest minds in SEO, digital marketing and beyond to find out how we can work more effectively, efficiently and productively with other teams, departments and clients.
In episode 10, DeepCrawl’s Sam Marsden spoke with Will Critchlow, who is the founder and CEO at Distilled, a digital marketing agency specialising in SEO consulting, creative content and digital PR. Will shared his experiences of developing an SEO testing tool, interpreting the information shared by Google and building trust with development teams.
What keeps you excited about SEO?
Will explained that the simple answer is he just keeps finding things that are fascinating and enjoys learning as much as he can about the different elements within SEO. Search has grown and changed at both a technical and societal level and while other elements ebb and flow around it, search has been a constant.
For example, the growth in referrals from social comes and goes, but search constantly remains the majority traffic driver for most websites. Along with this is the business effectiveness side, the fact that it is still the biggest commercial channel for most of the web.
You recently found some discrepancies in Google’s open-source robots.txt parser and built your own tool to overcome those differences. Can you tell us about that?
While preparing for SearchLove London this year, Will was developing some new material around common misconceptions and errors that people make in SEO. He ran some SEO quizzes on Twitter around these misconceptions and was surprised at the level of misunderstanding of a number of these.
One example of this was within robots.txt. Will explained that it’s easy to think that robots.txt files cascade because it looks like they should. For example, if you have some rules which apply to all user agents (User-agent: *) and some rules which apply to a specific user agent (e.g. User-agent: Googlebot), it’s easy to think that Googlebot should follow all of those rules, both those that apply to Googlebot and the ones that apply to all user agents.
However, that’s not actually how robots.txt works: if there is a specific matching user agent group defined, the crawler will only follow those rules and not any of the general ones. So you can think of User-agent: * as covering all of the user agents except the ones mentioned specifically elsewhere in the file. This often catches a lot of people out, as Will discovered when running the quiz on Twitter.
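This grouping behaviour can be demonstrated with Python’s standard-library parser. It is not Google’s parser, but it follows the same rule here: a crawler obeys only its most specific matching group, and the * group applies only when nothing more specific matches (the URLs below are hypothetical):

```python
import urllib.robotparser

# Two groups: a general one and a Googlebot-specific one
rules = """
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
""".splitlines()

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# Googlebot follows ONLY its own group, so /private/ is NOT blocked for it
print(rp.can_fetch("Googlebot", "https://example.com/private/page"))  # True
print(rp.can_fetch("Googlebot", "https://example.com/drafts/page"))   # False
# Crawlers without a specific group fall back to the * group
print(rp.can_fetch("OtherBot", "https://example.com/private/page"))   # False
```

Note that the Googlebot group does not inherit the `/private/` rule from the * group, which is exactly the non-cascading behaviour that catches people out.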
Then, while looking further into robots.txt, Will found several areas of discrepancy between the documentation and the online robots.txt checker tool in GSC, which in turn differed from the newly released open-source parser. It’s a sentiment many SEOs share: Google says one thing, but the truth is actually a little different.
Google recently released the robots.txt parser as open-source code that can be compiled, and together with his colleague, Tom Anthony, Will attempted to learn how it worked and where the gaps were with the documentation. In the process, he identified a few areas of discrepancy and ended up building a web-based tool to replace the one in Search Console, which is currently incorrect.
The tool Will built is a modified fork of the open-source robots.txt code which matches his understanding of actual Googlebot behaviour. He noted that there are still areas of confusion and misunderstanding when it comes to Googlebot’s handling of robots.txt.
There are two things Will would like to see Google do here. The first is to update the documentation, as there are areas which are misleading, wrong or out of date. The second is to remove the links directing people to the old Search Console robots.txt tool, as this is currently incorrect.
Interpreting information from Google
The talk that Will gave before putting together his SearchLove presentation was about trying to understand how to interpret all of the information that comes out of Google. One example he gave was the ongoing subdomain vs subfolder discussion, where Will shared his understanding that Google is saying they do not have a feature detection element of the algorithm that says ‘if subdomain do this’, ‘if subfolder do that’.
In this sense, Google is agnostic to where you put a page. However, this finding only emerges from experiments, updating a website with different forms of navigation to see if performance improves; it has not been confirmed that there is a case statement in Google’s algorithm.
While Will appreciates that it is not the Google Search Liaison team’s job to proactively tell SEOs all of the effects of what we do, he believes they could do more to prevent misinformation and share insights into how they have come to the conclusions they recommend.
A recent example of this was from the Webmaster Conference in Zurich, during John’s talk on category page pagination. The slide had a very exact recommendation around noindexing and canonicalizing certain pages.
Will explained that while this is helpful information to have, what he would like to know is how John came to the conclusion that this recommendation is the right thing to do. In the real world as SEOs, in order to come to this conclusion, we would run experiments and use data to understand what would work best.
However, we have no insight into how Google has concluded that this is the correct recommendation to give. Have they performed experiments and used data from across the web? Or are they recommending this based on the way they want us to build websites, in order to be consistent with how nofollow and canonicalisation work? If the latter is the case, it doesn’t necessarily mean that it is going to be the highest-performing option.
Unless Google is performing experiments on real-world websites in order to reach these conclusions, Will believes there is a theoretical gap between what they are recommending and what will actually work in practice. This is different from when they clarify how Google works in a specific case, or answer a yes or no question from the insights they have.
Is this grey area causing problems with communication between SEOs and Engineers/Developers and the ability to get recommendations actioned?
Will explained that along with the broad culture challenges faced, this grey area can cause friction between the SEO and development teams.
There are two sides to good communication with development teams when working to get recommendations implemented. The first is general trust: is the SEO team doing a good job, providing the correct information and communicating well, ultimately building the trust that their recommendations are worth listening to?
The other is with specific cases. Will explained that there have been many times when he has told the engineering team that an action is specifically worth undertaking, only to hear back that the official line from Google states it’s fine the way it is currently being done.
This is one of the reasons Distilled have been investing so much into the SEO testing side of things with their ODN platform. They want to be able to provide data-driven recommendations and put things to the test.
Testing the benefits of FAQ schema
One such test Distilled has performed is a data-driven analysis to see if there are benefits to adding FAQ mark-up to a page. There has been a debate around whether it is beneficial to add these, but they have found the implementation to be successful for their clients, and this is backed by data from the testing. However, Will explained that while this has been successful for several of their clients, they can’t extrapolate this into assuming it will be a good recommendation for every website.
This specific case is interesting for several reasons. Firstly, there is a trade-off between providing information for users directly in the search results and gaining the click-through. With the FAQ accordions, there’s an obvious trade-off: you are providing more information, which may mean you do not receive the click-through, but it might also lead to better brand sentiment. This conflict is playing out in many other areas as Google enhances its search result features. For their client, allowing users to engage with these accordions led to more people clicking through, even though they received some information before visiting the site.
The other interesting thing is that FAQ schema pushes out other structured mark-up types you have on a page, as Google will only display one mark-up type. In this case, there is no way of discovering what is going to work best for a website other than testing it, so Distilled also performs A/B-style tests for this purpose.
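For reference, the FAQ rich result is driven by schema.org FAQPage mark-up embedded as JSON-LD; a minimal sketch (the question and answer text here are illustrative, not from Distilled’s tests) looks like this:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Do you ship internationally?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes, we ship to most countries worldwide."
      }
    }
  ]
}
```

Each Question/acceptedAnswer pair can become one expandable accordion beneath the search result, which is why adding this mark-up changes the click-through trade-off described above.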
In the past, they have run tests on other schema types, including breadcrumb mark-up, after seeing a decrease in CTR following its implementation. One hypothesis for this is the way the final part of the breadcrumb trail is displayed in search results.
For example, if you have a URL which contains ‘/mens/trousers/slimfit’ but you mark this up with breadcrumb schema, it may just display ‘mens/trousers’ and not show that the page is for slim fit trousers. Therefore, if your competitors have the raw URL without the mark-up, their results may look more relevant to searchers than yours does.
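As an illustration (the URLs and names are hypothetical), BreadcrumbList mark-up for such a page names each ancestor level, and Google builds the displayed trail from these items rather than from the raw URL:

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Mens", "item": "https://example.com/mens" },
    { "@type": "ListItem", "position": 2, "name": "Trousers", "item": "https://example.com/mens/trousers" }
  ]
}
```

If the final ‘slim fit’ level is not represented in the displayed trail, that detail is lost from the snippet, which is one way the effect Will describes can arise.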
User Experience and Search Engine Algorithms
In Google’s Chrome User Experience Report, they have started to incorporate metrics that are not just related to speed, for example cumulative layout shift, which measures how much a page moves around after loading. Now that these metrics live in the dataset, is this something that is going to be incorporated into search algorithms?
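For context, Google’s web-vitals documentation defines each individual layout-shift score as the product of two fractions; a minimal sketch in Python (the values used are illustrative):

```python
def layout_shift_score(impact_fraction: float, distance_fraction: float) -> float:
    """Score for a single layout shift, per the web-vitals definition.

    impact_fraction: share of the viewport affected by unstable elements
    distance_fraction: greatest move distance divided by the viewport height
    """
    return impact_fraction * distance_fraction

# Half the viewport affected, elements moved 25% of the viewport height
print(layout_shift_score(0.5, 0.25))  # 0.125
```

Cumulative layout shift then aggregates these individual scores over the page’s lifetime, so many small shifts can add up to a poor score.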
Will expects that, due to more machine learning-based algorithms, we will begin to see a number of new metrics thrown into the mix. Theoretically, everything that Google measures is a ranking factor, as it is a feature that can be taken into account by the machine learning algorithm. However, whether a feature carries any signal, i.e. whether the machine actually takes it into account for ranking purposes, is unknowable for us on the outside; some factors may not be looked at at all.
For example, if they measure other user experience signals and actual human users don’t like it when content shifts around as it loads, Google may indirectly pick up this effect as a negative signal in other areas. It may therefore be that fixing it improves your performance in search, even though it wasn’t a feature directly fed into the machine learning algorithm. From the outside we have no way of knowing this, so Will’s suggestion is to assume it is an input to the model, build your hypothesis from first principles (‘this is better for our users, therefore we think it will be good for search’) and then test it.
This is another reason they have invested in a testing tool, as testing the impact of recommendations is the best way to operate in a machine-learning-driven search environment. Will’s recommendation is not to spend too long stressing over whether a specific metric is detected as a feature in the algorithm. Instead, spend more time thinking about how you can make your site perform better in search, ensuring it is aligned with Google’s mission to return the websites that people like best. Then test that and form your own arguments based on conversion rate or user experience signals directly.
Using machine learning for understanding and ranking pages
As machine learning propagates throughout search algorithms, the understanding of ranking factors is going to become even more blurred. Will explained that over his career in search, he has seen some significant shifts in understanding. In the pre-Panda and Penguin era, around 2008, he had a much better understanding of the levers: you could look at a website and say ‘if you do this it will definitely improve your performance’ with much more certainty. While this has shifted, as SEOs we still understand the right methodologies and paths to rolling out best practices; we just don’t know for certain how Google works, and can therefore only answer these questions with experimental outputs.
Together with this, computers are good at discovering things that we humans have no name or explainable concept for, and this is just as true in search. While there are a number of things we do understand, there are plenty of others that a machine has discovered users love, which we are not able to understand and have no way of explaining. This is the direction of travel that we have to get on board with and learn to operate within. It is going to lead to better search, an improved user experience and, ultimately, ensure the best websites are being found.
Being an SEO generalist
Will believes that as an SEO practitioner, it’s important to have some element of generalism: not just an understanding of the technical foundations, but also of content strategy. It’s also important to develop your website to a similar level across these areas; for example, you could have the perfect technical set-up, but if your content isn’t up to scratch you will not perform as well.
Equally, if you have great content but Google is unable to crawl and index the site then no one will see that content. The need for having both isn’t going anywhere, but at the same time, there is a need for specialists for the particular edge cases where you need that specific expertise.
Developments to the ODN platform
After a successful launch of their ODN software, Distilled has a set of clients that are customers of the software who are using it directly themselves to run tests on their website to answer questions. They also take results which they deem surprising, interesting or generalisable from the tests that are being run and publish external content around this. With their consulting clients, whose website may not be suitable for the testing they perform, Distilled work to ensure their recommendations are driven from what has been learned from the tests they are running. Sometimes the answers are still unclear, but they are able to provide insights into the outcome they have seen for similar websites when giving recommendations.
From the tests they have run, Will has seen there are several universal principles which seem to hold across every site, no matter the niche or vertical. They also found that there is nothing that is strongly true in one vertical which doesn’t hold in other verticals, aside from vertical-specific SERP features.
The biggest development to the ODN platform this year has been full-funnel testing, which allows simultaneous testing of user experience, conversion rate and SEO factors, so you can see the impact each test has had on search performance and conversion rate. The newest update they are working on builds on this full funnel to allow the platform to run conversion-rate-specific tests.
Be the first to hear about new episodes
A massive thank you to Will for being such a great guest and teaching us so much about his experiences of SEO testing and interpreting the information shared by Google. You can find more episodes of Open Dialog here on the DeepCrawl Blog, and make sure to be the first to find out about new episodes by joining our mailing list.