A URL with the same path repeated 3 times will not be indexed in Google
We discovered an undocumented situation in which repeating the same path segment in a URL prevents the URL from being indexed.
If the same path segment appears more than twice in a URL, the URL will not be indexed. For example, this URL would not be indexed in Google.
Even if the repeated paths are broken up by another unique path, the URL will not be indexed. e.g.
This URL would not be indexed.
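As a rough way to audit your own URLs for this pattern, a minimal Python sketch might look like the following. It assumes the rule is exactly what our tests suggest: any path segment appearing three or more times, whether adjacent or not. The function name, the threshold parameter and the example.com URLs are ours, chosen for illustration, and this is not Google's actual logic.

```python
from collections import Counter
from urllib.parse import urlsplit


def repeated_segments(url, threshold=3):
    """Return the path segments that occur at least `threshold` times in `url`.

    `threshold=3` mirrors the behaviour observed in our tests; it is an
    assumption, not a documented Google value.
    """
    path = urlsplit(url).path
    # Split the path on "/" and drop empty pieces caused by leading,
    # trailing or doubled slashes.
    segments = [segment for segment in path.split("/") if segment]
    counts = Counter(segments)
    return sorted(seg for seg, n in counts.items() if n >= threshold)


# Hypothetical URLs for illustration only.
print(repeated_segments("https://example.com/path/path/path/page.html"))
print(repeated_segments("https://example.com/path/a/path/b/path/page.html"))
print(repeated_segments("https://example.com/path/path/page.html"))
```

The first two calls report "path" as a repeated segment, while the third returns an empty list because the segment appears only twice. Counting segments across the whole path, rather than looking only for adjacent repeats, reflects the observation that repeats separated by another path are affected as well.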
Why would you repeat the same path anyway?
Although it’s quite unusual, you might accidentally end up with a URL like this without realising.
One website we examined had a strange URL architecture which used fixed paths to store variables, and had a situation where this could occur. e.g.
Why does Google do this?
Google appears to do this because it thinks its crawler has hit a URL trap.
URL traps occur most often when a relative link includes the same path as the directory in which the page is located. A relative URL without a leading slash is resolved against the directory of the page that contains the link, so its path is appended to that directory's path.
For example, suppose the page example.com/path/page.html includes a relative link back to itself written as “path/page.html”, with no leading slash. The link is resolved against the page's directory, so its actual URL is example.com/path/path/page.html. If the server returns the page at that URL, it will contain the same relative link, which now resolves to example.com/path/path/path/page.html. And so on ad infinitum.
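The way a trap like this grows can be reproduced with Python's standard urllib.parse.urljoin, which resolves a link against the URL of the page that contains it. The sketch below assumes a server that returns the same page at every nested URL; the helper name follow_self_link and the example.com URLs are hypothetical. Note that only a link without a leading slash nests in this way, because a link that starts with "/" is resolved from the site root.

```python
from urllib.parse import urljoin


def follow_self_link(start_url, href, hops):
    """Resolve `href` against each resulting URL in turn, `hops` times.

    Simulates a crawler following the same relative link on a page that the
    server (by assumption) returns at every nested URL.
    """
    urls = [start_url]
    for _ in range(hops):
        # urljoin resolves href against the URL of the page containing it.
        urls.append(urljoin(urls[-1], href))
    return urls


# No leading slash: each hop adds another "path/" directory.
for url in follow_self_link("https://example.com/path/page.html", "path/page.html", 2):
    print(url)

# Leading slash: resolved from the site root, so the URL does not grow.
print(follow_self_link("https://example.com/path/page.html", "/path/page.html", 2)[-1])
```

After two hops the first loop reaches https://example.com/path/path/path/page.html, which already repeats the same path three times.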
Although this has never been documented to our knowledge, our tests have shown consistent results, and John Mueller appeared to confirm this when we reached out to him.
“I could see that kind of optimization sometimes making sense – there are a lot of sites that have infinite nested directories due to URL rewriting mistakes.”