At Lumar, we track over 250 metrics to help our users understand their websites. This guide explains a bit more about how our systems work and how we store the information from your crawl data.
Explore Lumar Reports
Each section covers the metrics behind specific reports. All reports have been grouped into the following data sources:
URLs and Pages | Links | Unique Links | Sitemaps
Below is more information on how Lumar calculates metrics and reports.
What are metrics in Lumar?
A metric is a piece of information about a page, link, or sitemap that we have either extracted from a URL or calculated in our system (e.g. DeepRank).
Here are some examples of metrics we store about a URL:
- Title tag
- Meta robots tag
- HTTP Header
There are different levels of metrics that we have to calculate within the Lumar system.
For example, Meta Noindex is a low-level true or false metric that lets you know whether a page has the noindex meta tag. Indexable is a high-level metric that needs to take into account several metrics to be accurate (such as noindex tags, headers, canonicalization, etc.). All these different metrics, once calculated, let our system identify if a page is indexable or non-indexable.
For all pages fetched and processed in our system, we collect more than 300 metrics, which include everything from a page’s title to the number of Search Console impressions.
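The difference between a low-level and a high-level metric can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PageMetrics` class, its field names, and the exact rules in `is_indexable` are hypothetical and are not Lumar's actual implementation, which takes more signals into account.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageMetrics:
    """Hypothetical low-level metrics for one URL (field names are illustrative)."""
    url: str
    meta_noindex: bool                    # noindex found in the meta robots tag
    header_noindex: bool                  # noindex found in an HTTP header
    canonical_url: Optional[str] = None   # None when no canonical is declared


def is_indexable(page: PageMetrics) -> bool:
    """Derive a high-level true/false metric from several low-level ones.

    Assumption: only the three signals below are checked. A real system
    would consider more.
    """
    if page.meta_noindex or page.header_noindex:
        return False
    # A canonical that points to a different URL marks this page as non-primary.
    if page.canonical_url is not None and page.canonical_url != page.url:
        return False
    return True
```

The point of the sketch is the layering: each low-level field is a single fact about the page, while the high-level value only exists once all of them have been calculated.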
What are reports in Lumar?
A report in Lumar is a combination of different metrics – while a metric is an individual piece of information about a page, a report takes many metrics and their values into account.
For example, the Page Title metric is the title that we extracted from your page, but the Short Titles report is a list of URLs that have a short title and are indexable.
Examples of reports in Lumar:
- Noindex pages: Pages that have a noindex directive in a meta robots tag or an X-Robots-Tag header.
- Canonicalized pages: Pages whose canonical tag is not self-referencing.
- Primary pages: Indexable pages which are unique or the primary of a set of duplicates.
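One way to picture a report is as a filter over pages that combines several metric values. The sketch below mimics the Short Titles example in Python. The dictionary keys, the `short_titles_report` function, and the `SHORT_TITLE_MAX_LENGTH` threshold are all assumptions made for illustration, not Lumar's real field names or settings.

```python
# Hypothetical threshold in characters; not Lumar's actual value.
SHORT_TITLE_MAX_LENGTH = 10


def short_titles_report(pages):
    """Return the URLs of indexable pages whose title is below the threshold."""
    return [
        page["url"]
        for page in pages
        if page["indexable"] and len(page["page_title"]) < SHORT_TITLE_MAX_LENGTH
    ]


# Each dict holds metric values for one crawled URL (illustrative data).
pages = [
    {"url": "https://example.com/", "page_title": "Home", "indexable": True},
    {"url": "https://example.com/blog", "page_title": "Blog posts about technical SEO", "indexable": True},
    {"url": "https://example.com/tmp", "page_title": "Tmp", "indexable": False},
]

print(short_titles_report(pages))  # ['https://example.com/']
```

The third page has a short title but is excluded because it is not indexable, which shows how a report depends on more than one metric.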
What are Lumar’s data sources?
During our crawls, we collect information about URLs, links between those URLs, and sitemaps. Because these types of data are so different from each other, we store them in separate main databases.
Pages and URLs
This data source contains each URL and all metrics related to each URL. For example:
- Indexable pages
- Non-200 pages
- 301 Redirects
Links
This data source contains each link and related metrics. For example:
- Source URL
- Target URL
- Orphaned pages
It also contains links that have issues, for example, broken links, links between protocols, and a few other cases.
We do not currently store every single link and its source that we see during a crawl as this is typically terabytes of data. If you are interested in all links between pages, look at Unique Links.
Unique Links
This data source contains every unique link that we saw during the crawl. For example:
- Anchor text
- Target page data
- Primary sources
If your website has a navigation link to the homepage on every page of the website, then we will save that link once along with a count of the times we saw that link.
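The idea of saving a repeated link once with a count can be sketched as follows. This is a minimal Python illustration, assuming a link is considered unique by its target URL and anchor text. That key, the tuple layout, and the `count_unique_links` function are hypothetical and may differ from how Lumar defines uniqueness.

```python
from collections import Counter


def count_unique_links(links):
    """Collapse raw (source_url, target_url, anchor_text) records into unique links.

    Assumption: a link is keyed by target URL and anchor text, and the
    source URL is ignored when counting.
    """
    counts = Counter()
    for _source_url, target_url, anchor_text in links:
        counts[(target_url, anchor_text)] += 1
    return counts


# Two pages carry the same navigation link to the homepage (illustrative data).
crawled_links = [
    ("https://example.com/a", "https://example.com/", "Home"),
    ("https://example.com/b", "https://example.com/", "Home"),
    ("https://example.com/b", "https://example.com/a", "Page A"),
]

unique_links = count_unique_links(crawled_links)
print(len(unique_links))                               # 2
print(unique_links[("https://example.com/", "Home")])  # 2
```

Storing one record per unique link plus a count keeps the data set far smaller than storing every link instance, which matters when a site-wide navigation link appears on every page.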
Sitemaps
This data source includes information about the sitemaps we processed during the crawl. For example:
- Broken/disallowed sitemaps
- URL count in sitemaps
- Sitemap type