What are SEO crawlers? SEO crawlers are tools that crawl pages of a website much like search engine crawlers do in order to gain valuable SEO information. A good SEO crawler is an indispensable tool and will inevitably make SEO work much easier and less time-consuming.
These are the SEO crawlers reviewed in this article (click to jump to them!):
- Screaming Frog
- WebSite Auditor
- JetOctopus
- Netpeak Spider
Here are additional SEO crawlers not reviewed in this article:
- Raven Tools
- Searchmetrics Crawler
- IIS Site Analysis Web Crawler (a free tool)
- Xenu’s Link Sleuth (a free tool)
- BeamUsUp (a free tool)
- SEOSpyder by Mobilio Development
The universe is a big and impressive place, but we live in a limited world. So I was able to test just 15 crawlers, and for that, I would like to apologize. (But trust me, even this consumed a large amount of time!)
Two Types of SEO Crawlers
You should probably know that there are two types of crawlers: desktop and cloud-based.
Desktop SEO crawlers
These are crawlers that you install on your computer. Examples are Screaming Frog, Sitebulb, Link Assistant’s WebSite Auditor, and NetPeak Spider. These are much cheaper than cloud crawlers but they have some drawbacks, such as:
- Crawls consume your memory and CPU. However, the situation is much better than it used to be, as crawlers keep improving their memory and CPU management.
- You have to use proxies to avoid getting banned.
- Collaboration is limited. You can’t just share a report with a client/colleague. You can, however, work around this by sending them a file with a crawl project.
- Unfortunately, desktop crawlers struggle with crawl comparison (Sitebulb is an exception) and scheduling.
- In general, desktop crawlers are more limited than cloud crawlers.
At Onely, we run desktop crawls using a server with 8 cores and 32 GB of RAM. Even with a configuration like that, it's common for us to have to stop crawls because we're running out of memory. That's one reason why we use cloud crawlers too.
Cloud SEO crawlers
- Most cloud-based crawlers are decent in terms of collaboration. Usually, you can grant access to the crawl results to a colleague/client. Some of the cloud crawlers even allow for sharing individual reports.
- It’s common to get dedicated, live support.
- For the most part, you can easily notice changes across various crawls.
- Generally, cloud-based crawlers are more powerful than desktop ones.
- Typically, they are pretty good in terms of data visualization.
- Of course, this comes at a cost. Cloud crawlers are much more expensive than desktop ones!
Okay. Let’s get started!
What was tested?
Basic SEO reports

| Feature | Why it matters |
| --- | --- |
| List of indexable/non-indexable pages | It's necessary to view a list of indexable/non-indexable pages to make sure there are no mistakes. Maybe some URLs were intended to be indexable? |
| Missing title tags | Meta titles are an important part of SEO audits. A crawler should show you a list of pages with missing title tags. |
| Filtering URLs by status code (3xx, 4xx, 5xx) | When you perform an SEO audit, it's necessary to filter URLs by status code. How many URLs are not found (404)? How many URLs are redirected (301)? |
| List of Hx tags | "Google looks at the Hx headers to understand the structure of the text on a page better." – John Mueller |
| View internal nofollow links | Seeing a list of internal nofollow links allows you to make sure there aren't any mistakes. |
| External links list (outbound external) | A crawler should allow you to analyze both internal and external outbound links. |
| Link rel="next" (to indicate a pagination series) | When you perform an SEO audit, you should check whether pagination series are implemented properly. |
| Hreflang tags | Hreflang tags are the foundation of international SEO, so a crawler should recognize them to let you pinpoint hreflang-related issues. |
| Canonical tags | Every SEO crawler should inform you about canonical tags to let you spot indexing issues. |
| Information about crawl depth – number of clicks from the homepage | Information about crawl depth gives you an overview of the structure of your website. If an important page isn't accessible within a few clicks from the homepage, it may indicate poor website structure. |
| List of empty/thin pages | A large number of thin pages can negatively affect your SEO efforts. A crawler should report them. |
| Duplicate content reports | A crawler should give you at least basic information on duplicates across your website. |
| A detailed report for a given URL | It's a must-have! When you do a crawl, you may want to see the internal links pointing to a particular URL, its headers, canonical tags, etc. |
| Advanced URL filtering for reporting – using regular expressions and modifiers like "contains," "starts with," "ends with" | I can't imagine my SEO life without a feature like this. It's common that I need to see only URLs that end with ".html" or that contain a product ID. A crawler must allow for filtering (a minimal example of such filters follows this table). |
| Page categorizing | Some crawlers offer the possibility to categorize crawled pages (blog, product pages, etc.) and see reports dedicated to specific categories of pages. |
| Adding additional columns to a report | This is also a crucial feature. When I view a single report, I want to add additional columns to get the most out of the data. Fortunately, most crawlers allow this. |
| Filtering URLs by type (HTML, CSS, JS, PDF, etc.) | Crawlers visit resources of various types (HTML, PDF, JPG), but usually you want to review only HTML files. A crawler should support this. |
| Overview – a list of all issues on a single dashboard | It's a plus if a crawler lists all the detected issues on a single dashboard. Of course, it will not do the job for you, but it can make SEO audits easier and more efficient. |
| Comparing to a previous crawl | When you work on a website for a long time, it's important to compare crawls done before and after any changes. |
| List mode – crawl just the listed URLs | This feature can help if you want to perform a quick crawl of a small set of URLs. |
| Changing the user agent | Sometimes it's necessary to change the user agent, for example, if a website blocks Ahrefs. This way you can still perform a crawl. Also, more and more websites detect Googlebot by its user agent and serve it a pre-rendered version instead of the full JavaScript version. |
| Crawl speed adjusting | You should be able to set the crawl speed, e.g., 1-3 URLs per second if a website can't handle the load, while you may want to crawl much faster if a website is healthy. |
| Can I limit crawling? Crawl depth, max number of URLs | Many websites have millions of URLs. It may be better to limit the crawl depth or specify a maximum number of URLs. |
| Analyzing a domain protected by an .htaccess login | This is a helpful feature if you want to crawl a staging website. |
| Can I exclude particular subdomains, or include only specific directories? | |
| List mode | Sometimes you want to perform a quick audit of a specified set of URLs without crawling the whole website. |
| Universal crawl -> crawl + list mode + sitemap | |
| Crawl scheduling | It's handy to be able to schedule a crawl and set monthly/weekly crawls. |
| Indicating the crawling progress | If you deal with big websites, you should be able to see the current status of a crawl. Will you wait a few hours, or weeks, until a 1M+ crawl is finished? |
| Robots.txt monitoring | Accidental changes in robots.txt can prevent Google from reading and indexing your content. It's beneficial if a crawler detects changes in robots.txt and informs you. |
| Crawl data retention | It's helpful if a crawler can store results for a long period of time. |
| Notifications – crawl finished | A crawler should inform you when a crawl is done (desktop notification/email). |
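To make the "contains / starts with / ends with / regex" filtering from the table above concrete, here is a minimal Python sketch with made-up URLs; the crawlers expose the same kind of logic through their filtering UI:

```python
import re

urls = [
    "https://example.com/products/blue-widget.html",
    "https://example.com/blog/how-to-crawl/",
    "https://example.com/cart?product_id=12345",
    "https://example.com/assets/logo.png",
]

# "ends with .html"
html_pages = [u for u in urls if u.endswith(".html")]

# "contains a product ID" - assumed here to be a numeric product_id parameter
product_urls = [u for u in urls if re.search(r"product_id=\d+", u)]

print(html_pages)    # ['https://example.com/products/blue-widget.html']
print(product_urls)  # ['https://example.com/cart?product_id=12345']
```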
Advanced SEO reports

| Feature | Why it matters |
| --- | --- |
| List of pages with fewer than x incoming links | If there are no internal links pointing to a page, that page is probably irrelevant to Google. It's crucial to spot orphan URLs. |
| Comparison of URLs found in sitemaps and in a crawl | Sitemaps should contain all the valuable URLs. If some pages are not included in a sitemap, it can cause issues with crawling and indexing by Google. If a URL appears in a sitemap but can't be reached through a crawl, it may signal to Google that the page is not relevant. |
| Internal PageRank value | Although internal PageRank calculations can't reflect Google's link graph, it's still an important feature. Imagine you want to see the most important URLs based on links; then you should sort URLs not only by simple metrics like the number of inlinks, but also by internal PageRank (a toy calculation follows this table). You think Google doesn't use PageRank anymore? http://www.seobythesea.com/2018/04/pagerank-updated/ |
| Mobile Audit | With mobile-first indexing, it's necessary to perform a content parity audit between the mobile and desktop versions of your website. |
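To give a feel for what internal PageRank means, here is a toy power-iteration sketch over a made-up four-page link graph. Real crawlers build the graph from crawl data and use more refined damping and normalization than this:

```python
# Simplified internal PageRank via power iteration over a hypothetical site.
links = {
    "/": ["/blog/", "/products/"],
    "/blog/": ["/", "/products/"],
    "/products/": ["/"],
    "/orphan/": ["/"],  # no page links to /orphan/, so its rank stays minimal
}

damping = 0.85
pages = list(links)
rank = {page: 1.0 / len(pages) for page in pages}

for _ in range(50):  # iterate until the scores stabilize
    new_rank = {page: (1 - damping) / len(pages) for page in pages}
    for page, outlinks in links.items():
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

# Pages sorted by internal PageRank, highest first
print(sorted(rank.items(), key=lambda item: -item[1]))
```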
Additional SEO reports

| Feature | Why it matters |
| --- | --- |
| List of malformed URLs | |
| List of URLs with parameters | |
| Redirect chains report | Nobody likes redirect chains. Not users, not search engines. A crawler should report any redirect chains to let you decide if they're worth fixing. |
| Website speed reports | Performance is becoming more important both for users and SEO, so crawlers should present reports related to performance. |
| List of URLs blocked by robots.txt | It happens that a webmaster mistakenly prevents Google from crawling a particular set of pages. As an SEO, you should review the list of URLs blocked by robots.txt to make sure there are no mistakes. |
| Exporting to Excel/CSV | Sometimes a crawler has no power here and you need to export the data and edit it in Excel or other tools. |
| Creating custom reports/dashboards | |
| Sharing individual reports | Let's say you want to share a report related to 404s with your developers. Does the crawler support it? |
| Granting access to a crawl to another person | It's pretty common that two or more people work on the same SEO audit. Thanks to report sharing, you can work simultaneously. |
| Explanation of the issues – why and how to fix | If you are new to SEO, you will appreciate the explanations of issues that many crawlers provide. |
| Custom extraction | A crawler should let you perform a custom extraction to enrich your crawl. For instance, while auditing an e-commerce website, you should be able to scrape information about product availability and price (see the extraction sketch after this table). |
| Can a crawler detect a unique part that is not a part of the template? | It's valuable if a crawler lets you analyze only the unique part of a page (excluding navigation links, sidebars, and the footer). |
| Integration with other tools | It's helpful if a crawler integrates with external tools such as Google Analytics, Google Search Console, backlink tools (Ahrefs, Majestic SEO), and server logs. |
| Why users should use the crawler | Here, I am getting direct statements from the crawlers' representatives. |
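As an illustration of the custom extraction feature listed above, here is a rough sketch using requests and BeautifulSoup. The URL and CSS selectors are hypothetical and would need to match the shop you audit:

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical product page and selectors - adjust to the site being audited.
response = requests.get("https://example-shop.com/product/123", timeout=10)
soup = BeautifulSoup(response.text, "html.parser")

price_node = soup.select_one(".product-price")        # assumed selector
availability_node = soup.select_one(".availability")  # assumed selector

print("price:", price_node.get_text(strip=True) if price_node else "not found")
print("availability:", availability_node.get_text(strip=True) if availability_node else "not found")
```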
- When I need to see a screenshot of the rendered view, I use Screaming Frog (currently, it's the only tool that supports this feature).
- If I want to start a quick crawl with real-time preview, I use Screaming Frog.
- When I am running out of credits in the cloud tool, I simply use a desktop crawler like Screaming Frog, WebSite Auditor, or SiteBulb.
- For now, Screaming Frog and Sitebulb are better in spotting redirect chains than most of the premium tools.
The latest versions of Screaming Frog (v10 and v11) have brought many benefits. To name a few:
- Schema.org data validation
- Structure visualizations
- Crawling XML sitemaps
- Calculating the internal page rank
- Full command-line interface to manage crawls
- Better exporting
- Reporting canonical chains
- AMP crawling & validation
- Scheduling. You can schedule crawls (daily/weekly/monthly) and set up auto exporting. It's a big step forward, but I am looking forward to the ability to easily compare the data between crawls.
Tip: when you do a crawl, don’t forget to enable a post-crawl analysis, which will allow you to get the most out of the data.
As I mentioned earlier, Screaming Frog now offers visualization of links. You can choose one of two types of visualizations (crawl tree and directory tree). Both are valuable for SEO audits. The former can show you groups of pages and how they are connected. The latter can help you understand the structure of URLs on a website.
Pricing: £149.00 per year for a single license.
Checklist for Screaming Frog.
Sitebulb is a relatively new tool on the market, but it has been warmly received by the SEO community. Personally, I really like Sitebulb’s visualizations:
Because Sitebulb is desktop software, you can't just share a report with your colleagues while doing an SEO audit. You can partially work around this by exporting a report to PDF. Once you click on the "Export" button, you will get a 40-page document, full of charts, presenting the most important insights.
Sitebulb’s pricing strategy
Although a single license is more expensive than Screaming Frog's, Sitebulb has a nice pricing strategy: every additional license costs you only 10% of the full price. Assuming both you and your colleague have Sitebulb installed on your personal computers, you can work on the same crawl at the same time. Here is a guide on how to copy crawls across Sitebulb instances: https://sitebulb.com/documentation/audits-projects/importing-exporting-audits/.
There is a really interesting feature of Sitebulb: crawl maps. These can help you understand your website structure, discover internal link flow, and spot groups of orphan pages.
The second version of Sitebulb (released in April 2018) brought many interesting features:
- Statistics like First Meaningful Paint (helpful for website speed optimization)
- List mode (like in Screaming Frog)
- Code coverage report (unused CSS, JS)
- Multi-level filtering, like in Ryte, Botify, OnCrawl, and DeepCrawl.
- AMP validation.
- Sitebulb is the only desktop crawler that can compare data between crawls.
Sitebulb integrates with Google Analytics and Google Search Console. Although Sitebulb does a great job with data visualization and offers many interesting features, I have to point out the drawbacks.
- Unfortunately, you can’t set custom extraction. Other tools support this feature.
- Sitebulb doesn't report H2 tags.
- As a big data fan, I am not happy that you can't export all internal links to a CSV/Excel file (Screaming Frog offers this). However, you can see summaries and visualizations, and that's enough for more than 95% of SEOs.
- If Sitebulb encounters an error while retrieving a page, the page will not be recrawled.
- You can run only one crawl at a time; other crawls are added to the queue.
I believe that in the case of Sitebulb the pros outweigh the cons. By the way, you can suggest your own ideas by submitting them through https://features.sitebulb.com/. It seems many interesting features like crawl scheduling and data scraping are going to be implemented. I'm keeping my fingers crossed for the project. Pricing:
By visiting https://sitebulb.com/onely you can get an exclusive offer, a 60-day free trial.
Checklist for Sitebulb.
WebSite Auditor reports SEO essentials like status codes, click depth, incoming/outgoing links, redirects, 404 pages, word count, canonicals, and pages restricted from indexing. It integrates with Google Search Console and Google Analytics. As with Screaming Frog, for every URL you can see a list of inlinks (including their anchors and source). Also, you can easily export them in bulk.
Website structure visualization
Similarly to Sitebulb, with WebSite Auditor you can visualize the internal structure by click depth, internal PageRank, and pageviews (the last available through integration with Google Analytics).
Sitebulb, FandangoSEO, and WebSite Auditor are the only crawlers on the market that are capable of doing this.
Content analysis
WebSite Auditor provides a module dedicated to basic content analysis. It checks if targeted keywords are present in the title, body, and headers. In addition, WebSite Auditor evaluates TF-IDF (term frequency-inverse document frequency). If you're not sure what this is, you can read Bartosz Góralewicz's article "The TF*IDF Algorithm Explained."
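If TF-IDF is new to you, a toy calculation can make the metric concrete. This is a simplified illustration with made-up documents, not WebSite Auditor's exact formula:

```python
import math

docs = [
    "seo crawler audit crawler",
    "seo audit checklist",
    "crawler log analysis",
]
term = "crawler"
docs_with_term = sum(1 for doc in docs if term in doc.split())
idf = math.log(len(docs) / docs_with_term)  # rarer terms get a higher weight

for doc in docs:
    words = doc.split()
    tf = words.count(term) / len(words)  # how often the term appears in the doc
    print(f"TF-IDF of '{term}' in '{doc}': {tf * idf:.3f}")
```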
WebSite Auditor's unique function is the ability to look into Google's index to find orphan pages.
To do this, you have to tick the "Search for orphan pages" option while setting up a crawl.
Now, it’s time to point out WebSite Auditor’s main drawbacks:
- You can't limit the number of URLs to be crawled; however, you can specify a maximum depth
- You can’t compare the data between different crawls
- Although WebSite Auditor supports advanced filtering for reports, it doesn’t support regular expressions
WebSite Auditor offers three versions: Free (allows for crawling up to 500 URLs), Pro, and Enterprise. You can compare the differences here: https://www.link-assistant.com/website-auditor/comparison.html. If you use our referral links for WebSite Auditor Enterprise or WebSite Auditor Professional, you will get 10% off at checkout. Checklist for WebSite Auditor.
Netpeak Spider was not written up in the initial release of the Ultimate Guide to SEO crawlers; however, the list of improvements introduced in the recently released versions is quite impressive, so I just had to test it.
First of all, according to Netpeak’s representatives, Netpeak Spider 3.0 consumes ~4 times less memory when compared to the previous (2.1) version.
Other improvements introduced in Netpeak Spider include:
- You can use a custom segmentation (I’ll explain this later).
- You can pause a crawl and resume it later or run it on another computer. For instance, if you see a crawl consumes too much RAM, you can pause it and move the files to a machine with a bigger capacity.
- You can rescan a list of URLs to see if any issues were fixed correctly.
- Netpeak Spider added a dashboard that shows the most important insights.
- Netpeak Spider shows a list of the most popular URL segments.
- You can remove URLs from a report or rescan them.
If you want to read more about the recent updates introduced in Netpeak Spider 3.0, here you go.
Custom Segmentation
Let's start with the most important improvement (at least from my point of view): data segmentation. Netpeak Spider is so far the only desktop crawler that has implemented it. What is it? It's a feature that lets you quickly define segments (clusters of pages) and see reports related to those segments only.
Custom segmentation is definitely a great feature, however, I miss the ability to see a segment overview report like those offered by Botify, FandangoSEO, and OnCrawl. In the screenshot from FandangoSEO below, you can see the pagetype breakdown when viewing the dashboard, which provides a great overview of segments.
In the past, navigating through the list of issues was difficult, but now it's much easier thanks to the tree view.
It’s time to move to the drawbacks of Netpeak Spider:
- You can’t integrate with Google Analytics or Google Search Console (although it’s planned for NetPeak Spider 3.1)
- Although the latest version introduced a visual Dashboard (which is fine), it still struggles with data visualization. I hope they will catch up shortly.
- If you’re a Linux user, you can’t use Netpeak Spider. For now, it’s available for Windows and Mac OS, however, according to their website, a Linux version is coming soon.
If you buy Netpeak Spider for 12 months, it costs $9.80 per month per single license. Go to our affiliate link and use the promo code: ca480e7f to get a 10% discount for one year on purchasing Netpeak Spider and Netpeak Checker!
Let’s move on to the cloud crawlers: DeepCrawl, OnCrawl, Ryte, and Botify. Disclaimer: for everyday routines, we use DeepCrawl and Ryte. We did our best to be as unbiased as possible. The crawlers are presented alphabetically.
Botify is an enterprise-level crawler. Its client list is impressive: Airbnb, Zalando, Gumtree, Dailymotion. Botify offers many interesting features. I think it’s the most complex, but also the most expensive of all crawlers listed. I noticed one disadvantage of Botify – it doesn’t offer a list of SEO issues on a single dashboard. If you open Ryte, Sitebulb, or DeepCrawl, you will see all the detected SEO issues (Internal Nofollow links, indexable pages with long click path, pages marked as “noindex, nofollow”) listed on one dashboard.
It’s my feeling that their developers will introduce this feature shortly. If they do, I will update this article. Botify has the ability to filter reports and dashboards by segments:
Let’s imagine you have three sections on your website: /blog, /products, and /news. Using Botify, you can easily filter reports to see data related only to product pages. Botify provides some reporting divided by groups. A few examples are presented below:
There is another useful feature on Botify that other crawlers simply miss. For every filter, you can see a dedicated chart (there are 35 charts in the library across several categories). This is pretty impressive. See the screencast I recorded. http://take.ms/TPCUi Also, you can install the Botify addon for Chrome and see insights directly from the browser. Just navigate to a particular subpage of a crawled website and you will see:
- Basic crawl stats
- A sample of internal inlinks
- URLs with duplicated metadata (description, H1 tags)
- URLs with duplicated content
Botify stores HTML code for every crawled page. It allows for checking content changes across crawls.
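The underlying idea is easy to reproduce for a quick spot-check: if you keep the HTML from two crawls, hashing it tells you whether a page changed. A minimal sketch with hypothetical file names (this is not Botify's actual mechanism):

```python
import hashlib

def fingerprint(html: str) -> str:
    # Reduce a page's HTML to a short, comparable hash.
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

# Hypothetical exports of the same URL from two different crawls
with open("crawl_january/products_widget.html", encoding="utf-8") as f:
    old_html = f.read()
with open("crawl_february/products_widget.html", encoding="utf-8") as f:
    new_html = f.read()

if fingerprint(old_html) != fingerprint(new_html):
    print("Content changed between crawls")
```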
Checklist for Botify.
DeepCrawl is a popular, cloud-based crawler. At Onely, we use it during our normal routines (along with Ryte and Screaming Frog). We really like this tool, but one of its biggest drawbacks is that you can't just add additional columns to a report. Let's say I am viewing a report dedicated to status codes and would like to see some additional data: canonical tags. I simply can't do it in DeepCrawl. If I want to see canonicals, I have to switch to the canonical report. For me, it's an important feature. However, I am pretty sure they will catch up shortly, and if they do, I will update the article. I do believe that in the case of DeepCrawl, the pros outweigh the cons. There are plenty of interesting features in DeepCrawl:
- Logfile integration
- Integration with Majestic SEO
- Integration with Zapier
- Stealth mode (the user agent and IP address are randomized within a crawl; helpful for crawling websites with restrictive crawling policies).
- Integration with Google Search Console and Google Analytics
- Crawl scheduling
I mentioned above that DeepCrawl integrates with Majestic SEO. Furthermore, you don’t need to have a Majestic account to use it. Nice! DeepCrawl offers a few plans:
DeepCrawl has offered a discount voucher code to get 10% off any annual package by using the code: ONELY
Checklist for DeepCrawl.
OnCrawl is a cloud-based tool. While this tool is suited for bigger companies, it offers a starter plan at a reasonable price. You can crawl up to 10k URLs per month (up to 5 projects) paying 10 euros per month.
A lot of SEOs appreciate OnCrawl for its near-duplicate detection feature – you can filter the list of URLs by a similarity ratio. There is another great feature of OnCrawl that other crawlers miss: you can enrich OnCrawl with any data. Just upload a CSV file with any data you want, make sure that it contains the common field "URL," and the sky's the limit (a minimal sketch of this idea follows below). Note: Botify offers a similar feature for some of their clients, but they don't do it at scale, and recently FandangoSEO added such a feature. I also like OnCrawl for its URL segmentation. Let's say you view a list of non-indexed URLs. Then, you can quickly switch URL segmentation to see only the blog or product pages.
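To illustrate the CSV-enrichment idea described above, here is a minimal pandas sketch. The file names and columns are hypothetical; the only real requirement is the shared "URL" field:

```python
import pandas as pd

# Hypothetical exports: a crawl and some business data, both keyed by URL.
crawl = pd.read_csv("crawl_export.csv")    # e.g. URL, status_code, inlinks
sales = pd.read_csv("sales_by_url.csv")    # e.g. URL, revenue, conversions

# Join the business data onto the crawl using the common "URL" column.
enriched = crawl.merge(sales, on="URL", how="left")
print(enriched.head())
```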
The recent version of OnCrawl brings some hreflang improvements:
OnCrawl gives you interesting reports regarding your page groups:
It also provides an overview of the link flow between page groups:
OnCrawl integrates with Google Analytics and Google Search Console. As with every cloud-based crawler, it allows for crawl scheduling. OnCrawl provides some pre-defined SEO reports, but its power is in its flexibility. You can create your own dashboards. Go to Tools -> Dashboard builder and click on the category you are interested in. As of 2nd May, there are 24 categories to choose from. Examples are Status codes, Indexability, Inlinks, Orphan pages, etc.
You can easily add or remove charts on a custom dashboard. OnCrawl provides a library of charts to choose from, and you can drag and drop them onto a specific custom dashboard.
If you ask me about OnCrawl's drawbacks, it lacks the ability to filter crawled URLs by regular expressions. Also, OnCrawl doesn't provide the list of detected SEO issues by default; you can work around this by clicking on Dashboard builder -> Onsite Issues.
While using OnCrawl, I had UX issues with finding particular reports/dashboards, but they are there. OnCrawl is a quite powerful crawler, but it is difficult to digest. OnCrawl's price depends on whether you want to use the log file analysis feature. The price list (without log file analysis):
The price list (with the log file analyzing feature):
OnCrawl has created a unique coupon for Onely readers: “Onely-OnCrawlTR2019”. This coupon will give users a 15% discount on any subscription and is valid until December 31, 2019.
Checklist for OnCrawl.
Main competitors: DeepCrawl, OnCrawl, Botify, Audisto, JetOctopus, FandangoSEO, ContentKing
Ryte is another popular web-based crawler. We use it during our everyday routine (along with DeepCrawl and Screaming Frog).
Good to see that we are listed on their partner’s list! I really like the reports generated by Ryte. On a single dashboard, I can see a list of all the detected SEO issues. Then I can click to see the detailed view and decide if it’s a real issue or if Ryte just wants to draw my attention to something. Of course, this report can’t replace human intervention, but it’s great having such a feature available. As with its main competitors, Ryte integrates with Google Search Console and Google Analytics.
Ryte's unique function is uptime server monitoring (they ping your server from time to time to make sure it's working properly). Another interesting function is robots.txt monitoring: Ryte detects if you change robots.txt and lets you review the history.
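If your toolset lacks built-in robots.txt monitoring, the concept is simple to approximate yourself: fetch the file periodically, hash it, and alert when the hash changes. A rough sketch (not Ryte's actual implementation):

```python
import hashlib
import requests

def robots_hash(domain: str) -> str:
    # Fetch robots.txt and reduce it to a comparable fingerprint.
    response = requests.get(f"https://{domain}/robots.txt", timeout=10)
    return hashlib.sha256(response.content).hexdigest()

previous = robots_hash("example.com")
# ... run again later, e.g. from a daily cron job ...
current = robots_hash("example.com")
if current != previous:
    print("robots.txt changed - review the new rules")
```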
What is more, Ryte has a convenient credits policy – if you want to re-run an active crawl, they will not charge you for it. OK, let's move to the drawbacks. I commonly deal with big crawls, 500K+/1M+ URLs, and sometimes I need to export particular reports to CSV. Until recently, CSV export was limited to 30K rows. Fortunately, they recently expanded it, and now it's possible to export 100K rows. And if you use their API, the sky is the limit. To get you onboarded, Ryte provides webinars.
Ryte offers different pricing plans depending on your needs:
Checklist for Ryte.
Audisto is a crawler popular mainly in German-speaking countries.
Using Audisto, you can split lists of hints by category, like Quality, Canonical, Hreflang, or Ranking.
I really like Audisto’s segmentation. You can create URL clusters based on filters and see reports and charts related only to those clusters.
Many crawlers offering this feature require knowledge of regular expressions. Audisto is a bit different: you can define patterns in the same way you define "traditional" filters. Additionally, you can even add comments when adding a cluster, which may be helpful for future reviews or when many people work on the same crawl.
However, you can’t apply segment filtering for all reports. For instance, you can’t do it for a Duplicate Content report or Hreflang report. With Audisto you can easily compare two different crawls.
Bot vs User Experience
Audisto has a nice approach to comparing bot and user experience. They detect whether users get a similar experience to Googlebot and even provide a chart to visualize the comparison.
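You can approximate a crude version of this comparison yourself by requesting the same URL with a Googlebot user agent and a browser user agent and comparing the responses. The sketch below only looks at raw HTML, so it is far less thorough than Audisto's checks:

```python
import requests

URL = "https://example.com/"  # hypothetical page
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")
BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
              "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0 Safari/537.36")

as_bot = requests.get(URL, headers={"User-Agent": GOOGLEBOT_UA}, timeout=10)
as_user = requests.get(URL, headers={"User-Agent": BROWSER_UA}, timeout=10)

print("Googlebot HTML length:", len(as_bot.text))
print("Browser HTML length:  ", len(as_user.text))
print("Identical raw HTML:", as_bot.text == as_user.text)
```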
For every issue listed in the Hint section (Current monitoring -> Onpage -> Hints) you can see the trendline, which is helpful for tracking SEO issues:
Recently, Audisto improved their PageRank and CheiRank calculation.
You can now see how much PageRank is distributed to pages with different statuses (200 vs 301 vs 404 and more).
Now, it’s time to point out some disadvantages of Audisto:
- You can’t add additional columns to a report (however, reports contain a lot of KPIs and this should be improved in the next iteration of their software).
- The URL filtering is rather basic. However, you can partially work around this by using custom segmentation.
- Audisto doesn’t offer custom extraction.
- It doesn't integrate with Google Analytics, Google Search Console, or server logs. But, of course, you can run custom analyses if you use their API.
A package for 5M URLs costs 320 EUR (~364 USD) per month; 1 million URLs/month costs 150 EUR.
JetOctopus is a relatively new tool on the cloud crawler market. They divide issues into six categories:
It offers nice visualizations. Below are some screenshots from the tool.
JetOctopus allows you to define a new segment, and it's very easy to use: you just set the proper filter and click "Save segment." You don't need to be familiar with regular expressions.
Then, you can filter reports to predefined segments.
For now, JetOctopus doesn't offer a server log analysis dashboard, but they are in the process of building one.
Linking Explorer – Discover Anchors and Source of Links
I like their linking explorer (a feature added very recently). I can easily see the most popular anchors of links pointing to a page or group of pages.
Also, it shows the most popular directories linking to a page.
Here's where page segments come in handy. You can quickly switch segments to see only the stats related to links coming from particular segments (e.g., from blog or product pages).
Now, some of JetOctopus’s drawbacks:
- No custom extraction.
For now, JetOctopus offers backend server log analysis and Google Search Console integration; however, they are still in the process of building a dashboard for it.
Please remember, it's a relatively new tool on the market, and cheaper than other cloud tools. I hope they will continue to improve. You can register for a trial and crawl up to 10K URLs, with an unlimited number of projects. If you have a few small websites, you can go for the basic package (up to 100K URLs, an unlimited number of projects). It costs 20 euros (~23 USD) per month.
Using the "Onely" promo code, you can get a 10% discount for JetOctopus.
FandangoSEO is a Spanish crawler, and the name comes from the lively Spanish dance.
Like many other cloud tools, FandangoSEO offers good visualizations. Some screenshots are presented below:
Integration with Server Logs at no Cost
These days, server log analysis has become an integral part of many SEO analyses. FandangoSEO integrates with server logs (and like DeepCrawl, you don’t need to pay extra for it). You can upload logs once or periodically (using their interface or FTP).
Defining custom segments
Similarly to Botify, OnCrawl, and JetOctopus, in FandangoSEO you can define custom segments.
Because of this, you can see some reports related to segments.
FandangoSEO requires you to know regular expressions to define new segments (a small example follows below). If you want to learn regular expressions, you can read my article on the subject.
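For reference, this is the kind of regex-based segmentation I mean. The patterns and URLs below are illustrative examples, not FandangoSEO's exact syntax:

```python
import re

# Hypothetical segment definitions based on URL patterns
segments = {
    "blog": re.compile(r"^/blog/"),
    "products": re.compile(r"^/products/\d+"),
    "pagination": re.compile(r"[?&]page=\d+"),
}

def segment_for(path: str) -> str:
    for name, pattern in segments.items():
        if pattern.search(path):
            return name
    return "other"

print(segment_for("/blog/seo-crawlers/"))        # blog
print(segment_for("/products/123-blue-widget"))  # products
print(segment_for("/category?page=2"))           # pagination
```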
FandangoSEO Detects Schema.org
FandangoSEO is one of few crawlers that detects Schema.org, so you can easily see URLs with Schema.org implemented.
Crawling Competitor’s Websites
You can compare data between various projects with this software, which makes it possible to crawl your competitor’s website.
Similarly to Sitebulb and WebSite Auditor, you can see the architecture map with FandangoSEO.
Integrate Crawls with any Data
When I initially published this article, I wrote that OnCrawl was the only crawler able to enrich your crawls with any data (by importing a CSV file with the common field "URL"). And voila! In June, FandangoSEO introduced a similar feature.
I’m glad to see crawlers are improving. Good job, FandangoSEO! It’s time to point out some disadvantages of FandangoSEO:
- One of the biggest is that reports can't be filtered.
- Additional columns can't be added to a report.
- For instance, when viewing a report related to canonicals, you can't see the number of internal links pointing to a canonicalized page.
- If there are thousands of canonicalized pages, all you can do is export the report to Excel and do the filtering there.
- It doesn't integrate with Google Analytics or Google Search Console.
FandangoSEO's pricing starts at 59 USD monthly (150k crawled pages, 10 projects); the Medium package (600k crawled URLs) costs 177 USD monthly.
Update: since ContentKing first appeared in our test, it added a couple of cool features:
- Advanced filter operators
- Slack integration
- More advanced alerting.
A package for end-users covering 50k pages costs 64 USD per month. Also, there are packages for SEO agencies and enterprises; for agencies, a package for 1 million pages costs 355 USD per month. ContentKing doesn't charge for recrawls. It charges only for 2xx pages (not for redirects, pages not found, server errors, or timeouts).
You can use our affiliate link by clicking here.
Cloud-based tools at no additional cost?
Do you use SEMrush for competition analysis? Did you know that this tool offers a crawler?
- DeepCrawl: get 10% off any annual package by using the code: Onely
- OnCrawl: we have created a unique coupon named "Onely-OnCrawlTR2019". This coupon will give you a 15% discount on any subscription and is valid until December 31, 2019
- WebSite Auditor Enterprise: 10% off at checkout
- WebSite Auditor Professional: 10% off at checkout
- JetOctopus: using the "Onely" promo code, you can get a 10% discount