8 Crawlability Problems That Are Hurting Your SEO

You've researched high-value keywords and created relevant content, but no traffic is coming to your site. What's wrong?

The culprit could be technical usability factors of your site. The most common technical SEO issues that search engine spiders encounter involve crawling the site.

Googlebot needs to crawl and index your site properly for your web pages to rank in the search results, which means crawlability issues can sink any SEO effort.

In addition to your site not being crawled by the search engine, it's likely that any technical SEO issues will also affect user experience. For instance, if the spiders can't follow your website path, neither will your users.

Not to mention, it's important that your site can be crawled efficiently to optimize crawl budget. To avoid these consequences, let's go over the top crawlability issues that hurt SEO so you know what to look out for.

Is My Site Crawlable? How to Test Website Crawlability

Crawls can detect potential crawlability issues, helping you get ahead of them to avoid problems with search engines reading and indexing your content.

We recommend you run two types of crawls using a crawler tool:

  1. A crawl of the site that starts from the home page. Let the crawler loose on the site to mimic Google's web crawler (Googlebot).
  2. A crawl of landing pages for SEO, ideally aligned with the XML sitemaps.

The data from these crawls will help diagnose crawl problems and clue you in on whether your pages are crawlable.

More insights will come from additional crawls with further variables, such as setting the user agent to Googlebot, crawling as a mobile device to see the mobile experience, and rendering JavaScript as opposed to just the HTML.

Note: You can save time by saving these settings and scheduling future, recurring crawls.

Follow our guide to crawling enterprise sites, or request a free site audit to analyze the technical integrity of your site.

The Most Common Crawlability Issues Sorted By Priority

A crawl report from an enterprise-level site can return a lot of data since the site may contain thousands or even millions of pages!

But not all crawl errors carry the same weight.

We've separated crawl issues into three categories (high-, mid-, and low-priority) so you can prioritize (and resolve) issues affecting your site's crawlability.

High-Priority Crawl Issues

The following issues will have the largest impact on your site's crawlability and should be prioritized first.

#1. URLs Blocked by Robots.txt

The first thing a bot will look for on your site is your robots.txt file. You can direct Googlebot by specifying “disallow” on pages you don’t want it to crawl.

User-agent: Googlebot
Disallow: /example/

This is one of the most common sources of a site's crawlability problems, as the directives in this file could block Google from crawling your most important pages or, conversely, leave open pages you meant to keep it away from.

How to spot this problem:

  1. Google Search Console – The Google Search Console blocked resources report shows a list of hosts that provide resources on your site that are blocked by robots.txt rules.
  2. Crawl – Analyze your own crawl outputs outlined above. Identify pages flagged for being blocked via the robots.txt file.

These blocks could stem from a mistake in a pattern-matching rule or from a simple typo, either of which can cause big problems.
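As a rough illustration of the crawl-side check, the sketch below tests a list of pages against a robots.txt file using Python's standard `urllib.robotparser`. The robots.txt content mirrors the example rule above. The URLs and the `find_blocked_urls` helper are hypothetical. Note that this parser does plain prefix matching and may not interpret wildcard rules the way Googlebot does, so treat the result as a first pass only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, mirroring the example rule above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /example/
"""

# Hypothetical pages you expect Googlebot to be able to crawl.
IMPORTANT_URLS = [
    "https://www.example.com/",
    "https://www.example.com/example/landing-page",
    "https://www.example.com/products/widget",
]


def find_blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs the given user agent is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


blocked = find_blocked_urls(ROBOTS_TXT, IMPORTANT_URLS)
for url in blocked:
    print("Blocked by robots.txt:", url)
```

Running a check like this against your key landing pages after every robots.txt change is one way to catch an accidental block before Google does.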

#2. Server (5xx) and Not Found (404) Errors

Like being blocked, if Google arrives at a page and encounters 5xx or 404 errors, it’s a big problem.

A web crawler travels through the web by following links. Once the crawler hits a 404 or 500 error page, it’s a dead end for the bot.

When a bot hits a large number of error pages, it will eventually give up crawling the page, and your site.

How to spot this problem:

  1. Google Search Console – Google Search Console reports the server errors and 404s (aka broken links) it encounters. The Fetch and Render Tool also serves as a useful point solution.
  2. Analyze the outputs from regularly scheduled crawls for server errors. Also note issues such as redirect loops, meta refreshes, and all other circumstances where Google ultimately cannot access the page.
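To make the crawl analysis concrete, here is a minimal sketch that sorts crawl output by status code and flags redirects that never resolve. The `CRAWL_RESULTS` data, the bucket names, and the ten-hop limit are all assumptions made for illustration. A real crawler export will have its own format.

```python
from collections import defaultdict

# Hypothetical crawl output: URL -> (HTTP status, redirect target or None).
CRAWL_RESULTS = {
    "/": (200, None),
    "/old-page": (301, "/new-page"),
    "/new-page": (200, None),
    "/missing": (404, None),
    "/checkout": (500, None),
    "/loop-a": (302, "/loop-b"),
    "/loop-b": (302, "/loop-a"),
}


def classify(status):
    """Bucket a status code by how it affects a crawler."""
    if 500 <= status < 600:
        return "server_error"
    if status == 404:
        return "not_found"
    if 400 <= status < 500:
        return "other_client_error"
    if 300 <= status < 400:
        return "redirect"
    return "ok"


def never_resolves(start, results, max_hops=10):
    """True if following redirects from start loops or exceeds max_hops."""
    seen = set()
    url = start
    while url in results and results[url][1] is not None:
        if url in seen or len(seen) >= max_hops:
            return True
        seen.add(url)
        url = results[url][1]
    return False


report = defaultdict(list)
for url, (status, _target) in CRAWL_RESULTS.items():
    report[classify(status)].append(url)

loops = [url for url in CRAWL_RESULTS if never_resolves(url, CRAWL_RESULTS)]

print("Server errors:", report["server_error"])
print("Not found:", report["not_found"])
print("Redirects that never resolve:", loops)
```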

#3. SEO Tag Errors

Look for issues with the tags that act as directives to Google (e.g. canonical or hreflang). These tags could be missing, incorrect, or duplicated, potentially confusing crawlers.

How to spot this problem:

  1. Google Search Console – These issues may appear in Google Search Console but not be interpreted as errors. For example, if a site has duplicate content because of a missing canonical tag, search engines will try to index these pages. Within GSC, the "number of pages indexed" will rise, which alone is not an "error." The tag issues typically surface in the "HTML improvements" and international sections within GSC.
  2. Analyze the crawl outputs for any missing or incorrect values. Pay particular attention to the key landing pages for SEO. Keep a record of the key elements you expect to see for each page (directives such as "noindex").
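One way to compare a page against the record of elements you expect is sketched below with Python's standard `html.parser`. The `DirectiveParser` class, the `audit_page` helper, the issue labels, and the sample HTML are hypothetical. The sketch only covers canonical, hreflang, and meta robots tags found in the HTML source.

```python
from html.parser import HTMLParser


class DirectiveParser(HTMLParser):
    """Collect canonical, hreflang, and meta robots values from a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []
        self.hreflangs = {}
        self.robots = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            rel = (attrs.get("rel") or "").lower()
            if rel == "canonical":
                self.canonicals.append(attrs.get("href"))
            elif rel == "alternate" and attrs.get("hreflang"):
                self.hreflangs[attrs["hreflang"]] = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots.append((attrs.get("content") or "").lower())


def audit_page(html, expected_canonical, expect_indexable=True):
    """Return a list of tag issues compared with what you expect to see."""
    parser = DirectiveParser()
    parser.feed(html)
    issues = []
    if not parser.canonicals:
        issues.append("missing canonical")
    elif len(parser.canonicals) > 1:
        issues.append("duplicate canonical")
    elif parser.canonicals[0] != expected_canonical:
        issues.append("incorrect canonical")
    if expect_indexable and any("noindex" in value for value in parser.robots):
        issues.append("noindex detected")
    return issues


# Hypothetical page source with two canonicals and an unexpected noindex.
SAMPLE_HTML = """
<html><head>
<link rel="canonical" href="https://www.example.com/page">
<link rel="canonical" href="https://www.example.com/page?ref=nav">
<link rel="alternate" hreflang="de" href="https://www.example.com/de/page">
<meta name="robots" content="noindex, follow">
</head><body>...</body></html>
"""

issues = audit_page(SAMPLE_HTML, "https://www.example.com/page")
print(issues)
```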

Note: Platform users can set rules to pull out changes in these elements. "High priority" rules, such as "Noindex detected" on a page where there shouldn't be one, flag the changes that can have a big impact on the site. This is a great example of how site audit technology can scale SEO tasks.

Recommended Reading: Crawl Depth in SEO: How to Increase Crawl Efficiency

Mid-Priority Crawl Issues

After you've identified and resolved the critical issues above, move on to these mid-priority crawlability problems.

#4. Rendering Issues

Google’s ability to render JavaScript is improving. Although Progressive Enhancement is still the recommended method (where all the content would appear in the HTML source code), it’s useful to fully render pages the way Google does to experience what a searcher would find on the page.

How to spot this problem:

  1. Google Search Console – Fetch and Render Tool. If the “rendered” version does not contain the critical content on the page, then there is likely a problem to address. This should also match the cached version of a page.
  2. Analyze the results of a JS-rendered crawl – there may be crawl issues (missing content, broken links, etc.) unique to the rendered crawl. Here's a great article for more on optimizing JavaScript for SEO.

#5. Duplicate Content from Technical Issues (Spider Traps)

Some issues stem from Google or other search engines not knowing which version of the content to index because of a coding setup.

Examples include pages with many parameters in the URL, session IDs, redundant content elements, and pagination.

How to spot this issue:

  1. Google Search Console - There is sometimes an alert for "Too Many URLs" or similar language when Google believes it's encountering more URLs and content than perhaps it should be. Check the messages and make sure you're receiving them as emails too.
  2. Crawl Results - A web crawl will identify these in a few ways. The most obvious will be duplicate or missing values in areas such as the title tag or header tags, perhaps on internal search pages or product category filters that don't update the meta tags. URLs that look unrecognizable (e.g. with parameters or extra characters) can be an issue too. These pages may be a problem as they're creating more work than necessary for Google to access and index priority pages.

Once you find these instances on your site, find ways to stop the pages from being created, adjust Google's access to them, or check that they have the right tags (such as canonical, noindex, or nofollow) so they don't interfere with your target landing pages.
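To spot parameter-driven duplicates in a crawl export, one option is to normalize each URL and group the ones that collapse to the same address. The sketch below does this with the standard `urllib.parse` module. The `IGNORED_PARAMS` set and the sample URLs are assumptions, and which parameters are safe to ignore depends entirely on your own site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed not to change the page content.
IGNORED_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def normalize(url):
    """Drop ignored parameters and sort the rest so equivalent URLs match."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def group_duplicates(urls):
    """Return normalized URL -> crawled variants, for groups larger than one."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


# Hypothetical URLs taken from a crawl export.
CRAWLED_URLS = [
    "https://www.example.com/shoes?color=red&size=9",
    "https://www.example.com/shoes?size=9&color=red&sessionid=abc123",
    "https://www.example.com/shoes?color=red&size=9&utm_source=news",
    "https://www.example.com/hats",
]

duplicates = group_duplicates(CRAWLED_URLS)
for canonical_form, variants in duplicates.items():
    print(canonical_form, "<-", len(variants), "variants")
```

Each group is a candidate for a canonical tag pointing at the normalized form, or for blocking the extra variants from being generated at all.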

Recommended Reading: Technical SEO: Best Practices to Prioritize Your SEO Tasks

Low-Priority Crawl Issues

While these crawlability problems are listed as "low-priority," it's still important to identify and resolve them to optimize your site's crawl budget.

#6. Site Structure and Internal Linking

How a website interlinks related posts is important for indexation. A page that is part of a clear website structure and is interlinked within content has little barrier to indexation.

How to spot this issue:

  1. Analytics - Review your site's analytics to find how users are flowing through the site. Identify ways to keep them engaged by linking to related content. Be on the lookout for pages with high bounce rates that may need a clearer nudge to more content.
  2. Analyze advanced crawl features that show how many internal links an individual page has directed to it. Review the top-performing pages for ways the site interlinks to those pages.

Keep an eye out for best practice elements in this step, such as no internal 301 redirects, correct pagination, and complete sitemaps.
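If your crawler exports which pages link to which, counting the internal links directed at each page takes only a few lines. The sketch below assumes a hypothetical `LINK_GRAPH` mapping each page to the pages it links to. The function names are illustrative and not taken from any specific tool.

```python
from collections import Counter

# Hypothetical internal link graph: page -> pages it links to.
LINK_GRAPH = {
    "/": ["/blog", "/products"],
    "/blog": ["/blog/post-1", "/products"],
    "/products": ["/products/widget"],
    "/blog/post-1": ["/products/widget"],
    "/products/widget": [],
    "/blog/post-2": ["/blog"],  # no other page links here
}


def inbound_link_counts(graph):
    """Count how many distinct pages link to each page (self-links ignored)."""
    counts = Counter({page: 0 for page in graph})
    for source, targets in graph.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


def orphan_pages(graph, home="/"):
    """Pages with no inbound internal links, excluding the home page."""
    counts = inbound_link_counts(graph)
    return sorted(page for page, n in counts.items() if n == 0 and page != home)


counts = inbound_link_counts(LINK_GRAPH)
orphans = orphan_pages(LINK_GRAPH)
print("Inbound links to /products/widget:", counts["/products/widget"])
print("Orphan pages:", orphans)
```

Pages with zero or very few inbound links are the ones a crawler is least likely to reach by following links, so they are a reasonable place to start adding internal links.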

#7. Mobile Usability

Mobile usability has been a key priority for SEO since the roll-out of Google’s mobile-first index. If the site is deemed unusable on mobile devices, Google may rank its pages lower in the SERP, resulting in lost traffic.

How to spot this issue:

  1. Google Tools - Test your key landing pages in the Google Mobile Friendly Tester tool and monitor mobile issues within Google Search Console.
  2. Analyze a Mobile Crawl - Review the output of a crawl run as a mobile device and ensure the site's content appears. Any issues with mobile navigation or usability should surface here if the content you expect to find is missing.

#8. Thin Content

If it's confirmed that your site doesn't have any of the issues outlined above but still isn't indexed, you may have "thin content." Google is aware of pages with low-value content (i.e. content that is poorly written or doesn't answer search intent); it just doesn't believe they are worthwhile to index.

The content on these pages may be boilerplate, appear somewhere else on your website, or not have any external signals validating its value/authority (i.e. no links to it).

How to spot this problem:

  1. Analyze the site's content that is not indexed by Google (you can proxy this by target landing pages not receiving traffic), and review the target queries for the page. Refresh the content or create new content based on keyword research to provide better value.
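As a rough first pass before a manual review, the sketch below flags pages whose text is very short or repeats the text of another page. The word-count threshold, the sample pages, and the `thin_content_candidates` helper are all assumptions. Word count is only a crude stand-in for value, so treat the output as a list of pages to read, not a verdict.

```python
import hashlib
from collections import defaultdict

# Hypothetical threshold, set very low to suit the toy data below.
MIN_WORDS = 5

# Hypothetical extracted body text per URL.
PAGES = {
    "/a": "Buy widgets online today.",
    "/b": "Our widget guide explains sizing, materials, and care in detail.",
    "/c": "our widget guide explains sizing, materials, and   care in detail.",
}


def thin_content_candidates(pages, min_words=MIN_WORDS):
    """Map each suspect URL to the reasons it was flagged."""
    urls_by_digest = defaultdict(list)
    for url, text in pages.items():
        # Lowercase and collapse whitespace so trivial differences still match.
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        urls_by_digest[digest].append(url)

    flagged = defaultdict(list)
    for url, text in pages.items():
        if len(text.split()) < min_words:
            flagged[url].append("short")
    for urls in urls_by_digest.values():
        if len(urls) > 1:
            for url in urls:
                flagged[url].append("duplicate")
    return dict(flagged)


candidates = thin_content_candidates(PAGES)
for url, reasons in candidates.items():
    print(url, reasons)
```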

Conclusion

Sites free of crawlability issues enjoy relevant traffic from Google and other search engines, and their teams can focus on improving the search experience as opposed to fixing problems.

Achieving this isn't easy, especially if you have limited time to solve these crawlability problems. Spotting and fixing these issues can take effort from dozens of people – from a web design team to developers, content writers, and other stakeholders.

This is why it's important to find the top problems affecting your performance, create a plan to fix them, and implement standards to suppress any future issues.

Enter Clarity Audits, our site audit technology that includes a built-in JS and HTML crawler. It swiftly identifies crawlability issues and conducts thorough technical health checks of your site to ensure full site optimization.

Want to see it in action? Request a FREE site audit today!


Editor's Note: This post was originally published in May 2018 and has been updated for accuracy and comprehensiveness.