How To Find and Fix Orphan Pages (the Right Way)


Quicksand awaits unsuspecting SEOs when they start working on a website with a long history.

These pits of technical site errors, littered by several generations of previous agencies, slow down and hinder SEO efforts and progress.

And when you’re the one tasked to clean it up, finding the quick fixes is your number one task.

So, you may start with a basic site audit and spot several orphan pages. You’ve most likely heard that orphan pages are bad for a site but do not fully understand what they are and how to fix them.

In this article, you’ll learn:

Orphan pages are pages that search engines may have difficulty discovering because they have no internal links from elsewhere on your website.

These URLs tend to fall through the cracks because search engine crawlers can only discover pages from the sitemap file or external backlinks, and users can only get to the page if they know the URL.

What causes orphan pages?

Usually, orphan pages are accidental and happen for various reasons. The most common cause is not having processes for site migrations, navigation changes, site redesigns, out-of-stock products, testing, or dev pages.

Orphan pages may also be intentional, as with promotional and paid advertising landing pages, or any case where you do not want the page to be part of the user journey.

Why are orphan pages bad for SEO?

Search engines have a hard time finding orphan pages because they use links to help discover new content and understand the page’s significance.

Here’s what Google says:

Google searches the web with automated programs called crawlers, looking for pages that are new or updated. […] We find pages by many different methods, but the main method is following links from pages that we already know about.

For example, let’s say you publish a new web page and forget to link to it from elsewhere on your site. If the page isn’t in your sitemap and has no backlinks, Google will not find or index it. That’s because their web crawler doesn’t know that it exists.

Even worse, the page cannot receive PageRank.

If you haven’t heard of the term “PageRank” before, it’s a big deal.

Generally speaking, PageRank is Google’s way of understanding the value of the page by counting the number of “votes” a page gets. You can read more about how PageRank works and affects SEO here.

To find orphan pages on your site, you need to compare a list of crawlable URLs (what Google can find) with a list of URLs people are hitting on your site.

This may sound quite technical, but don’t be discouraged. We have broken down how to find orphan pages into three easy steps using tools you’re familiar with.

1. Find crawlable URLs

There are a lot of tools you can use to gather a list of all crawlable URLs. We’re going to use Ahrefs’ Site Audit because it’s completely free with an Ahrefs Webmaster Tools account and you have the option to use external backlinks as a source to find even more URLs.

Here’s how to do it:

  1. Go to Site Audit.
  2. Click + New Project.
  3. Follow the prompts until step 3. Click on the URL sources tab and check Backlinks as a URL source in addition to the default settings.
  4. Click Continue, follow the instructions to complete the setup, then run the crawl.
Scheduling a site audit in Ahrefs' Site Audit

Backlink data is useful for finding orphan pages because it brings URLs from Ahrefs’ link index into the mix.

If a page does not have any internal links, a basic crawler won’t find it.

But, if a page has a backlink, Ahrefs will find the URL on your site and know that the crawl found no internal links, so it must be an orphan page.
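
The logic above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Ahrefs implements it: the `find_orphans` function, its parameters, and the sample URLs are all hypothetical. It assumes you already have a set of known URLs (from the crawl, sitemaps, and backlinks) and a list of internal links found during the crawl.

```python
def find_orphans(known_urls, internal_links, start_url):
    """Return known URLs that no other page on the site links to.

    known_urls: URLs learned from any source (crawl, sitemap, backlinks).
    internal_links: (source_url, target_url) pairs found during the crawl.
    start_url: the crawl's entry point (usually the homepage), which is
        reachable by definition and so is never reported as an orphan.
    """
    # A page counts as linked only if a *different* page points to it.
    linked = {target for source, target in internal_links if source != target}
    return sorted(set(known_urls) - linked - {start_url})


# Hypothetical sample data.
known = {
    "https://example.com/",
    "https://example.com/blog/",
    "https://example.com/old-promo/",  # known only from a backlink
}
links = [
    ("https://example.com/", "https://example.com/blog/"),
    ("https://example.com/blog/", "https://example.com/"),
]
orphans = find_orphans(known, links, "https://example.com/")
print(orphans)
```

Here the promo page is reported because nothing on the site links to it, even though the URL is known from another source.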

When the site audit is complete, export all internal pages from Page Explorer and save them. You’ll use this in step 3.

Page Explorer in Ahrefs' Site Audit

Before we continue…

As Site Audit uses both sitemaps and backlinks as URL sources, it does a reasonable job of finding orphan pages for you without any extra work. To see them, go to Page Explorer, click Links, and select Orphan pages:

Orphan pages in Ahrefs' Site Audit

However, you’ll only see orphan pages found via backlinks or sitemaps here. If you have orphan pages not included in sitemaps and without backlinks, Ahrefs won’t be able to find them.

Keep reading if you think this may be the case for you and want to dig a little deeper for orphan pages.

2. Find URLs with hits

The next step is getting a list of all the URLs with hits on our site.

There are quite a few ways to do this, and it’s always best to use as many data sources as you have access to.

If you have access, log files work well because they are server-side data which is more accurate. We won’t be going into the nitty-gritty of accessing these because it depends on how the server is set up.
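
To give a rough idea of what working with log files looks like, here is a minimal Python sketch. It assumes your server writes the widely used "combined" access log format (an assumption, so check your own server's configuration), and the `count_hits` helper and sample lines are hypothetical. It counts successful GET and HEAD requests per URL path.

```python
import re
from collections import Counter

# Matches the request and status fields of a combined-format log line,
# e.g. "GET /blog/post-1 HTTP/1.1" 200
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3}) ')


def count_hits(log_lines):
    """Count successful (status 200) GET/HEAD requests per URL path."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a GET/HEAD request, or an unexpected format
        path, status = match.groups()
        if status == "200":
            hits[path.split("?", 1)[0]] += 1  # drop any query string
    return hits


# Hypothetical sample lines in combined log format.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /blog/post-1?utm=x HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"',
    '203.0.113.5 - - [10/Oct/2023:13:57:12 +0000] "POST /login HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
hits = count_hits(sample)
print(hits.most_common())
```

The paths it returns still need your domain added in front before you can compare them with crawled URLs.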

But if you choose to go this route, here are three official guides for common server types:

In this article, we will use Google Analytics (GA4) and Google Search Console because the process is essentially the same for everyone.

Here’s how to find URLs with hits in Google Analytics (GA4):

  1. Log in to your Data Studio account.
  2. Start a new blank report.
  3. Connect Google Analytics as your data source.
  4. Choose the account you’re analyzing > select GA4 property.
  5. Add a basic table to your report.
  6. Set data source to the GA4 property created in step 4.
  7. Set dimension to Page path.
  8. Set metric to Views.
  9. Sort by Views in descending order.
  10. Set the default date range to start before GA4 was installed on the site, so the table covers all available data.
Google Data Studio settings

To export the results from your table, click the three vertical dots in the top right corner and hit Export. Save with a helpful name like “date_GA_URLs_people_are_hitting_brandname” because you will need it again in just a bit.

Because we exported the page path and not the full page URL, we need to add the domain to the beginning of all cells in our spreadsheet. This is easy enough in Google Sheets. Just import the CSV into a blank sheet, insert a new column to the left, and paste this formula into cell A1 (make sure to replace example.com with your domain):

=IFERROR(ARRAYFORMULA(IF(ISBLANK(B:B),"",IF(B:B="Page Path","",IF(B:B="(not set)","","https://example.com" & B:B)))))

Formula in Google Sheets
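
If you would rather script this step than use a spreadsheet, the same transformation can be sketched in Python. The `paths_to_urls` helper is hypothetical, and example.com is a placeholder for your own domain. Like the formula, it ignores blank cells, the header row, and “(not set)” entries, although it drops those rows instead of leaving empty cells.

```python
def paths_to_urls(page_paths, origin="https://example.com"):
    """Prefix each exported page path with the site origin.

    Blank cells, the "Page path" header, and "(not set)" rows are dropped.
    The check is case-insensitive, as text comparison with = is in Sheets.
    """
    skip = {"", "page path", "(not set)"}
    urls = []
    for path in page_paths:
        path = path.strip()
        if path.lower() in skip:
            continue
        urls.append(origin + path)
    return urls


# Hypothetical column of values from the exported CSV.
column = ["Page path", "/", "/blog/", "(not set)", "", "/pricing"]
urls = paths_to_urls(column, origin="https://example.com")
print(urls)
```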

As multiple URL sources are always best, we will also pull data from Google Search Console (GSC).

GSC limits exports to the first 1,000 URLs, but Google Data Studio has a neat little trick that allows you to pull more.

Here’s how to do it:

  1. Reopen your Data Studio report.
  2. Start a new page (command + M).
  3. Open Resource > Manage added data sources.
  4. Click ADD A DATA SOURCE.
  5. Select Search Console.
  6. Choose the site you’re analyzing > URL impression > web.
  7. Add a basic table to your report.
  8. Set dimension to Landing page.
  9. Set metric to Impressions.
  10. Expand rows per page to 5,000.
  11. Edit the date range to view at least the past three months.
  12. Export the results from your table.

Name your sheet something helpful like “date GSC_URLs_people_are_hitting_brandname” because you’ll need it again in a moment.

Now, combine all the URLs people are hitting from your different sources into one spreadsheet and clean up the data by removing duplicates.

Remove duplicates Google Sheets
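
For anyone scripting this instead, combining and de-duplicating the lists could look something like the Python sketch below. The `combine_url_lists` helper and the sample lists are hypothetical. It only trims whitespace and removes exact duplicates, so variations such as trailing slashes or http vs. https are treated as different URLs.

```python
def combine_url_lists(*sources):
    """Merge several URL lists into one, keeping the first copy of each URL."""
    merged = {}  # dicts preserve insertion order, so the output order is stable
    for source in sources:
        for url in source:
            url = url.strip()
            if url:
                merged.setdefault(url, None)
    return list(merged)


# Hypothetical exports from Google Analytics and Google Search Console.
ga_urls = ["https://example.com/", "https://example.com/blog/"]
gsc_urls = ["https://example.com/blog/", "https://example.com/old-promo/ ", ""]
all_hits = combine_url_lists(ga_urls, gsc_urls)
print(all_hits)
```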

3. Cross-reference the 2 URL sources

You are in the home stretch! The last step is cross-referencing crawlable URLs (from Ahrefs’ Site Audit) and URLs with hits (from GA and GSC). To do this, create a blank Google Sheet and create three tabs. Label them crawl, hits, and cross reference.

The three sheets you need in Google Sheets

In the first sheet, crawl, copy and paste all of the crawlable URLs from Ahrefs Site Audit.

Before you paste, open the exported CSV from step 1 and filter for rows where incomingAllLinks equals zero. This is super important because these pages are already orphans (Site Audit found them only through sitemaps or backlinks), so including them in the “crawl” tab would hide them when cross-referencing.

Remove all IncomingAllLinks that equal zero

Instead, you should copy these URLs and add them to the “hits” tab.

Next, copy and paste the remaining URLs from the Ahrefs export into the crawl tab of your Google Sheet.

Crawl URLs in spreadsheet
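
The same split can be sketched in Python with the standard csv module. Treat the details as assumptions: `split_crawl_export` is a hypothetical helper, the “URL” column name is a guess (incomingAllLinks is the column mentioned above), and the sample rows are made up. Check the header row of your own export before relying on it.

```python
import csv
import io


def split_crawl_export(csv_file, url_column="URL", links_column="incomingAllLinks"):
    """Split an exported crawl into (crawl_urls, already_orphaned_urls).

    Rows with zero incoming links go to the second list, which belongs
    in the "hits" tab rather than the "crawl" tab.
    """
    crawl, already_orphaned = [], []
    for row in csv.DictReader(csv_file):
        inlinks = int(row[links_column] or 0)  # treat an empty cell as zero
        if inlinks == 0:
            already_orphaned.append(row[url_column])
        else:
            crawl.append(row[url_column])
    return crawl, already_orphaned


# Hypothetical export; a real file would be opened with open(path, newline="").
export = io.StringIO(
    "URL,incomingAllLinks\n"
    "https://example.com/,12\n"
    "https://example.com/blog/,4\n"
    "https://example.com/old-promo/,0\n"
)
crawl_urls, orphan_urls = split_crawl_export(export)
print(crawl_urls)
print(orphan_urls)
```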

In the second sheet, hits, copy/paste all URLs from step 2. These are the pages you found using Google Analytics, Google Search Console, or your site log files. It includes web pages that users have visited.

Hit URLs in spreadsheet

In the third sheet, cross reference, enter the following function into the first cell:

=UNIQUE(FILTER(hits!A:A, ISNA(MATCH(hits!A:A, crawl!A:A, 0))))

Hit enter. The function will automatically pull all of your orphan pages for easy analysis.

Orphan URLs in spreadsheet
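
For reference, here is roughly what that function does, written as a Python sketch. The `cross_reference` name and the sample lists are hypothetical. It keeps every URL from the hits list that has no exact match in the crawl list, then removes duplicates.

```python
def cross_reference(hits, crawl):
    """Return unique URLs that appear in hits but not in crawl."""
    crawl_set = set(crawl)  # set membership plays the role of MATCH
    seen = set()
    orphans = []
    for url in hits:
        if url and url not in crawl_set and url not in seen:
            seen.add(url)  # UNIQUE: keep only the first occurrence
            orphans.append(url)
    return orphans


# Hypothetical contents of the "hits" and "crawl" tabs.
hits_tab = [
    "https://example.com/",
    "https://example.com/old-promo/",
    "https://example.com/blog/",
    "https://example.com/old-promo/",
]
crawl_tab = ["https://example.com/", "https://example.com/blog/"]
orphan_pages = cross_reference(hits_tab, crawl_tab)
print(orphan_pages)
```

Because the match is exact, small differences between sources (a trailing slash, a different protocol) will show up as false orphans, both here and in the spreadsheet version.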

Marketers often make the mistake of simply adding internal links to all orphan pages across the board.

The main issue with this approach is that just because a quick fix can be applied across all pages does not mean it should be.

Some orphan pages are intentional, like PPC landing pages, while others can just be removed, like test pages.

We don’t want to waste resources fixing something that’s not broken or is unlikely to have a positive impact.

To help solve this problem, use this decision tree:

How to deal with orphan pages flowchart

The idea here is to think critically about each orphan page and decide whether noindexing, deleting, merging/consolidating, or simply adding internal links is the best fix.

For example, if a page was missed during a site migration and that page does not offer any value for visitors, deleting is most likely the best option. However, if the page has backlinks, it may also be worth redirecting the URL to another relevant page to preserve backlink equity.
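
If you are triaging a long list, this kind of decision can be written down as a small function. The Python sketch below is not a transcription of the flowchart: the `OrphanPage` fields, the `choose_fix` function, and the order of the checks are assumptions based only on the four fixes named in this article, so adjust them to match your own judgment.

```python
from dataclasses import dataclass


@dataclass
class OrphanPage:
    url: str
    intentional: bool = False       # e.g. a PPC or promotional landing page
    duplicates_other: bool = False  # same or similar content exists elsewhere
    has_value: bool = False         # useful to site visitors
    has_backlinks: bool = False     # external sites link to it


def choose_fix(page):
    """Suggest a fix for one orphan page (assumed ordering of checks)."""
    if page.intentional:
        return "noindex"
    if page.duplicates_other:
        return "merge and redirect"
    if page.has_value:
        return "add internal links"
    if page.has_backlinks:
        return "delete and redirect"
    return "delete"


# Hypothetical pages.
pages = [
    OrphanPage("https://example.com/landing/spring-sale", intentional=True),
    OrphanPage("https://example.com/blog/forgotten-post", has_value=True),
    OrphanPage("https://example.com/test-page"),
]
fixes = {page.url: choose_fix(page) for page in pages}
print(fixes)
```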

TIP

Checking orphan pages for backlinks in bulk (up to 200 URLs at a time) is easy with Ahrefs’ Batch Analysis tool. Just paste URLs from your cross reference sheet and click Analyse.

Batch Analysis tool in Ahrefs

Let’s look at the four strategies to fix orphan pages.

Internally link

Orphan pages that are valuable for site visitors should be incorporated into your site’s internal linking structure to make them easier for visitors and search engines to find.

For example, let’s say an article was forgotten during a site migration or redesign. We need to internally link to it from a relevant page we know Google will soon (re)crawl.

Here’s an easy way to do that in Ahrefs:

  1. Go to Site Audit.
  2. Open your site’s most recent crawl.
  3. Under Tools > Open Page Explorer.
  4. Search for a word or phrase in Page text.
  5. Sort the results by Organic traffic.
Finding internal link opportunities in Ahrefs' Site Audit

This finds contextual internal linking opportunities on pages that get organic traffic, which means Google is likely to recrawl them sooner rather than later and see our changes.

Learn more: How to Use Page Explorer

Noindex

Orphan pages that were intentionally not internally linked to, like landing pages for ads, should be noindexed to prevent them from appearing in organic search results.

Most SEO plugins have made this as easy as checking a box, but you can also do it manually by copying and pasting this into the <head> section of the page:

<meta name="robots" content="noindex" />

Sidenote.

Make sure these pages are still crawlable in robots.txt, otherwise search engines won’t see the noindex directive.
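
One way to spot-check this is with the robots.txt parser in Python's standard library. The sketch below is a rough check under a few assumptions: `is_crawlable` is a hypothetical helper, the robots.txt content and URLs are made up, and Python's parser uses simple prefix matching, so it may not agree with Google's handling of wildcard rules.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks the landing page directory.
robots_txt = """\
User-agent: *
Disallow: /landing/
"""
landing_ok = is_crawlable(robots_txt, "https://example.com/landing/spring-sale")
blog_ok = is_crawlable(robots_txt, "https://example.com/blog/post-1")
print(landing_ok, blog_ok)
```

In this example the landing page is blocked, so a noindex tag placed on it would never be seen by a crawler that respects the file.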

Merge/consolidate

Orphan pages with the same or similar content to another page should be merged. This means consolidating the content and redirecting the orphan URL to the other page.

For example, let’s say you have two product listings for the same product. One of them is an orphan page; the other isn’t. You should take any unique valuable information from the orphan page and add it to the other page before redirecting the orphan page there.

Delete

Orphan pages that offer no value for visitors and serve no other purpose (e.g., paid traffic campaign) should be deleted.

For example, an unused CMS theme page can be removed. This will result in a 404 page and naturally drop out of search results over time.

Sidenote.

If the page has backlinks, you may want to redirect the URL to another relevant page to preserve link equity after deleting.

How to prevent orphan pages

As you can see, auditing orphan pages is time-intensive. So once you’ve put in the work, you want to prevent orphan pages in the future. Here are a few policies and procedures to consider.

Have a plan for site migrations

Be proactive by having a plan any time you do a website migration. You can avoid broken links and confusion on your website by redirecting old pages to new versions with a 301 redirect.
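
Part of such a plan can be a quick sanity check of your redirect map before launch. The Python sketch below is one possible approach, not a standard tool: `check_redirect_map`, its inputs, and the sample URLs are all hypothetical. It flags old URLs that have no redirect and no equivalent on the new site, and redirects that point at a URL the new site does not have.

```python
def check_redirect_map(old_urls, redirect_map, new_urls):
    """Return (unmapped, dead_ends) for a planned migration.

    unmapped: old URLs with no redirect that also do not exist on the new site.
    dead_ends: old URLs whose redirect target does not exist on the new site.
    """
    new_set = set(new_urls)
    unmapped = [
        url for url in old_urls
        if url not in redirect_map and url not in new_set
    ]
    dead_ends = [
        url for url in old_urls
        if url in redirect_map and redirect_map[url] not in new_set
    ]
    return unmapped, dead_ends


# Hypothetical migration data.
old_site = ["/about-us", "/blog/post-1", "/pricing-2019", "/contact"]
new_site = ["/about", "/blog/post-1", "/contact"]
redirects = {"/about-us": "/about", "/pricing-2019": "/pricing"}
unmapped, dead_ends = check_redirect_map(old_site, redirects, new_site)
print(unmapped, dead_ends)
```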

Set up your site structure for success

If you have to internally link to new pages manually, you’re bound to miss some and end up with orphan pages. This is why you should opt for a site structure that handles internal linking for you.

Most CMSs do this out of the box. For example, every time we publish a new blog post, WordPress adds an internal link from our blog homepage and archive.

However, if you’re using a custom solution, you need to ensure the necessary code is in place for a good site structure.

Learn more: Website Structure: How to Build Your SEO Foundation

Remove discontinued products properly

If you run an ecommerce site, you should remove discontinued products from the catalog along with all internal links pointing to them and set a status code of 404 or 410. Failing to remove internal links to such products is a common cause of orphan pages.

If the page has great backlinks and there is an updated or improved version of the product, you may want to consider keeping the page to preserve the backlink equity.

To do this, update the page content to explain why the product is no longer available, including introducing the new design features and linking to the new product page.

This way, the user is not landing on a completely unrelated page or 404.

Run regular site audits

By running the audit each month, you can stay on top of any accidental orphan pages that may slip through the cracks. You can do this easily using the scheduling feature in Ahrefs’ Site Audit.

Final thoughts

Looking at rows and rows of orphan page errors and trying to make sense of dense technical jargon is intimidating.

While finding and fixing orphan pages is time-intensive, it doesn’t need to be painstaking. Using Ahrefs Site Audit and the orphan pages flowchart will help streamline your process.

Got questions? Ping me on Twitter.