How Can We Solve Autogenerated URL Errors Once & For All? via @sejournal, @rollerblader


Today’s “Ask An SEO” question comes from Bhaumik from Mumbai, who asks:

“I have a question about automatically generated URLs. My firm had previously used different tools to generate sitemaps. But recently, we started creating them manually by selecting URLs that are essential and blocking others in robots.txt.

We are facing an issue now with more than 50 auto-generated URLs.

For example, we have a page called “keyword keyword” URL: and we have another page, knowledge center, URL:

In coverage issues, we are seeing errors under the 5xx series which created completely new URLs, something like [URL removed]. We tried many ways, but we are not getting the solution for this one.”

Hi Bhaumik,

It’s an interesting situation you’re finding yourself in.

The good news is that 5XX errors tend to resolve on their own, so don’t worry about that one.

The cannibalization issue you’re facing is also more common than most people think.

With ecommerce stores, for example, you could have the same product (or the same collection of products) appear in multiple folders.

So, which one is the official one?

The same goes for your situation in the B2B business space (I removed your URL above and replaced it with “keyword keyword.”)

This is why the search engines created canonical links.

Canonical links are a way to tell search engines when a page is a duplicate of another, and which page is the official one.

Let’s pretend you sell pink bunny slippers.

These bunny slippers have their own page, they’re on sale, they appear in footwear, and also in pink.


The first URL above is the “official version” of the URL.

That means it should have a canonical link pointing to itself.

The other three pages are duplicate versions of it. So, when you set up your canonical link, it should reference the official page.

In short, you’ll want to make sure all four pages have a rel=”canonical” link whose href is the official URL, as this will deduplicate them for search engines.
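As a minimal sketch of the setup described above, the Python below builds the same canonical tag for all four versions of a page. The example.com URLs, the folder names, and the `canonical_tag` helper are hypothetical, since the original URLs are not shown in this article.

```python
from html import escape

# Hypothetical URLs for illustration only; the first is treated as the official version.
OFFICIAL_URL = "https://example.com/pink-bunny-slippers"
DUPLICATE_URLS = [
    "https://example.com/sale/pink-bunny-slippers",
    "https://example.com/footwear/pink-bunny-slippers",
    "https://example.com/pink/pink-bunny-slippers",
]


def canonical_tag(official_url: str) -> str:
    """Return the <link> element that belongs in the <head> of every version."""
    return f'<link rel="canonical" href="{escape(official_url, quote=True)}">'


# The official page references itself; each duplicate references the official page.
tags_by_page = {
    url: canonical_tag(OFFICIAL_URL) for url in [OFFICIAL_URL, *DUPLICATE_URLS]
}

for page, tag in tags_by_page.items():
    print(page, "->", tag)
```

Every page ends up with an identical tag, which is the point: search engines see one official URL no matter which version they land on.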

Next, you’ll want to make sure that you remove all duplicate versions from your sitemap.

A sitemap is supposed to feature the most important and indexable pages on your website.

You do not want to include non-official versions of a page, pages disallowed by robots.txt, or non-canonicalized URLs in your sitemaps.

Search engines do not crawl your entire website each time, and if you send them to unimportant pages, you’re wasting your chance at proper crawling and indexing.
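One way to check a manually built sitemap against these rules is a small filter like the sketch below. It assumes you already have a list of candidate URLs and the canonical URL each page declares. The `CANDIDATES` data, the robots.txt rules, and the `sitemap_urls` helper are all hypothetical. The robots.txt check uses Python’s standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules and page inventory, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

# Maps each candidate URL to the canonical URL declared on that page.
CANDIDATES = {
    "https://example.com/pink-bunny-slippers": "https://example.com/pink-bunny-slippers",
    "https://example.com/sale/pink-bunny-slippers": "https://example.com/pink-bunny-slippers",
    "https://example.com/search?q=pink+bunny+slippers": "https://example.com/search?q=pink+bunny+slippers",
}


def sitemap_urls(candidates: dict[str, str], robots_txt: str) -> list[str]:
    """Keep only self-canonical URLs that robots.txt allows crawlers to fetch."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    return [
        url
        for url, canonical in candidates.items()
        if url == canonical and robots.can_fetch("*", url)
    ]


print(sitemap_urls(CANDIDATES, ROBOTS_TXT))
```

Here the sale URL is dropped because it canonicalizes elsewhere, and the search URL is dropped because robots.txt disallows it, leaving only the official page.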

There is another situation that can happen here.

If you have site search enabled, it can also generate URLs that are duplicates.

If I type “pink bunny slippers” into your site’s search box, I’m likely going to get a URL with the same keyword phrase in the URL, and also with parameters on it.

This would add to your problem, and your IT team will need to programmatically set the canonical links on the search results along with a meta robots for noindex, follow.
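A rough sketch of what that programmatic step might emit for a search results page is shown below. The `search_results_head_tags` helper is hypothetical, and which URL the canonical should point to is left as an input for your team to decide.

```python
from html import escape


def search_results_head_tags(canonical_url: str) -> list[str]:
    """Build the <head> tags for an internal search results page.

    canonical_url is whatever URL your team chooses as the canonical target.
    """
    return [
        f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">',
        # noindex keeps the results page out of the index; follow lets links be crawled.
        '<meta name="robots" content="noindex, follow">',
    ]


# Hypothetical canonical target, for illustration only.
for tag in search_results_head_tags("https://example.com/search"):
    print(tag)
```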

One other thing to look for is: If I click to the pink bunny slippers page from the search result, these parameters may stick.

If they do, take the same steps mentioned above.

Using proper canonical links and ensuring your sitemap doesn’t have non-official pages will help solve the duplicate page problem and help ensure you don’t waste a spider’s visit by having it crawl the wrong pages on your site.
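To spot sticky parameters, it can help to compute what the clean URL should be and compare it with the one being served. The sketch below strips a set of search-related parameters from a URL. The parameter names in `SEARCH_PARAMS` and the `strip_search_params` helper are hypothetical, so substitute whatever your site search actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names a site search might append; adjust to your own site.
SEARCH_PARAMS = {"q", "search_id", "ref"}


def strip_search_params(url: str) -> str:
    """Return the URL without search-related parameters or fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SEARCH_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


landing_url = "https://example.com/pink-bunny-slippers?q=pink+bunny+slippers&ref=search"
clean_url = strip_search_params(landing_url)

# If the two differ, the parameters stuck and the page needs a canonical to clean_url.
print(clean_url, landing_url != clean_url)
```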

I hope this helps!


Featured Image: Leremy/Shutterstock

Editor’s note: Ask an SEO is a weekly SEO advice column written by some of the industry’s top SEO experts, who have been hand-picked by Search Engine Journal. Got a question about SEO? Fill out our form. You might see your answer in the next #AskanSEO post!