How to Search Through the Source Code of the Entire Website

Ahrefs Site Audit, also available as part of the free Ahrefs Webmaster Tools, allows you to search through the raw HTML code or the JS-rendered code across all crawled pages of the website.

This feature is particularly useful when you need to verify analytics tags, identify pages that call certain scripts or stylesheets, detect unwanted injections into the page code, or research competitors’ technologies.

It is important to understand that in the era of JavaScript-powered websites, the page code can be in two forms:

Raw (Source): the HTML code before any JavaScript on the page has been executed. This is what you see using the “View Page Source” feature in the browser.

Rendered: the final HTML code after being altered/generated by JavaScript. It is available in the “Inspect” mode in the browser.

The source and rendered versions can be significantly different, so it’s important to ensure you’re searching through the right version of the page code.
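To make the difference concrete, here is a minimal Python sketch. Both HTML strings, the `loader.js` script, and the `analytics.example.com` tag are made-up examples, not output from any real page or tool. It only illustrates why the same search can miss in the raw code and hit in the rendered code.

```python
# Hypothetical raw HTML: the analytics tag is not in the markup yet;
# a loader script adds it once the page runs in a browser.
RAW_HTML = """<html><head>
<script src="/loader.js"></script>
</head><body><div id="app"></div></body></html>"""

# Hypothetical rendered HTML: the same page after the loader has executed.
RENDERED_HTML = """<html><head>
<script src="/loader.js"></script>
<script src="https://analytics.example.com/tag.js"></script>
</head><body><div id="app"><p>Pricing</p></div></body></html>"""


def contains(html: str, snippet: str) -> bool:
    """Plain substring check against one version of the page code."""
    return snippet in html


SNIPPET = "analytics.example.com/tag.js"
in_raw = contains(RAW_HTML, SNIPPET)
in_rendered = contains(RENDERED_HTML, SNIPPET)

print(f"found in raw HTML:      {in_raw}")
print(f"found in rendered HTML: {in_rendered}")
```

In this made-up case the tag only exists after JavaScript runs, so a search through the raw code would wrongly suggest the tag is missing.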

How to search through the rendered code of the pages

If you need to search through the JS-rendered HTML code of all the pages on the website, run a crawl in Site Audit or Ahrefs Webmaster Tools. Ensure that the “Execute JavaScript” option is activated in the crawl settings.

Execute JavaScript setting

Once the crawl is complete, go to the Page Explorer and access the Advanced filter. Select ‘Page source’ followed by ‘Contains’ from the dropdown menu. Then, enter the specific piece of code you are searching for.

Advanced filter

The example above finds all pages on our blog that contain an embedded table.
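Conceptually, a "contains" filter over page source behaves like a substring search across every crawled page. The sketch below is only an illustration of that idea, not how Site Audit is implemented; the `pages_containing` helper, the URLs, and the HTML are hypothetical, and details such as case sensitivity in the real filter are not covered here.

```python
def pages_containing(pages: dict[str, str], snippet: str) -> list[str]:
    """Return the URLs whose stored page code includes the snippet."""
    return sorted(url for url, html in pages.items() if snippet in html)


# Hypothetical crawl result: URL -> page code saved by the crawler.
crawled = {
    "https://example.com/blog/a": "<article><table><tr><td>1</td></tr></table></article>",
    "https://example.com/blog/b": "<article><p>No tables here.</p></article>",
    "https://example.com/blog/c": '<article><table class="data"></table></article>',
}

# Searching for the opening of the tag matches tables with or without attributes.
matches = pages_containing(crawled, "<table")
for url in matches:
    print(url)
```

Searching for `<table` rather than `<table>` is a deliberate choice in this sketch: it also matches tags that carry attributes, while still skipping pages that merely mention the word "tables" in their text.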

How to search through the raw HTML of the pages

Searching through the raw HTML (also called source HTML) requires a few extra actions:

1. Disable JavaScript rendering in the crawl settings.

Execute JavaScript setting - off

2. Ensure discoverability of all pages by the crawler.

This is important for websites where page content (including the internal links) is generated via JavaScript, as AhrefsSiteAudit bot may not automatically discover all pages via raw HTML code.

That’s why you need to supply the Site Audit tool with a list of input URLs that we call “Seeds.”
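The discoverability problem can be illustrated with a small Python sketch that collects links from raw HTML only, without running any JavaScript. The `LinkCollector` class and the sample page are hypothetical and are not a description of how the AhrefsSiteAudit bot works; they only show that a link created by a script never appears as an `<a>` tag in the raw code.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags found in the markup itself."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_in_raw_html(html: str) -> list[str]:
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page: one link is in the markup, the other is built by a script.
RAW_PAGE = """<body>
<a href="/about">About</a>
<div id="nav"></div>
<script>
  var link = document.createElement("a");
  link.href = "/pricing";
  document.getElementById("nav").appendChild(link);
</script>
</body>"""

found = links_in_raw_html(RAW_PAGE)
print(found)
```

Only `/about` is found here. The `/pricing` page would stay undiscovered by a raw-HTML crawl unless it is provided through another source.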

The easiest way to do that is to make sure that the Sitemaps are used in the “URL Sources.” If that’s not feasible, use the Custom URL list.

URL Sources
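If you go the Custom URL list route and already have a sitemap file at hand, the list can be extracted with a few lines of Python. This is a sketch under assumptions: the sample sitemap and the `urls_from_sitemap` helper are hypothetical, it handles a plain `urlset` file rather than a sitemap index, and the one-URL-per-line output is just a convenient plain-text form, not a format confirmed for the tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> value listed under <url> in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://example.com/blog/post-1</loc></url>
</urlset>"""

seed_urls = urls_from_sitemap(SAMPLE_SITEMAP)
print("\n".join(seed_urls))
```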

When the crawl is finished, use the advanced filter to search through the source code of all crawled pages.