How To Use Python For IndexNow API Bulk Indexing & Automation via @sejournal, @KorayGubur

IndexNow is a protocol developed by Microsoft Bing and adopted by Yandex that enables webmasters and SEO pros to easily notify search engines via an API when a webpage has been updated.

And today, Microsoft announced that it is making the protocol easier to implement by ensuring that submitted URLs are shared between search engines.

Given its positive implications and the promise of a faster indexing experience for publishers, the IndexNow API should be on every SEO professional’s radar.

Using Python for automating URL submission to the IndexNow API or making an API request to the IndexNow API for bulk URL indexing can make managing IndexNow more efficient for you.

In this tutorial, you’ll learn how to do just that, with step-by-step instructions for using the IndexNow API to submit URLs to Microsoft Bing in bulk with Python.

Note: The IndexNow API is similar to Google’s Indexing API with only one difference: the Google Indexing API is only for job postings or broadcasting web pages that contain a video object.

Google announced that it will test the IndexNow API but hasn’t updated us since.

Bulk Indexing Using IndexNow API with Python: Getting Started

Below are the requirements to understand and implement the IndexNow API tutorial.

Below are the Python packages and libraries that will be used for the Python IndexNow API tutorial.

  • Advertools (must).
  • Pandas (must).
  • Requests (must).
  • Time (optional).
  • JSON (optional).

Before getting started, reading the basics can help you to understand this IndexNow API and Python tutorial better. We will be using an API Key and a .txt file to provide authentication along with specific HTTP Headers.

Image: IndexNow API usage steps with Python.

1. Import The Python Libraries

To use the necessary Python libraries, we will use the “import” command.

  • Advertools will be used for sitemap URL extraction.
  • Requests will be used for making the GET and POST requests.
  • Pandas will be used for taking the URLs in the sitemap into a list object.
  • The “time” module is to prevent a “Too many requests” error with the “sleep()” method.
  • JSON is for possibly modifying the POST JSON object if needed.

Below, you will find all of the necessary import lines for the IndexNow API tutorial.

import advertools as adv
import pandas as pd
import requests
import json
import time

2. Extracting The Sitemap URLs With Python

To extract the URLs from a sitemap file, different web scraping methods and libraries can be used such as Requests or Scrapy.

But to keep things simple and efficient, I will use my favorite Python SEO package, Advertools.

With only a single line of code, all of the URLs within a sitemap can be extracted.

sitemap_urls = adv.sitemap_to_df("https://www.example.com/sitemap_index.xml")

The “sitemap_to_df” method of Advertools can extract all the URLs and other sitemap-related tags such as “lastmod” or “priority.”

Below, you can see the output of the “adv.sitemap_to_df” command.

Image: Sitemap URL extraction can be done via Advertools’ “sitemap_to_df” method.

All of the URLs and dates are stored within the “sitemap_urls” variable.

Since sitemaps are useful sources for search engines and SEOs, Advertools’ sitemap_to_df method can be used for many different tasks including a Sitemap Python Audit.

But that’s a topic for another time.
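Because the extracted sitemap data carries dates as well as URLs, one possible use is keeping only recently modified URLs before submitting anything. The sketch below is an illustration rather than part of the original workflow: the function name `recently_modified`, the seven-day window, and the sample rows are all assumptions, and it works on plain dictionaries with “loc” and “lastmod” keys so that it runs without a live sitemap.

```python
from datetime import datetime, timedelta, timezone


def recently_modified(records, now, max_age_days=7):
    """Return the "loc" of rows whose "lastmod" falls within max_age_days of now."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        row["loc"]
        for row in records
        if row.get("lastmod") is not None and row["lastmod"] >= cutoff
    ]


# Hypothetical rows; with pandas they could come from sitemap_urls.to_dict("records").
now = datetime(2022, 1, 10, tzinfo=timezone.utc)
records = [
    {"loc": "https://www.example.com/new-post/",
     "lastmod": datetime(2022, 1, 9, tzinfo=timezone.utc)},
    {"loc": "https://www.example.com/old-post/",
     "lastmod": datetime(2021, 6, 1, tzinfo=timezone.utc)},
    {"loc": "https://www.example.com/no-date/", "lastmod": None},
]

fresh_urls = recently_modified(records, now)
print(fresh_urls)
```

Rows without a date are skipped here; whether to submit them anyway is a judgment call for each site.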

3. Take The URLs Into A List Object With “to_list()”

Python’s Pandas library has a method for taking a data frame column (data series) into a list object, to_list().

Below is an example use:

sitemap_urls["loc"].to_list()

Below, you can see the result:

Image: Pandas’ “to_list” method can be used with Advertools for listing the URLs.

All URLs within the sitemap are in a Python list object.
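Once the URLs are in a list, it can be worth tidying the list before submission. The following is a minimal sketch, assuming you only want http(s) URLs on a single hostname and no duplicates; the helper name `clean_url_list` and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse


def clean_url_list(urls, host):
    """Keep http(s) URLs whose hostname equals host, without duplicates, in order."""
    seen = set()
    cleaned = []
    for raw in urls:
        url = raw.strip()
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or parts.netloc != host:
            continue  # skip other hosts and non-web schemes
        if url in seen:
            continue  # skip repeats
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Hypothetical input, standing in for sitemap_urls["loc"].to_list().
sample = [
    "https://www.example.com/a/",
    "https://www.example.com/a/",
    "https://blog.example.com/b/",
    " https://www.example.com/c/ ",
    "mailto:someone@example.com",
]
cleaned = clean_url_list(sample, "www.example.com")
print(cleaned)
```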

4. Understand The URL Syntax Of IndexNow API Of Microsoft Bing

Let’s take a look at the URL syntax of the IndexNow API.

Here’s an example:

https://<searchengine>/indexnow?url=url-changed&key=your-key

The URL syntax represents the variables and their relations to each other within the RFC 3986 standards.

  • The <searchengine> represents the search engine name that you will use the IndexNow API for.
  • “?url=” parameter is to determine the URL that will be submitted to the search engine via IndexNow API.
  • “&key=” is the API Key that will be used within the IndexNow API.
  • “&keyLocation=” is an optional parameter, not shown in the example above, that points to the key file and proves that you are the owner of the website that the IndexNow API will be used for.

The “&keyLocation” will bring us to the API Key and its “.txt” version.
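The parameters described above can be assembled with the standard library, which also takes care of percent-encoding the submitted URL. This is a sketch under assumptions: the helper name `build_indexnow_url` is hypothetical, and the key is the example value used elsewhere in this tutorial.

```python
from urllib.parse import urlencode


def build_indexnow_url(search_engine, url, key, key_location=None):
    """Build an IndexNow GET URL; keyLocation is only added when provided."""
    params = {"url": url, "key": key}
    if key_location is not None:
        params["keyLocation"] = key_location
    # urlencode percent-encodes reserved characters such as ":", "/", "?" and "&".
    return f"https://{search_engine}/indexnow?{urlencode(params)}"


endpoint = build_indexnow_url(
    "www.bing.com",
    "https://www.example.com/page?id=1&lang=en",
    "22bc7c564b334f38b0b1ed90eec8f2c5",
)
print(endpoint)
```

Encoding matters most when the submitted URL has its own query string, since an unescaped “&” would otherwise be read as the start of a new IndexNow parameter.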

5. Gather The API Key For IndexNow And Upload It To The Root

You’ll need a valid key to use the IndexNow API.

Use this link to generate the Microsoft Bing IndexNow API Key.

Image: There is no limit for generating the IndexNow API Key.

Clicking the “Generate” button creates an IndexNow API Key.

When you click on the download button, it will download the “.txt” version of the IndexNow API Key.

Image: An IndexNow API Key can be generated at Microsoft Bing’s stated address.

Image: Downloaded IndexNow API Key as a txt file.

The API key is both the name of the TXT file and the text within the file.

Image: The content of the IndexNow API Key TXT file should be the same as the name of the file, which is the actual API Key value.

The next step is uploading this TXT file to the root of the website’s server.

Since I use FileZilla for my FTP, I have uploaded it easily to my web server’s root.

Image: By putting the .txt file into the web server’s root folder, the IndexNow API setup can be completed.
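Before uploading, a quick local check can confirm that the file name and its content agree with the key. The sketch below uses hypothetical helper names. The key format rule (8 to 128 characters drawn from letters, digits, and dashes) is an assumption based on my reading of the public IndexNow documentation and should be verified there.

```python
import re

# Assumed key format: 8-128 characters from a-z, A-Z, 0-9 and "-".
KEY_PATTERN = re.compile(r"[A-Za-z0-9-]{8,128}")


def is_plausible_key(key):
    """Check the key against the assumed IndexNow key format."""
    return KEY_PATTERN.fullmatch(key) is not None


def key_file_url(site_root, key):
    """URL where the key file is expected once uploaded to the site root."""
    return f"{site_root.rstrip('/')}/{key}.txt"


def key_file_is_consistent(file_name, file_content, key):
    """The file must be named <key>.txt and contain exactly the key."""
    return file_name == f"{key}.txt" and file_content.strip() == key


key = "22bc7c564b334f38b0b1ed90eec8f2c5"
print(is_plausible_key(key))
print(key_file_url("https://www.example.com/", key))
print(key_file_is_consistent(f"{key}.txt", key + "\n", key))
```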

The next step is performing a simple for loop example for submitting all of the URLs within the sitemap.

6. Submit The URLs Within The Sitemap With Python To IndexNow API

To submit a single URL to IndexNow, you can use a single “requests.get()” instance. But to make it more useful, we will use a for loop.

To submit URLs in bulk to the IndexNow API with Python, follow the steps below:

  1. Create a key variable with the IndexNow API Key value.
  2. Replace the <searchengine> section with the search engine that you want to submit URLs to (Microsoft Bing, or Yandex, for now).
  3. Assign all of the URLs from the sitemap within a list to a variable.
  4. Use the “txt” file within the root of the web server with its URL value.
  5. Place the URL, key, and key location URL within the formatted string.
  6. Start your for loop, and use “requests.get()” for all of the URLs within the sitemap.

Below, you can see the implementation:

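key = "22bc7c564b334f38b0b1ed90eec8f2c5"
# URL of the key file uploaded to the web server's root
location = "https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt"
url = sitemap_urls["loc"].to_list()

for i in url:
    endpoint = f"https://bing.com/indexnow?url={i}&key={key}&keyLocation={location}"
    response = requests.get(endpoint)
    print(i)
    print(endpoint)
    print(response.status_code, response.content)
    # time.sleep(5)  # uncomment to wait five seconds between requests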

If you’re worried about sending too many requests to the IndexNow API, you can use the Python time module to make the script wait between each request.
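A rough sketch of that waiting pattern is shown below. To keep it runnable without network access, the function that sends a request is passed in as a parameter; `submit_with_delay`, `send`, and the five-second default are illustrative assumptions, and in practice `send` could wrap a “requests.get()” call.

```python
import time


def submit_with_delay(urls, send, delay_seconds=5.0, sleep=time.sleep):
    """Call send(url) for each URL, pausing between consecutive requests."""
    results = []
    for position, url in enumerate(urls):
        if position > 0:
            sleep(delay_seconds)  # no pause is needed before the first request
        results.append((url, send(url)))
    return results


# Stand-ins for demonstration: a sender that always reports 200 and a
# "sleep" that only records how long it was asked to wait.
waits = []
statuses = submit_with_delay(
    ["https://www.example.com/a/", "https://www.example.com/b/", "https://www.example.com/c/"],
    send=lambda url: 200,
    delay_seconds=5.0,
    sleep=waits.append,
)
print(statuses)
print(waits)
```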

Here you can see the output of the script:

Image: The empty string as the request’s response body represents the success of the IndexNow API request according to Microsoft Bing’s IndexNow documentation.

The 200 Status Code means that the request was successful.

With the for loop, I have submitted 194 URLs to Microsoft Bing.

According to the IndexNow Documentation, the HTTP 200 Response Code signals that the search engine is aware of the change in the content or the new content. But it doesn’t necessarily guarantee indexing.
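To make script output easier to read, the status code can be mapped to a short description. In the sketch below, only the meanings of 200 and 429 come from this tutorial. The remaining entries are assumptions recalled from the public IndexNow documentation and should be checked against it. The names `INDEXNOW_STATUS`, `describe_status`, and `should_slow_down` are hypothetical.

```python
INDEXNOW_STATUS = {
    200: "OK: the search engine is aware of the submitted URL",
    # The entries below are assumptions to verify against the IndexNow docs.
    202: "Accepted: URL received, key validation pending",
    400: "Bad request: invalid format",
    403: "Forbidden: key not valid or key file not found",
    422: "Unprocessable: URL does not belong to the host or key mismatch",
    429: "Too many requests",
}


def describe_status(code):
    """Return a readable description for an IndexNow response code."""
    return INDEXNOW_STATUS.get(code, f"Unexpected status code {code}")


def should_slow_down(code):
    """A 429 suggests pausing or reducing the submission rate."""
    return code == 429


print(describe_status(200))
print(describe_status(429), should_slow_down(429))
```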

For instance, I have used the same script for another website. After 120 seconds, Microsoft Bing says that 31 results are found. And conveniently, it shows four pages.

The only problem is that on the first page there are only two results, and it says that the URLs are blocked by Robots.txt even if the blocking was removed before submission.

This can happen if the robots.txt was changed to remove some URLs before using the IndexNow API because it seems that Bing does not check the Robots.txt again.

Thus, if you previously blocked them, they try to index your website but still use the previous version of the robots.txt file.

Image: What happens if you use the IndexNow API while blocking Bingbot via Robots.txt.

On the second page, there is only one result:

Image: Microsoft Bing might use a different indexation and pagination method than Google. The second page shows only one among the 31 results.

On the third page, there is no result, and it shows the Microsoft Bing Translate for translating the string within the search bar.

Image: Sometimes, Microsoft Bing interprets the “site” search operator as a part of the query.

When I checked Google Analytics, it showed that Bing still hadn’t crawled the website or indexed it. I know this is true as I also checked the log files.

Image: Google and Bing indexing processes.

Below, you will see the Bing Webmaster Tool’s report for the example website:

Image: Bing Webmaster Tools report.

It says that I submitted 38 URLs.

The next step will involve the bulk request with the POST Method and a JSON object.

7. Perform An HTTP Post Request To The IndexNow API

To perform an HTTP POST request to the IndexNow API for a set of URLs, a JSON object should be used with specific properties.

  • host represents the hostname of the website whose URLs are submitted, not the search engine.
  • key represents the API Key.
  • keyLocation represents the location of the API Key’s txt file within the web server.
  • urlList represents the URL set that will be submitted to the IndexNow API.
  • Headers represent the POST request headers that will be used, namely “Content-Type” with its “charset” parameter.

Since this is a POST request, “requests.post()” will be used instead of “requests.get().”

Below, you will find an example of a set of URLs submitted to Microsoft Bing’s IndexNow API.

data = {
    "host": "www.example.com",  # the submitting website, not the search engine
    "key": "22bc7c564b334f38b0b1ed90eec8f2c5",
    "keyLocation": "https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt",
    "urlList": [
        'https://www.example.com/technical-seo/http-header/',
        'https://www.example.com/python-seo/nltk/lemmatize',
        'https://www.example.com/pagespeed/broser-hints/preload',
        'https://www.example.com/python-seo/nltk/stemming',
        'https://www.example.com/python-seo/categorize-queries/',
        'https://www.example.com/python-seo/nltk/tokenization',
        'https://www.example.com/review/oncrawl/',
        'https://www.example.com/technical-seo/hreflang/',
        'https://www.example.com/technical-seo/multilingual-seo/'
    ]
}
# charset is a parameter of the Content-Type header, not a header of its own
headers = {"Content-Type": "application/json; charset=utf-8"}
# json= serializes the dictionary to a JSON body; data= would send form fields instead
r = requests.post("https://www.bing.com/indexnow", json=data, headers=headers)
r.status_code, r.content

In the example above, we have performed a POST request to index a set of URLs.

We have used the “data” object for the “json” parameter of “requests.post()” so that it is sent as a JSON body, and the headers object for the “headers” parameter.

Since we POST a JSON object, the request should have the “Content-Type: application/json; charset=utf-8” header.
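The request body can also be built and inspected without sending anything, which helps when debugging. This sketch assumes, per the public IndexNow documentation, that the “host” property is the hostname of the website being submitted; here it is derived from the key file URL. The helper name `build_payload` is hypothetical.

```python
import json
from urllib.parse import urlparse


def build_payload(key, key_location, urls):
    """Build the IndexNow POST body as a dictionary."""
    return {
        # Assumption: "host" is the submitting site, taken from the key file URL.
        "host": urlparse(key_location).netloc,
        "key": key,
        "keyLocation": key_location,
        "urlList": list(urls),
    }


payload = build_payload(
    "22bc7c564b334f38b0b1ed90eec8f2c5",
    "https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt",
    ["https://www.example.com/technical-seo/hreflang/"],
)
body = json.dumps(payload)  # the string that would travel as the request body
print(body)
```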

After I made the POST request, 135 seconds later, my live logfile analysis dashboard started to show the immediate hits from the Bingbot.

Image: Bingbot log file analysis.

8. Create Custom Function For IndexNow API To Save Time

Creating a custom function for the IndexNow API is useful to decrease the time that will be spent on code preparation.

Thus, I have created two different custom Python functions to use the IndexNow API for bulk requests and individual requests.

Below, you will find an example for only the bulk requests to the IndexNow API.

The custom function for bulk requests is called “submit_url_set.”

Even if you just fill in the parameters, you will still be able to use it properly.

def submit_url_set(set_: list, key, location, host="https://www.bing.com", headers=None):
    # Build the default headers here to avoid a mutable default argument.
    if headers is None:
        headers = {"Content-Type": "application/json; charset=utf-8"}
    data = {
        # The "host" property is the submitting website, taken from the key file URL.
        "host": location.split("/")[2],
        "key": key,
        "keyLocation": location,
        "urlList": set_,
    }
    # The "host" parameter is the search engine address; the API lives at /indexnow.
    r = requests.post(f"{host}/indexnow", json=data, headers=headers)
    return r.status_code

An explanation of this custom function:

  • The “set_” parameter is to provide a list of URLs.
  • “key” parameter is to provide an IndexNow API Key.
  • “location” parameter is to provide the location of the IndexNow API Key’s txt file within the web server.
  • “host” is to provide the search engine host address.
  • “headers” is to provide the headers that are necessary for the IndexNow API.

I have defined some of the parameters with default values such as “host” for Microsoft Bing. If you want to use it for Yandex, you will need to state it while calling the function.

Below is an example use:

submit_url_set(set_=sitemap_urls["loc"].to_list(), key="22bc7c564b334f38b0b1ed90eec8f2c5", location="https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt")

If you want to extract sitemap URLs with a different method, or if you want to use the IndexNow API for a different URL set, you will need to change the “set_” parameter value.

Below, you will see an example of the custom Python function for the IndexNow API for only individual requests.

def submit_url(url, location, key="22bc7c564b334f38b0b1ed90eec8f2c5"):
    # Accept either a single URL string or a list of URLs.
    urls = [url] if isinstance(url, str) else url
    for i in urls:
        endpoint = f"https://bing.com/indexnow?url={i}&key={key}&keyLocation={location}"
        response = requests.get(endpoint)
        print(i)
        print(endpoint)
        print(response.status_code, response.content)
        # time.sleep(5)  # uncomment to wait five seconds between requests

Since this is a for loop, you can submit more URLs one by one. The search engine can prioritize these types of requests differently.

Some of the bulk requests will include non-important URLs, so the individual requests might be seen as more reasonable.

If you want to include the sitemap URL extraction within the function, you should include Advertools directly in the functions themselves.

Tips For Using The IndexNow API With Python

An Overview of How The IndexNow API Works, Capabilities & Uses

  • The IndexNow API doesn’t guarantee that your website or the URLs that you submitted will be indexed.
  • You should only submit URLs that are new or for which the content has changed.
  • The IndexNow API impacts the crawl budget.
  • Microsoft Bing has a threshold for the URL Content Quality and Calculation of the Crawl Need for a URL. If the submitted URL is not good enough, they may not crawl it.
  • You can submit up to 10,000 URLs in a single POST request.
  • The IndexNow API suggests submitting URLs even if the website is small.
  • Submitting the same pages many times within a day can lead the search engine to stop crawling the redundant URLs or the source.
  • The IndexNow API is useful for sites where the content changes frequently, like every 10 minutes.
  • IndexNow API is useful for pages that are gone and are returning a 404 response code. It lets the search engine know that the URLs are gone.
  • IndexNow API can be used for notifying of new 301 or 302 redirects.
  • The 200 Status Response Code means that the search engine is aware of the submitted URL.
  • The 429 Status Code means that you made too many requests to the IndexNow API.
  • If you put a “txt” file that contains the IndexNow API Key into a subfolder, the IndexNow API can be used only for that subfolder.
  • If you have two different CMS, you can use two different IndexNow API Keys for two different site sections.
  • Subdomains need to use a different IndexNow API key.
  • Even if you already use a sitemap, using IndexNow API is useful because it efficiently tells the search engines of website changes and reduces unnecessary bot crawling.
  • All search engines that adopt the IndexNow API (Microsoft Bing and Yandex) share the submitted URLs with each other.

Image: IndexNow API documentation and usage tips can be found above.
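For sites with more URLs than the 10,000 limit mentioned in the tips, the list can be split into batches and each batch posted separately. A minimal sketch, with `chunk_urls` as a hypothetical helper name and generated placeholder URLs:

```python
def chunk_urls(urls, batch_size=10_000):
    """Split a list of URLs into consecutive batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [urls[start:start + batch_size] for start in range(0, len(urls), batch_size)]


# Placeholder URLs standing in for a large sitemap.
all_urls = [f"https://www.example.com/page-{number}/" for number in range(25_000)]
batches = chunk_urls(all_urls)
print([len(batch) for batch in batches])
```

Each batch could then be passed to a bulk submission function, ideally with a pause between calls.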

In this IndexNow API tutorial and guideline with Python, we have examined a new search engine technology.

Instead of waiting to be crawled, publishers can notify the search engines to crawl when there is a need.

IndexNow reduces the use of search engine data center resources, and now you know how to use Python to make the process more efficient, too.

More resources:

An Introduction To Python & Machine Learning For Technical SEO

How to Use Python to Monitor & Measure Website Performance

Advanced Technical SEO: A Complete Guide


Featured Image: metamorworks/Shutterstock