How SEO Experts Can Utilize ChatGPT For BigQuery With Examples via @sejournal, @vahandev


AI is reshaping every field by making skills that weren't widely available in the past, such as coding or data visualization, accessible to everyone.

An AI assistant that can run the right prompts can perform low- and medium-difficulty tasks, allowing more focus on strategic decision-making.

In this guide, we will walk you through, step by step, how to use AI chatbots, with ChatGPT as an example, to run complex BigQuery queries for your SEO reporting needs.

We will review two examples:

- How to analyze a traffic decline caused by a Google algorithm update.
- How to combine search traffic data with engagement metrics from GA4.

It will also give you an overall idea of how you can use chatbots to reduce the workload when running SEO reports.

Why Do You Need To Learn BigQuery?

SEO tools like Google Search Console or Google Analytics 4 have accessible user interfaces you can use to access data. But they often limit what you can do and show incomplete data, which is usually called data sampling.

In GSC, this happens because the tool omits anonymized queries and limits table exports to 1,000 rows.

Screenshot from Google Search Console, May 2024

By using BigQuery, you can solve that problem and run any complex report you want, eliminating the data sampling issue that occurs quite often when working with large websites.

(Alternatively, you may try using Looker Studio, but the purpose of this article is to illustrate how you can operate ChatGPT for BigQuery.)

For this article, we assume you have already connected your GSC and GA4 accounts to BigQuery. If you haven't done it yet, you may want to check our guides on how to do it.

SQL Basics

If you know Structured Query Language (SQL), you may skip this section. But for those who don't, here is a quick reference to SQL statements:

Statement Description
SELECT Retrieves data from tables
INSERT Inserts new data into a table
UNNEST Flattens an array into a set of rows
UPDATE Updates existing data within a table
DELETE Deletes data from a table
CREATE Creates a new table or database
ALTER Modifies an existing table
DROP Deletes a table or a database

Here are the conditions we will be using, so you can familiarize yourself with them:

Condition Description
WHERE Filters records for specific conditions
AND Combines two or more conditions where all conditions must be true
OR Combines two or more conditions where at least one condition must be true
NOT Negates a condition
LIKE Searches for a specified pattern in a column
IN Checks if a value is within a set of values
BETWEEN Selects values within a given range
IS NULL Checks for null values
IS NOT NULL Checks for non-null values
EXISTS Checks if a subquery returns any records
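If you want to experiment with several of these clauses before touching BigQuery, here is a minimal sketch using Python's built-in sqlite3 module. The table, columns, and URLs are made up for illustration and only loosely mimic a Search Console export:

```python
import sqlite3

# In-memory toy table loosely mimicking a Search Console export.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE searchdata (url TEXT, clicks INTEGER, data_date TEXT)")
conn.executemany(
    "INSERT INTO searchdata VALUES (?, ?, ?)",
    [
        ("https://example.com/blog/a", 120, "2024-05-10"),
        ("https://example.com/blog/b", 0, "2024-05-11"),
        ("https://example.com/about", 35, "2024-05-12"),
    ],
)

# WHERE, AND, BETWEEN, and LIKE in one query:
# blog pages with clicks inside a date range.
rows = conn.execute(
    """
    SELECT url, clicks
    FROM searchdata
    WHERE clicks > 0
      AND data_date BETWEEN '2024-05-10' AND '2024-05-11'
      AND url LIKE '%/blog/%'
    """
).fetchall()
print(rows)  # [('https://example.com/blog/a', 120)]
```

The same clauses work identically in BigQuery's SQL dialect, just against your exported tables.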

Now, let's dive into examples of how you can use BigQuery via ChatGPT.

1. How To Analyze Traffic Decline Because Of Google Algorithm Impact 

If you have been affected by a Google algorithm update, the first thing you should do is run reports on the affected pages and analyze why you have been impacted.

Remember, the worst thing you can do is start changing anything on the website right away in panic mode. This may cause fluctuations in search traffic and make analyzing the impact even harder.

If you have fewer pages in the index, you may find the GSC UI satisfactory for analyzing your data, but if you have tens of thousands of pages, it won't let you export more than 1,000 rows (either pages or queries) of data.

Say you have a week of data since the algorithm update finished rolling out and want to compare it with the previous week's data. To run that report in BigQuery, you may start with this simple prompt:

Imagine you are a data analyst experienced in Google Analytics 4 (GA4), Google Search Console, SQL, and BigQuery. Your task is to generate an SQL query to compare 'WEB' Search Console data for the periods '2024-05-08' to '2024-05-20' and '2024-04-18' to '2024-04-30'. Extract the total clicks, impressions, and average position for each URL for each period. Additionally, calculate the differences in these metrics between the periods for each URL (where average position should be calculated as the sum of positions divided by the sum of impressions). Details: BigQuery project name: use_your_bigquery_projectname Dataset name: searchconsole Table name: searchdata_url_impression Please provide the SQL query that meets these requirements.

Once you get the SQL code, copy and paste it into the BigQuery SQL editor, but I bet the first code you get will have errors. For example, table column names may not match what is in your BigQuery dataset.

Error in BigQuery SQL when a column name doesn't match the dataset column.

Things like this happen quite often when performing coding tasks via ChatGPT. Now, let's dive into how you can quickly fix issues like this.

Simply click on your dataset in the left-hand panel, select all columns on the right side, and click Copy as Table.

How to select all columns of a table in BigQuery.

Once you have it, just copy and paste it as a follow-up prompt and hit enter.

Generate SQL again, taking into account that my table schema is as follows: "fullname mode benignant description data_date DATE site_url STRING url STRING query STRING is_anonymized_query BOOLEAN is_anonymized_discover BOOLEAN country STRING search_type STRING device STRING is_amp_top_stories BOOLEAN is_amp_blue_link BOOLEAN is_job_listing BOOLEAN is_job_details BOOLEAN is_tpf_qa BOOLEAN is_tpf_faq BOOLEAN is_tpf_howto BOOLEAN is_weblite BOOLEAN is_action BOOLEAN is_events_listing BOOLEAN is_events_details BOOLEAN is_search_appearance_android_app BOOLEAN is_amp_story BOOLEAN is_amp_image_result BOOLEAN is_video BOOLEAN is_organic_shopping BOOLEAN is_review_snippet BOOLEAN is_special_announcement BOOLEAN is_recipe_feature BOOLEAN is_recipe_rich_snippet BOOLEAN is_subscribed_content BOOLEAN is_page_experience BOOLEAN is_practice_problems BOOLEAN is_math_solvers BOOLEAN is_translated_result BOOLEAN is_edu_q_and_a BOOLEAN impressions INTEGER clicks INTEGER sum_position INTEGER is_product_snippets BOOLEAN is_merchant_listings BOOLEAN is_learning_videos BOOLEAN "

This will fix the generated SQL according to the GSC table structure, as follows:

WITH period1 AS (
  SELECT
    url,
    SUM(clicks) AS clicks,
    SUM(impressions) AS impressions,
    IFNULL(SUM(sum_position) / NULLIF(SUM(impressions), 0), 0) AS avg_position
  FROM `use_your_bigquery_projectname.searchconsole.searchdata_url_impression`
  WHERE data_date BETWEEN '2024-05-08' AND '2024-05-20'
    AND search_type = 'WEB'
  GROUP BY url
),
period2 AS (
  SELECT
    url,
    SUM(clicks) AS clicks,
    SUM(impressions) AS impressions,
    IFNULL(SUM(sum_position) / NULLIF(SUM(impressions), 0), 0) AS avg_position
  FROM `use_your_bigquery_projectname.searchconsole.searchdata_url_impression`
  WHERE data_date BETWEEN '2024-04-18' AND '2024-04-30'
    AND search_type = 'WEB'
  GROUP BY url
),
comparison AS (
  SELECT
    COALESCE(p1.url, p2.url) AS url,
    IFNULL(p1.clicks, 0) AS clicks_period1,
    IFNULL(p2.clicks, 0) AS clicks_period2,
    IFNULL(p1.impressions, 0) AS impressions_period1,
    IFNULL(p2.impressions, 0) AS impressions_period2,
    IFNULL(p1.avg_position, 0) AS avg_position_period1,
    IFNULL(p2.avg_position, 0) AS avg_position_period2,
    IFNULL(p1.clicks, 0) - IFNULL(p2.clicks, 0) AS clicks_diff,
    IFNULL(p1.impressions, 0) - IFNULL(p2.impressions, 0) AS impressions_diff,
    IFNULL(p1.avg_position, 0) - IFNULL(p2.avg_position, 0) AS avg_position_diff
  FROM period1 p1
  FULL OUTER JOIN period2 p2 ON p1.url = p2.url
)
SELECT
  url,
  clicks_period1,
  clicks_period2,
  clicks_diff,
  impressions_period1,
  impressions_period2,
  impressions_diff,
  avg_position_period1,
  avg_position_period2,
  avg_position_diff
FROM comparison
ORDER BY url;

Now, run it and enjoy the data, either by exporting it to CSV or Google Sheets.

How to run SQL in BigQuery.

In case you have millions of URLs, you may not be able to work with a Google Sheets or CSV export because the data is too big. Plus, those apps have limits on how many rows you can have in a single document. In that case, you can save the results as a BigQuery table and connect Looker Studio to it to view the data.
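In BigQuery, saving results as a table is a one-line change: wrap the query in CREATE OR REPLACE TABLE ... AS SELECT. The sketch below demonstrates that pattern with sqlite3 on a toy table; the table names and numbers are illustrative, not from a real export:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comparison (url TEXT, clicks_diff INTEGER)")
conn.executemany(
    "INSERT INTO comparison VALUES (?, ?)",
    [("https://example.com/a", -50), ("https://example.com/b", 10)],
)

# Materialize query results as a new table instead of exporting them.
# In BigQuery the equivalent would be:
#   CREATE OR REPLACE TABLE `project.dataset.losers` AS SELECT ...
conn.execute(
    """
    CREATE TABLE losers AS
    SELECT url, clicks_diff
    FROM comparison
    WHERE clicks_diff < 0
    """
)
losers_rows = conn.execute("SELECT * FROM losers").fetchall()
print(losers_rows)  # [('https://example.com/a', -50)]
```

Looker Studio can then read the saved table directly instead of re-running the heavy comparison query.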

But please remember that BigQuery is a freemium service. It is free up to 1 TB of processed query data a month. Once you exceed that limit, your credit card will be automatically charged based on your usage.

That means if you connect your BigQuery to Looker Studio and browse your data there, it will count against your billing every time you open your Looker dashboard.
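To get a rough sense of what that billing can look like, here is a back-of-the-envelope sketch. The 1 TB monthly free tier is stated above; the per-TB price is an assumption for illustration, so verify it against Google's current pricing page:

```python
# Back-of-the-envelope estimate of on-demand query costs.
FREE_TB_PER_MONTH = 1.0
ASSUMED_PRICE_PER_TB_USD = 6.25  # ASSUMPTION: check Google's current pricing

def estimated_monthly_cost(tb_processed: float) -> float:
    """USD cost for on-demand queries after the monthly free tier."""
    billable_tb = max(0.0, tb_processed - FREE_TB_PER_MONTH)
    return billable_tb * ASSUMED_PRICE_PER_TB_USD

print(estimated_monthly_cost(0.4))  # 0.0  (still inside the free tier)
print(estimated_monthly_cost(3.0))  # 12.5 (2 TB billable)
```

Note that a dashboard which re-runs its queries on every open multiplies tb_processed quickly, which is the point being made here.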

That is why, when exports have a few tens of thousands or hundreds of thousands of rows, I prefer using Google Sheets. I can easily connect it to Looker Studio for data visualization and blending, and that will not count against my billing.

If you have ChatGPT Plus, you can simply use this custom GPT I've made, which takes into account the table schemas for GA4 and Search Console. In the guide above, I assumed you were using the free version, and it illustrated how you can use ChatGPT in general for running BigQuery queries.

In case you want to know what is in that custom GPT, here is a screenshot of the backend.

Custom GPT with BigQuery table schemas.

Nothing complicated: you just need to copy the tables from BigQuery as JSON in the step explained above and upload them into the custom GPT so it can refer to the table structure. Additionally, there is a prompt that asks the GPT to refer to the attached JSON files when composing queries.

This is another example of how you can use ChatGPT to perform tasks more effectively, eliminating repetitive work.

If you need to work with another dataset (different from GA4 or GSC) and you don't know SQL, you can upload the table schema from BigQuery into ChatGPT and compose SQL queries specific to that table structure. Easy, isn't it?

As homework, I suggest you analyze which queries have been affected by AI Overviews.

There is no differentiator in the Google Search Console table for that, but you can run a query to see which pages didn't lose rankings yet had a significant CTR drop after May 14, 2024, when Google introduced AI Overviews.

You can compare the two-week period after May 14th with the two weeks prior. There is still a possibility that the CTR drop happened because of other search features, like a competitor gaining a Featured Snippet, but you should find enough valid cases where your clicks were affected by AI Overviews (formerly Search Generative Experience, or "SGE").
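The filtering logic for that homework can be sketched ahead of time. The toy example below, again using sqlite3, flags URLs whose position held steady (within one position) while CTR fell by more than 30%; the table name, thresholds, and numbers are illustrative assumptions, not GSC data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE url_stats (url TEXT, period TEXT, clicks INTEGER, "
    "impressions INTEGER, avg_position REAL)"
)
conn.executemany(
    "INSERT INTO url_stats VALUES (?, ?, ?, ?, ?)",
    [
        # Stable position, CTR halves: candidate for AI Overviews impact.
        ("/guide", "before", 200, 2000, 3.1),
        ("/guide", "after", 95, 2000, 3.0),
        # Position dropped too: likely an ordinary ranking loss.
        ("/news", "before", 100, 1000, 2.0),
        ("/news", "after", 40, 1000, 7.5),
    ],
)

# Self-join the before/after aggregates and keep URLs whose ranking held
# steady while CTR fell below 70% of its previous value.
rows = conn.execute(
    """
    SELECT b.url
    FROM url_stats b
    JOIN url_stats a ON a.url = b.url AND a.period = 'after'
    WHERE b.period = 'before'
      AND ABS(a.avg_position - b.avg_position) <= 1.0
      AND 1.0 * a.clicks / a.impressions
          < 0.7 * (1.0 * b.clicks / b.impressions)
    """
).fetchall()
print(rows)  # [('/guide',)]
```

In BigQuery you would build the two period aggregates with CTEs over searchdata_url_impression, exactly as in the comparison query earlier.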

2. How To Combine Search Traffic Data With Engagement Metrics From GA4 

When analyzing search traffic, it is critical to understand how much users engage with content, because user engagement signals are ranking factors. Please note that I don't mean the exact metrics defined in GA4.

However, GA4's engagement metrics, such as "average engagement time per session" (the average time your website was in focus in a user's browser), may hint at whether your articles are good enough for users to read.

If it is too low, it means your blog pages may have an issue, and users don't read them.

If you combine that metric with Search Console data, you may find that pages with low rankings also have a low average engagement time per session.

Please note that GA4 and GSC have different attribution models. GA4 uses data-driven or last-click attribution models, which means that if a user visits an article page from Google once and then comes back directly two more times, GA4 may attribute all three visits to Google, whereas GSC will report only one.

So, it is not 100% accurate and may not be suitable for corporate reporting, but having engagement metrics from GA4 alongside GSC data provides valuable information for analyzing your rankings' correlations with engagement.

Using ChatGPT with BigQuery requires a little preparation. Before we jump into the prompt, I suggest you read up on how GA4 tables are structured, as they are not as simple as GSC's tables.

GA4 has an event_params column, which has a record type and contains dimensions like page_location, ga_session_id, and engagement_time_msec, which tracks how long a user actively engages with your website.

The event_params key engagement_time_msec is not the total time on the site but the time spent on specific interactions (like clicking or scrolling), where each interaction adds a new piece of engagement time. It is like adding up all the little moments when users are actively using your website or app.

Therefore, if we sum that metric and average it across sessions for each page, we get the average engagement time per session.
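As a quick sanity check of that calculation, here is the same aggregation in plain Python over a handful of made-up event rows (the session IDs, pages, and durations are invented for illustration):

```python
from collections import defaultdict

# Toy rows shaped like unnested GA4 event_params:
# (ga_session_id, page_location, engagement_time_msec)
events = [
    (1, "/blog/a", 5000),
    (1, "/blog/a", 7000),   # same session: engagement accumulates
    (2, "/blog/a", 12000),
    (3, "/blog/b", 3000),
]

def avg_engagement_seconds_per_session(rows):
    """Sum engagement per page, divide by the page's distinct session count."""
    total_msec = defaultdict(int)
    sessions = defaultdict(set)
    for session_id, page, msec in rows:
        total_msec[page] += msec
        sessions[page].add(session_id)
    return {
        page: total_msec[page] / 1000 / len(sessions[page])
        for page in total_msec
    }

print(avg_engagement_seconds_per_session(events))
# {'/blog/a': 12.0, '/blog/b': 3.0}
```

For /blog/a: (5000 + 7000 + 12000) msec = 24 seconds over two distinct sessions, hence 12 seconds per session, which is exactly what the SUM-divided-by-COUNT(DISTINCT) in the SQL further down computes.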

Now that you understand engagement_time_msec, let's ask ChatGPT to help us construct a query that pulls GA4's "average engagement time per session" for each URL and combines it with GSC search performance data for articles.

The prompt I would use is:

Imagine you are a data analyst experienced in Google Analytics 4 (GA4), Google Search Console, SQL, and BigQuery. Compose a SQL query that pulls the following data from Google Search Console for each URL for the previous 7 days, excluding the current day: 1. Clicks, 2. Impressions, 3. Average position (calculated as the sum of positions divided by the sum of impressions). From the GA4 BigQuery table, unnest ga_session_id, engagement_time_msec, and page_location from event_params. Select only rows which have engagement_time_msec set as not null, group all sessions with the same IDs and page_location, and SUM engagement_time_msec and divide by the SUM of the number of sessions. Join the GA4 and Google Search Console data by URLs for the same period. Additionally, optimize the query to pull from GA4's table partitions and not query the entire table. Details: 1. BigQuery project name: use_your_bigquery_projectname 2. GA4 dataset name: use_your_ga4_bigquery_dataset_name 3. GA4 table name: use_your_ga4_bigquery_table_name_under_dataset 4. Google Search Console dataset name: use_your_gsc_bigquery_table_name_under_dataset 5. Google Search Console table name: use_your_gsc_bigquery_table_name_under_dataset 6. Here are the BigQuery table schemas for GA4: [copy table schema here] 7. Here are the BigQuery table schemas for Google Search Console: [copy table schema here]

Once I copied and pasted it into BigQuery, it gave me results with "average engagement time per session" being all nulls. So, apparently, ChatGPT needs more context and guidance on how GA4 works.

I helped by providing additional knowledge as a follow-up question, taken from GA4's official documentation on how it calculates engagement_time_msec. I copied and pasted the document into the follow-up prompt and asked it to refer to that knowledge when composing the query, which helped. (If you get any syntax errors, just copy/paste them as a follow-up question and ask ChatGPT to fix them.)

As a result of 15 minutes of effort, I got the correct SQL:

WITH
-- Step 1: Filter GA4 data for the last 7 days and unnest event_params
ga4_data AS (
  SELECT
    event_date,
    event_timestamp,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS ga_session_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') AS engagement_time_msec,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page_location
  FROM `your_bq_project_name.your_bq_ga4_dataset.events_*`
  WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 8 DAY))
      AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
    AND (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') IS NOT NULL
),
-- Step 2: Aggregate GA4 data by session ID and page location
aggregated_ga4_data AS (
  SELECT
    page_location,
    SUM(engagement_time_msec / 1000) / NULLIF(COUNT(DISTINCT ga_session_id), 0) AS avg_engagement_time_msec
  FROM ga4_data
  GROUP BY page_location
),
-- Step 3: Filter GSC data for the last 7 days and select URLs which had clicks
gsc_data AS (
  SELECT
    url,
    SUM(clicks) AS clicks,
    SUM(impressions) AS impressions,
    SUM(sum_position) / SUM(impressions) AS avg_position
  FROM `your_bq_project_name.searchconsole.searchdata_url_impression`
  WHERE data_date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 8 DAY)
      AND DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
    AND clicks > 0
  GROUP BY url
)
-- Join Google Search Console data with GA4 data by page_location and url
SELECT
  gsc.url,
  gsc.clicks,
  gsc.impressions,
  gsc.avg_position,
  ga4.avg_engagement_time_msec
FROM gsc_data AS gsc
LEFT JOIN aggregated_ga4_data AS ga4
  ON gsc.url = ga4.page_location
ORDER BY gsc.clicks DESC;

This pulls GSC data with engagement metrics from GA4.

Search Console data combined with GA4.

Please note that you might notice discrepancies between the numbers in the GA4 UI and the data queried from BigQuery tables.

This happens because GA4 focuses on "Active Users" and groups rare data points into an "(other)" category, while BigQuery shows all raw data. GA4 also uses modeled data for gaps when consent isn't given, which BigQuery doesn't include.

Additionally, GA4 may sample data for quicker reports, whereas BigQuery includes all data. These variations mean GA4 offers a quick overview, while BigQuery provides detailed analysis. You can find a more detailed explanation of why this happens in this article.

You may try modifying the queries to include only active users to bring the results one step closer to the GA4 UI.

Alternatively, you can use Looker Studio to blend data, but it has limitations with very large datasets. BigQuery offers scalability by processing terabytes of data efficiently, making it ideal for large-scale SEO reports and detailed analyses.

Its advanced SQL capabilities allow complex queries for deeper insights that Looker Studio or other dashboarding tools cannot match.

Conclusion

Using ChatGPT's coding abilities to compose BigQuery queries for your reporting needs elevates your skills and opens new horizons where you can combine multiple sources of data.

This demonstrates how ChatGPT can streamline complex data analysis tasks, enabling you to focus on strategic decision-making.

At the same time, these examples taught us that humans absolutely need to operate AI chatbots, because they may hallucinate or produce wrong answers.

More resources: 


Featured Image: NicoElNino/Shutterstock