Chat with Google Analytics

Note: This tutorial shows how to connect Claude to your Google Analytics data using BigQuery, so you can analyze Senza app usage with natural language queries instead of writing SQL. See the Analytics tutorials for more information on collecting analytics.


The Conversation

User: I've been using the Senza analytics module to track video playback and user engagement in my apps. The data goes to Google Analytics, which is great, but I'd love to be able to query it more easily. Can you help?

Claude: Absolutely! Google Analytics has a powerful feature where it can automatically export all your raw event data to BigQuery. Once it's there, I can query it for you using natural language - no SQL required on your end.

Here's the basic flow:

  1. Configure Google Analytics to send data to BigQuery (one-time setup)
  2. Set up a BigQuery MCP server so I can access your data
  3. Ask me questions in plain English, and I'll query BigQuery for you

The cool part? Once configured, you can ask things like:

  • "What were the top 5 most-watched videos last week?"
  • "How many users watched content in background mode vs foreground?"
  • "Show me the average session duration by device"

And I'll write and run the SQL queries automatically, then explain the results to you.

Want to set this up?


User: Yes! Let's start with getting Google Analytics to send data to BigQuery. How do I do that?

Claude: Great! Here's the step-by-step process:

Step 1: Enable BigQuery Export in Google Analytics

  1. Open Google Analytics and go to your property (the one receiving Senza events)
  2. Navigate to Admin (bottom left gear icon)
  3. Under Property column, click BigQuery Links
  4. Click Link button
  5. Select your Google Cloud project (or create a new one)
  6. Choose export settings:
    • Daily export - Fresh data every day
    • Streaming export (optional) - Real-time data, costs more
    • Include advertising identifiers (if you're tracking ads)
  7. Click Submit

Step 2: Verify the Export

After 24 hours, check that data is flowing:

  1. Go to BigQuery Console
  2. Look for a new dataset named analytics_XXXXXXXXX (where X's are your GA4 property ID)
  3. Inside, you'll see tables like:
    • events_20250216 - Daily exports (one table per day)
    • events_intraday_20250216 - Streaming data (updates throughout the day)
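
Once the dataset appears, a quick sanity check is to count yesterday's rows by event name. A minimal sketch; the project and dataset IDs are placeholders for your own:

SELECT event_name, COUNT(*) AS events
FROM `your-project-id.analytics_XXXXXXXXX.events_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY event_name
ORDER BY events DESC

If the Senza event names (player_play, lifecycle_foreground, and so on) show up with nonzero counts, the export is working.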

What Data Gets Exported?

Every event you send from the Senza analytics module ends up in BigQuery:

Player Events:

  • player_load - When a video starts loading
  • player_play - Playback starts
  • player_pause - User pauses
  • player_session_end - Complete viewing session summary

Lifecycle Events:

  • lifecycle_foreground - App enters foreground
  • lifecycle_background - App enters background
  • lifecycle_session_end - Complete session summary

Event Parameters:

  • content_id - Video identifier
  • title - Content title
  • duration - How long they watched
  • foreground_time - Time in foreground
  • background_time - Time in background
  • device_id - Senza device identifier
  • Geographic data (city, country, etc.)

The schema is quite rich - each event has ~30-40 fields with user properties, device info, geographic data, and your custom parameters.
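Because the custom parameters live in a nested, repeated field, pulling one out takes a small correlated subquery per parameter. A sketch, assuming the standard GA4 export schema (project and dataset IDs are placeholders):

SELECT
  event_name,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'content_id') AS content_id,
  (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'duration') AS duration_seconds
FROM `your-project-id.analytics_XXXXXXXXX.events_20250216`
WHERE event_name = 'player_session_end'
LIMIT 10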


User: Perfect! My data is flowing into BigQuery now. How do I set up the MCP server so you can query it?

Claude: Great! The BigQuery MCP server is already built and ready to use. Let me walk you through the setup.

Step 1: Create a Service Account

You need a service account (think of it as a robot user) that I can use to access BigQuery:

  1. Go to Google Cloud Console
  2. Navigate to IAM & Admin → Service Accounts
  3. Click Create Service Account
  4. Fill in the details:
    • Name: claude-bigquery-reader
    • Description: "Service account for Claude to query Senza analytics"
  5. Click Create and Continue

Step 2: Grant Permissions

On the permissions screen, add these roles:

  • BigQuery Data Viewer - Read data from tables
  • BigQuery Job User - Run queries

Click Continue, then Done.
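
If you prefer the command line, the same grants can be made with gcloud. A sketch, assuming the service account name above and a placeholder project ID:

gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:claude-bigquery-reader@your-project-id.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"
gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:claude-bigquery-reader@your-project-id.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"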

Step 3: Create and Download Key

  1. Find your new service account in the list
  2. Click the three dots (⋮) on the right
  3. Select Manage keys
  4. Click Add Key → Create new key
  5. Choose JSON format
  6. Click Create

A JSON file will download - this is your authentication key. Store it securely!

Security tip: Never commit this file to Git or share it publicly. It grants access to your BigQuery data.

Step 4: Install the MCP Server

git clone https://github.com/synamedia-senza/bigquery-mcp-server
cd bigquery-mcp-server
npm install
npm run build

Step 5: Configure Claude Desktop

Edit your Claude Desktop config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Add this configuration:

{
  "mcpServers": {
    "bigquery": {
      "command": "node",
      "args": [
        "/absolute/path/to/bigquery-mcp-server/dist/index.js"
      ],
      "env": {
        "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/your/service-account-key.json",
        "BIGQUERY_PROJECT_ID": "your-project-id"
      }
    }
  }
}

Replace:

  • /absolute/path/to/ with the actual path to the MCP server
  • /path/to/your/service-account-key.json with where you saved the key
  • your-project-id with your Google Cloud project ID (e.g., senza-app-bigquery)

Step 6: Restart Claude Desktop

Quit Claude Desktop completely (Cmd+Q / Alt+F4) and restart it.

That's it! Now I can query your BigQuery data.


User: Awesome! I've restarted. Can you show me what datasets are available?

Claude: [Calls list_datasets tool]

Here are the BigQuery datasets in your project:

Datasets:

  • analytics_497335589 - Your Google Analytics 4 export
    • Contains daily event tables: events_YYYYMMDD
    • Contains intraday tables: events_intraday_YYYYMMDD
  • test_dataset - Looks like a test environment

The main one we care about is analytics_497335589. This is where all your Senza app events are flowing.


User: Perfect! Can you show me what tables are in the analytics dataset?

Claude: [Calls list_tables with datasetId "analytics_497335589"]

Here are the tables in your analytics dataset:

Daily Tables (complete day's data):

  • events_20250209 - February 9, 2025
  • events_20250210 - February 10, 2025
  • events_20250211 - February 11, 2025
  • events_20250212 - February 12, 2025
  • events_20250213 - February 13, 2025
  • events_20250214 - February 14, 2025
  • events_20250215 - February 15, 2025
  • events_20250216 - February 16, 2025 (today)

Intraday Tables (streaming, updates throughout the day):

  • events_intraday_20250216 - Today's streaming data

You have about a week of data! Each table contains all the events from that day - player events, lifecycle events, and everything else your apps are sending.


User: Great! Can you show me the schema for one of those event tables?

Claude: [Calls get_table_schema with datasetId "analytics_497335589" and tableId "events_20250216"]

Here's the schema for today's events table. It's quite comprehensive:

Key Fields:

Event Basics:

  • event_name (STRING) - Type of event (player_session_end, lifecycle_session_end, etc.)
  • event_timestamp (INTEGER) - Unix timestamp in microseconds
  • event_date (STRING) - Date in YYYYMMDD format

User & Device:

  • user_pseudo_id (STRING) - Anonymous user identifier
  • device.category (STRING) - Device type (desktop, mobile, tv)
  • device.operating_system (STRING) - OS name
  • device.web_info.browser (STRING) - Browser

Location:

  • geo.city (STRING) - User's city
  • geo.country (STRING) - Country code
  • geo.region (STRING) - State/region

Event Parameters (RECORD, REPEATED):
Each event has custom parameters in an array:

  • event_params.key (STRING) - Parameter name
  • event_params.value.string_value (STRING) - Text values
  • event_params.value.int_value (INTEGER) - Number values
  • event_params.value.double_value (FLOAT) - Decimal values

Custom Parameters You're Sending:

  • content_id - Video identifier
  • title - Content title
  • description - Content description
  • duration - Watch duration in seconds
  • foreground_time - Time in foreground
  • background_time - Time in background
  • device_id - Senza device ID

This nested structure is standard for GA4 BigQuery exports. The event_params array is where all your custom data lives.


User: This is great! Now for the fun part - can you show me the top 5 most-watched videos from last week?

Claude: [Calls run_query with a SQL query]

Let me query that for you! I'll look at player_session_end events and sum up the watch duration by content.

SELECT 
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'content_id') as content_id,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'title') as title,
  COUNT(*) as sessions,
  ROUND(SUM((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'duration')) / 60, 1) as total_minutes
FROM `senza-app-bigquery.analytics_497335589.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20250209' AND '20250215'
  AND event_name = 'player_session_end'
GROUP BY content_id, title
ORDER BY total_minutes DESC
LIMIT 5

Top 5 Most-Watched Videos (Feb 9-15):

Content ID          Title                  Sessions   Total Minutes
bbb_30fps           Big Buck Bunny         247        3,847.2
sintel_4k           Sintel                 189        2,156.8
tears_of_steel      Tears of Steel         156        1,892.5
elephant_dream      Elephant's Dream       98         1,234.6
flight_departures   Airport Flight Board   87         892.4

Big Buck Bunny is the clear winner! Users watched it 247 times for a total of over 64 hours. The flight board content is interesting - that's from your digital signage app, showing people use it for real-world applications.


User: Wow, that's awesome! Can you show me how foreground vs background usage breaks down?

Claude: [Calls run_query]

Let me analyze the lifecycle session data to see how users are watching:

WITH sessions AS (
  SELECT 
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'foreground_time') as fg_time,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'background_time') as bg_time,
    (SELECT value.double_value FROM UNNEST(event_params) WHERE key = 'foreground_ratio') as fg_ratio
  FROM `senza-app-bigquery.analytics_497335589.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20250209' AND '20250215'
    AND event_name = 'lifecycle_session_end'
)
SELECT
  ROUND(AVG(fg_time), 1) as avg_foreground_sec,
  ROUND(AVG(bg_time), 1) as avg_background_sec,
  ROUND(AVG(fg_ratio) * 100, 1) as avg_foreground_pct,
  COUNT(*) as total_sessions,
  COUNT(CASE WHEN fg_ratio > 0.8 THEN 1 END) as mostly_foreground,
  COUNT(CASE WHEN fg_ratio < 0.2 THEN 1 END) as mostly_background
FROM sessions

Foreground vs Background Usage:

  • Average Foreground Time: 186.4 seconds (~3 minutes)
  • Average Background Time: 234.7 seconds (~4 minutes)
  • Average Foreground Ratio: 44.3%
  • Total Sessions: 1,847

Usage Patterns:

  • Mostly Foreground (foreground ratio > 80%): 423 sessions (23%)
  • Mostly Background (foreground ratio < 20%): 689 sessions (37%)
  • Mixed Usage: 735 sessions (40%)

Interesting! Most users spend more time in background mode than foreground. This makes sense for Senza - people are often streaming content to their TV while the device is in the background. The mixed usage pattern shows that users frequently switch between foreground and background, which is exactly what Senza is designed for.


User: That's really insightful! Can you show me which devices are the most active?

Claude: [Calls run_query]

Let me find the most active devices by session count:

SELECT 
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'device_id') as device_id,
  COUNT(DISTINCT DATE(TIMESTAMP_MICROS(event_timestamp))) as active_days,
  COUNT(*) as total_sessions,
  ROUND(SUM((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'foreground_time') + 
            (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'background_time')) / 3600, 1) as total_hours,
  ROUND(AVG((SELECT value.double_value FROM UNNEST(event_params) WHERE key = 'foreground_ratio')) * 100, 1) as avg_fg_pct
FROM `senza-app-bigquery.analytics_497335589.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20250209' AND '20250215'
  AND event_name = 'lifecycle_session_end'
GROUP BY device_id
HAVING device_id IS NOT NULL
ORDER BY total_sessions DESC
LIMIT 10

Top 10 Most Active Devices (Feb 9-15):

Device ID           Active Days   Sessions   Total Hours   Avg FG %
7e6e8eb10d21af04    7             342        47.8          52.3
7e6e76780d21af04    6             289        38.4          28.7
7e6e787a0d21af04    7             256        41.2          61.8
7e6e07f20d21af04    5             198        32.6          45.1
7e6eba5a0d21af04    7             187        29.3          38.9
7e6eb88c0d21af04    4             156        24.7          71.2
7e6e697f0d21af04    6             143        22.1          33.4
7e6e729c0d21af04    3             134        18.9          44.6
7e6e76400d21af04    5             128        19.4          55.8
7e6ebde10d21af04    7             121        17.8          42.3

The top device (7e6e8eb10d21af04) was active every day and had 342 sessions! That's serious usage - almost 50 hours of total watch time in a week.

Notice the variation in foreground percentage:

  • Device 7e6eb88c0d21af04: 71% foreground - probably interactive testing
  • Device 7e6e76780d21af04: 29% foreground - likely background streaming

Want me to cross-reference these device IDs with your device descriptions to see who these users are?


User: Yes! Can you use the Senza devices tool to identify these users?

Claude: [Calls senza_get_device for the top device]

Let me look up who owns these devices:

Top Device (7e6e8eb10d21af04):

  • Owner: Elias
  • Status: Activated
  • Community: Beta
  • Last Connection: July 22, 2025

Ah! This is Elias's device. Remember we moved him to Beta earlier? He's clearly a power user - 342 sessions in a week!

Let me check a few more:

[Looks up more devices]

Device Map:

  • 7e6e8eb10d21af04 → Elias (Beta, 342 sessions)
  • 7e6e76780d21af04 → Hacker Dojo Stage (Stable_special_ui, 289 sessions)
  • 7e6e787a0d21af04 → James Field (Stable_special_ui, 256 sessions)
  • 7e6e07f20d21af04 → Sam (Stable, 198 sessions)
  • 7e6eba5a0d21af04 → Andrew Montanez (Stable, 187 sessions)

Interesting patterns:

  • Elias is your most active Beta tester
  • Hacker Dojo Stage is probably a demo unit - high usage makes sense
  • James Field has high foreground usage (61.8%) - interactive testing?
  • Sam's device (the one you just assigned) already has 198 sessions!

This is the power of combining BigQuery analytics with device management - you can see not just what is happening, but who is doing it.


User: This is amazing! One more thing - can you show me watch time by geographic location?

Claude: [Calls run_query]

Let me break down viewership by city:

SELECT 
  geo.city,
  geo.region,
  geo.country,
  COUNT(DISTINCT user_pseudo_id) as unique_users,
  COUNT(*) as total_sessions,
  ROUND(SUM((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'duration')) / 3600, 1) as total_watch_hours
FROM `senza-app-bigquery.analytics_497335589.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20250209' AND '20250215'
  AND event_name = 'player_session_end'
GROUP BY geo.city, geo.region, geo.country
HAVING geo.city IS NOT NULL
ORDER BY total_watch_hours DESC
LIMIT 10

Top 10 Cities by Watch Time (Feb 9-15):

City            Region            Country          Users   Sessions   Watch Hours
San Francisco   California        United States    47      524        892.4
Los Angeles     California        United States    28      312        478.6
London          England           United Kingdom   19      287        423.8
New York        New York          United States    23      243        367.2
Berlin          Berlin            Germany          15      198        298.5
Toronto         Ontario           Canada           12      176        256.7
Paris           Île-de-France     France           11      145        234.9
Sydney          New South Wales   Australia        9       134        198.3
Tokyo           Tokyo             Japan            8       128        187.6
Amsterdam       North Holland     Netherlands      7       98         143.2

San Francisco dominates with almost 900 hours of watch time! Makes sense if that's where your team and beta testers are located.

The geographic diversity is impressive - you have users across four continents, with strong adoption in major tech hubs (SF, London, Berlin, Tokyo). The Senza analytics module's geolocation feature is working perfectly!


User: This is phenomenal! Can you save a workflow document that explains how to use these BigQuery tools effectively?

Claude: Absolutely! Let me create a comprehensive workflow guide.

[Creates WORKFLOW.md file]

I've created a complete workflow document at /Users/yourname/Sites/bigquery-mcp-server/WORKFLOW.md.

Here's what's covered:

BigQuery MCP Workflow

Architecture

Shows how data flows from Senza apps → Google Analytics → BigQuery → Claude

Common Query Patterns

Content Performance:

  • Top videos by watch time
  • Engagement metrics per video
  • Content discovery (which videos lead to others)

User Behavior:

  • Session duration analysis
  • Foreground/background usage patterns
  • Device activity trends

Geographic Analysis:

  • Viewership by location
  • Regional content preferences
  • Timezone-based usage patterns

Device Analytics:

  • Most active devices
  • Device performance metrics
  • Cross-referencing with device metadata

Query Examples

(These sketches use events as shorthand for a fully qualified table and assume nested event_params values have already been extracted into plain columns.)

Daily Active Users:

SELECT 
  COUNT(DISTINCT user_pseudo_id) as daily_active_users
FROM `project.analytics.events_YYYYMMDD`
WHERE event_name IN ('player_session_end', 'lifecycle_session_end')

Average Session Duration by Content:

SELECT 
  content_id,
  AVG(duration) as avg_watch_seconds
FROM events
WHERE event_name = 'player_session_end'
GROUP BY content_id

Peak Usage Hours:

SELECT 
  EXTRACT(HOUR FROM TIMESTAMP_MICROS(event_timestamp)) as hour,
  COUNT(*) as events
FROM events
GROUP BY hour
ORDER BY hour

Integration Patterns

Combined Analytics:
When you have multiple MCP servers configured:

  1. BigQuery - Session and event data
  2. Senza Devices - Device ownership and metadata
  3. Combined - Complete user activity picture

Example Workflow:

User: "Show me Elias's watch history this week"
Claude: 
  1. [Uses Devices API to find Elias's device ID]
  2. [Queries BigQuery for that device's sessions]
  3. [Combines both for complete picture]
Result: Full viewing history with content details

Best Practices

Query Optimization:

  • Use table wildcards for date ranges: events_* with _TABLE_SUFFIX
  • Filter early with WHERE clauses
  • Extract nested fields once, use CTEs
  • Limit results for exploration queries
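
Putting those tips together (wildcard tables for the date range, an early event_name filter, and a CTE that extracts nested fields once), a representative query looks like this; the IDs are placeholders:

WITH extracted AS (
  SELECT
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'content_id') AS content_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'duration') AS duration
  FROM `your-project-id.analytics_XXXXXXXXX.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20250209' AND '20250215'
    AND event_name = 'player_session_end'
)
SELECT content_id, COUNT(*) AS sessions, SUM(duration) AS total_seconds
FROM extracted
GROUP BY content_id
ORDER BY total_seconds DESC
LIMIT 20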

Cost Management:

  • BigQuery charges for data scanned
  • Partition by date to reduce scan size
  • Use table previews for schema inspection
  • Cache results for repetitive queries

Data Freshness:

  • Daily tables: Complete previous day (updated ~9 AM PST)
  • Intraday tables: Updates every ~15 minutes
  • For "today's" data, query both daily and intraday tables
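
The events_* wildcard technically matches the intraday tables too, but their _TABLE_SUFFIX values look like intraday_20250216, so a plain date filter skips them. One way to cover both daily and streaming data for today (IDs are placeholders):

SELECT event_name, COUNT(*) AS events
FROM `your-project-id.analytics_XXXXXXXXX.events_*`
WHERE _TABLE_SUFFIX IN (
  FORMAT_DATE('%Y%m%d', CURRENT_DATE()),
  CONCAT('intraday_', FORMAT_DATE('%Y%m%d', CURRENT_DATE()))
)
GROUP BY event_name

GA4 normally deletes an intraday table once the matching daily table lands, so double counting between the two is rarely a concern.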

Troubleshooting

No Data Appearing:

  • Check GA4 BigQuery link is active
  • Wait 24 hours for first export
  • Verify service account has access to dataset

Query Errors:

  • Fully qualify table names: project.dataset.table
  • Use backticks for special characters
  • Check date format in _TABLE_SUFFIX: 'YYYYMMDD'

Permission Issues:

  • Ensure service account has BigQuery Data Viewer
  • Check BigQuery Job User role is assigned
  • Verify GOOGLE_APPLICATION_CREDENTIALS path

Advanced Techniques

Cohort Analysis:
Track user behavior over time:

WITH first_session AS (
  SELECT user_pseudo_id, MIN(event_date) as cohort_date
  FROM events
  GROUP BY user_pseudo_id
)
-- Compare retention by cohort

Funnel Analysis:
Track user journey through content:

SELECT 
  session_id,
  STRING_AGG(content_id, ' → ' ORDER BY event_timestamp) as journey
FROM events
WHERE event_name = 'player_session_end'
GROUP BY session_id

Anomaly Detection:
Find unusual patterns:

WITH daily_stats AS (
  SELECT 
    event_date,
    COUNT(*) as sessions,
    AVG(duration) as avg_duration
  FROM events
  GROUP BY event_date
)
SELECT * FROM daily_stats
WHERE sessions > (SELECT AVG(sessions) * 2 FROM daily_stats)

Saved! This document covers everything from basic queries to advanced analytics patterns. Keep it handy as your reference guide.


User: Perfect! This is exactly what I needed. Thank you!

Claude: You're welcome! You now have a complete analytics pipeline:

What You Built:

  1. Senza Apps sending events via analytics module
  2. Google Analytics collecting and processing data
  3. BigQuery storing raw event data
  4. MCP Server connecting Claude to BigQuery
  5. Natural Language Queries replacing SQL

What You Can Do Now:

  • Ask questions in plain English, get SQL queries automatically
  • Combine BigQuery analytics with device management
  • Track content performance, user behavior, and geographic trends
  • Build custom dashboards by iterating with me
  • Export data for reports with a simple conversation

The Power of MCP:
Instead of:

# Write query
vim query.sql
# Run query  
bq query < query.sql > results.json
# Parse results
cat results.json | jq '.[] | select(.watch_time > 100)'

You just:

User: "Show me videos with over 100 minutes watch time"
Claude: [Done! Here's the list with analysis]

Next Steps:

  • Set up scheduled reports (ask me to run queries weekly)
  • Create custom dashboards for stakeholders
  • Build alerting (e.g., "notify me if sessions drop 50%")
  • Combine with other data sources (CRM, CDN logs, etc.)

Your Senza analytics are now supercharged! 🚀


Summary

In this tutorial, we connected Claude to your Senza analytics data in BigQuery, enabling natural language queries instead of SQL. We covered:

Setup:

  • Configuring Google Analytics → BigQuery export
  • Creating a service account with proper permissions
  • Installing and configuring the BigQuery MCP server

Querying:

  • Content performance analysis (top videos, watch time)
  • User behavior patterns (foreground/background usage)
  • Device activity tracking
  • Geographic distribution

Integration:

  • Combining BigQuery data with Senza device management
  • Cross-referencing analytics with user information
  • Building complete user activity profiles

Key Concepts:

  • GA4 BigQuery Export: Automatic pipeline from analytics to data warehouse
  • Service Account: Robot user for secure, automated access
  • MCP Tools: Natural language interface to SQL queries
  • Event Schema: Understanding GA4's nested structure

Next Steps:

  • Explore advanced queries (cohorts, funnels, anomaly detection)
  • Build custom reporting workflows
  • Integrate with other MCP servers for richer insights

Happy analyzing! 📊