Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.conversion.ai/llms.txt

Use this file to discover all available pages before exploring further.

This guide walks you through connecting Google BigQuery to Conversion and setting up your first sync.

Before You Begin

Make sure you have the following before starting:
  • Owner or IAM Admin access in your GCP project: You’ll need to create a service account, assign roles, and generate a key. The Owner, IAM Admin, or an equivalent role is required.
  • The dataset(s) you want to sync identified: Know which project, and optionally which datasets and tables, Conversion should be able to read from.
  • Your GCP project ID: Found in the Google Cloud Console next to the project name (e.g. my-company-prod). You’ll need it when connecting in Step 2.

Step 1: Set Up BigQuery Access

Conversion connects to BigQuery using a service account with a JSON key. In this step you’ll create a dedicated service account, grant it read access to your data, and (if needed) configure network access.

Create a Service Account

We recommend creating a dedicated service account for Conversion so its activity is easy to monitor and audit in isolation.
  1. In the Google Cloud Console, go to IAM & Admin → Service Accounts and create a new service account (e.g. conversion-sync).
  2. Grant it the BigQuery Job User and BigQuery User roles.
  3. Open the service account, go to the Keys tab, and create a new JSON key. Save the downloaded file; you’ll paste its contents into Conversion in Step 2.

Grant Read Access to Your Data

The service account also needs BigQuery Data Viewer to read your tables. We recommend granting it at the dataset level so Conversion’s access is scoped to exactly the data you want it to see — and so new tables added to those datasets are picked up automatically. For each dataset you want Conversion to read, open BigQuery, find the dataset in the Explorer panel, click the three dots → Share, and add your service account email (e.g. conversion-sync@my-project.iam.gserviceaccount.com) with the BigQuery Data Viewer role.
Conversion only sees datasets you explicitly share. New datasets need to be shared with the service account before Conversion can read from them.

Granting Data Viewer at the project level

If your project doesn’t contain data Conversion shouldn’t see, you can grant BigQuery Data Viewer at the project level instead, either during service account creation or later in IAM & Admin → IAM. New datasets are then picked up automatically as you add them.

Allow Conversion’s IP Addresses

If your BigQuery project uses VPC Service Controls or restricts access by IP, allow connections from Conversion’s IP addresses:
RegionIP Addresses
US35.239.90.161, 35.188.167.166, 34.56.101.43, 34.122.97.230, 34.29.176.66, 35.226.154.44
Most BigQuery projects don’t restrict by IP. If you don’t use VPC Service Controls or perimeter rules, you can skip this step.

Step 2: Connect BigQuery to Conversion

  1. In Conversion, go to Settings → CRM & Syncing → Connections.
  2. Click Add BigQuery connection.
  3. Fill in the connection details using the values from the earlier steps:
    FieldWhat to enterWhere it came from
    NameA friendly label, e.g. Production BigQueryFree-text, just for your reference in Conversion
    Project IDYour GCP project ID, e.g. my-company-prodFound in the Google Cloud Console next to the project name
    Service Account CredentialsThe full contents of your JSON key fileGenerated in Step 1
  4. Click Connect
Conversion will verify the connection. If successful, you’re ready to create syncs.

Step 3: Create a Sync

After connecting, open your new BigQuery connection and go to the Syncs tab to create a sync. Read more about Setting Up a Sync.

Writing Queries for BigQuery

A few things to keep in mind before writing your first query:
  • Name your columns to match Conversion fields. Conversion maps results to fields by column name, so each column in your SELECT clause must match the field key in Conversion exactly (e.g. email, first_name, userId, eventId). Use AS to alias when the source name doesn’t match.
  • Always filter by last_sync_time so each run only processes changed rows. In BigQuery, wrap it with TIMESTAMP_SECONDS(); see Using last_sync_time for details.
  • Select only the columns you’ll actually use, as every column scanned costs money in BigQuery.
Queries that return millions of rows should not run more frequently than once per day. Large, frequent syncs can impact both your BigQuery costs and Conversion performance.

Contact Sync

Provide userId (preferred), email, or both; at least one is required so Conversion can match the contact.
SELECT
  id AS userId,
  email,
  first_name,
  last_name,
  title AS job_title,
  phone,
  city
FROM `project.dataset.users`
WHERE updated_at >= TIMESTAMP_SECONDS({{last_sync_time}})

Custom Objects (Without Relationships)

To sync objects without relating them to contacts, omit the email column:
SELECT 
  id,
  name,
  status,
  plan_tier,
  monthly_price
FROM `project.dataset.subscriptions`
WHERE updated_at >= TIMESTAMP_SECONDS({{last_sync_time}})

Custom Objects with Relationships

To sync custom objects and their relationships to contacts in a single query, include the contact’s email and use JSON_OBJECT for relationship-specific fields:
SELECT 
  -- Object fields
  p.id,
  p.name,
  p.sku,
  p.category,
  p.price,
  
  -- Contact to relate this object to
  u.email,
  
  -- Relationship-specific fields
  JSON_OBJECT(
    'role', up.role,
    'quantity', up.quantity,
    'purchased_at', UNIX_SECONDS(up.purchased_at)
  ) AS relationshipFields
  
FROM `project.dataset.user_products` up
JOIN `project.dataset.products` p ON up.product_id = p.id
JOIN `project.dataset.users` u ON up.user_id = u.id
WHERE up.updated_at >= TIMESTAMP_SECONDS({{last_sync_time}})
In this example:
  • Object fields (id, name, sku, category, price) are stored on the Product object
  • email identifies which contact to create a relationship with
  • relationshipFields contains fields stored on the relationship itself (this contact’s role, quantity, etc.)
If multiple rows share the same object id but have different email values, Conversion creates one object with multiple relationships, one per contact.

Events

Events require a contact identifier (userId or email), event_name, and timestamp. eventId is optional but recommended. Conversion uses it to deduplicate events across runs.
SELECT
  CAST(u.id AS STRING) AS userId,
  u.email,
  'purchase_completed' AS event_name,
  UNIX_SECONDS(o.created_at) AS timestamp,
  CAST(o.id AS STRING) AS eventId,
  o.total_amount,
  o.product_name,
  o.currency
FROM `project.dataset.orders` o
JOIN `project.dataset.users` u ON o.user_id = u.id
WHERE o.created_at >= TIMESTAMP_SECONDS({{last_sync_time}})

BigQuery SQL Reference

Converting Timestamps

Conversion expects Unix timestamps for date/time fields. Use UNIX_SECONDS() to convert BigQuery TIMESTAMP or DATETIME values:
SELECT 
  email,
  UNIX_SECONDS(created_at) AS created_at,
  UNIX_SECONDS(last_login) AS last_login
FROM `project.dataset.users`
WHERE updated_at >= TIMESTAMP_SECONDS({{last_sync_time}})

Using last_sync_time

{{last_sync_time}} is a Unix timestamp in seconds (an integer). Conversion substitutes the integer value directly into your query at run time; do not wrap it in quotes. Use it in a WHERE clause to limit each run to rows that have changed since the last successful sync. On the first sync the value is 0, so every row matches. To compare it against a BigQuery TIMESTAMP column, convert the integer to a timestamp with TIMESTAMP_SECONDS():
WHERE updated_at >= TIMESTAMP_SECONDS({{last_sync_time}})
Or convert your column to Unix seconds and compare integers directly:
WHERE UNIX_SECONDS(updated_at) >= {{last_sync_time}}
To add a buffer that catches rows whose updated_at might be slightly stale, subtract seconds from the integer before converting:
WHERE updated_at >= TIMESTAMP_SECONDS({{last_sync_time}} - 300)  -- 5 minute buffer

Building Nested Objects with JSON_OBJECT

BigQuery doesn’t support dot notation for creating nested objects. Use JSON_OBJECT to build nested structures like relationshipFields:
JSON_OBJECT(
  'role', role_column,
  'quantity', quantity_column,
  'started_at', UNIX_SECONDS(started_at_column)
) AS relationshipFields

Handling NULLs

Use COALESCE or IFNULL to provide default values:
SELECT 
  email,
  COALESCE(first_name, '') AS first_name,
  IFNULL(phone, '') AS phone
FROM `project.dataset.users`

Casting Types

Use CAST to convert between types:
SELECT 
  CAST(user_id AS STRING) AS id,
  CAST(score AS INT64) AS lead_score
FROM `project.dataset.users`

Querying Across Datasets

Use fully-qualified table names to join across datasets:
SELECT 
  u.email,
  o.order_id
FROM `project.users_dataset.users` u
JOIN `project.orders_dataset.orders` o ON u.id = o.user_id

Troubleshooting

”Permission denied” errors

Ensure your service account has the required roles:
  • BigQuery Job User (project level)
  • BigQuery User (project level)
  • BigQuery Data Viewer (on each dataset you want to query)

“Could not connect” errors

Verify that:
  • The service account JSON key was pasted in full, including the surrounding { and }
  • Your project ID matches the project the service account belongs to
  • Conversion’s IP addresses are allowed if you use VPC Service Controls

Sync taking too long

  • Ensure you’re filtering by last_sync_time to reduce rows
  • Select only the columns you need
  • Use partitioned tables and filter on the partition column for time-based data
  • Reduce sync frequency for large datasets

Duplicate data appearing

If you’re using time-buffered queries (subtracting from last_sync_time to catch edge cases), Conversion handles duplicates automatically based on the unique key (userId or email for contacts, id for objects, eventId for events).

Frequently Asked Questions

The service account needs BigQuery Job User and BigQuery User at the project level, plus BigQuery Data Viewer on the datasets you want to sync. Conversion only reads data; it never writes to your BigQuery project.
BigQuery charges for the amount of data scanned by your queries. To minimize costs: select only the columns you need, filter with last_sync_time, and use partitioned tables where possible.
Use fully-qualified table names in your query:
SELECT u.email, o.order_id
FROM `my-project.users_dataset.users` u
JOIN `my-project.orders_dataset.orders` o ON u.id = o.user_id
Make sure BigQuery Data Viewer is granted on each dataset you reference.
If you sync a custom object with a userId or email that doesn’t exist in Conversion, we create the contact automatically with just that identifier. You can enrich that contact’s profile through a separate contacts sync.