Data Warehouse Sync - Conversion Docs

Data warehouse sync lets you bring data from your data warehouse into Conversion. Sync contacts, custom objects, relationships, and events directly from tables and views in your warehouse, keeping Conversion in sync with your source of truth.

Why Sync from Your Data Warehouse?

Your data warehouse is often the most complete and accurate source of customer data:

Use your existing data models: Query the tables you already have instead of restructuring data for Conversion
Keep data fresh automatically: Schedule syncs to run on an interval, ensuring Conversion always has current data
Sync any data type: Bring in contacts, companies, custom objects, relationships, and events from a single connection

Supported Warehouses

Conversion supports the following data warehouses:

BigQuery

Google BigQuery

Snowflake

Snowflake Data Cloud

Databricks

Databricks Storage

Redshift

Amazon Redshift

Key Concepts

Before setting up your connection, it helps to understand a few key concepts.

Syncs

A sync defines what data to pull from your warehouse and how to map it to Conversion. Each sync includes:

A destination type: What kind of data you’re syncing (contacts, custom objects, or events)
A SQL query: The query that returns the data you want to sync
A schedule: How often the sync runs

You’ll typically create multiple syncs for a single warehouse connection: one for contacts, one for each custom object type, etc.

Field Mapping

Conversion maps the columns returned by your query to fields in Conversion by column name. The name (or AS alias) of each column in your SELECT clause must match the field key in Conversion exactly, including case.

SELECT
  email,                       -- maps to the `email` field
  first_name,                  -- maps to the `first_name` field
  title AS job_title,          -- aliased to map to `job_title`
  CAST(score AS INT) AS lead_score
FROM users

If a column doesn’t match a known field on the destination, it’s ignored. Use AS to rename columns whenever your source name doesn’t match the Conversion field key.

Incremental Syncing

To avoid re-syncing your entire dataset on every run, syncs support incremental syncing using a timestamp column. Conversion exposes a last_sync_time variable you reference in your query to only select rows that have changed since the last sync. last_sync_time is a Unix timestamp in seconds (an integer) that you compare against a timestamp column in your source. On the first run it’s 0 (Unix epoch), so every row matches. On subsequent runs it’s the start time of the previous successful sync.

SELECT id, email, first_name, last_name, updated_at
FROM users
WHERE updated_at > TO_TIMESTAMP({{last_sync_time}})

Because it’s an integer, don’t wrap {{last_sync_time}} in quotes. Most warehouses need a small conversion to compare it against a TIMESTAMP column; see the warehouse-specific guides for exact syntax (TIMESTAMP_SECONDS() in BigQuery, TO_TIMESTAMP() in Snowflake, etc.).

We strongly recommend including a WHERE clause using last_sync_time in every sync. Without it, you’ll re-sync your entire dataset on every interval, which is slower and more expensive.

Unique Keys

Each sync requires a unique key that identifies records. This key determines how Conversion matches incoming rows to existing records:

Destination Type	Unique Key
Contacts	`userId` or `email`
Custom Objects	Object `id`
Events	`eventId` (optional, used for deduplication)

What You Can Sync

Contacts

Sync contacts from your warehouse to create or update contact records in Conversion. Required columns (at least one of):

email: The contact’s email address. To create new contacts in Conversion, email address must be provided.
userId: Your system’s unique identifier for the contact.

If both are provided, userId is used to match the contact and email is stored on the contact record. Optional columns:

Any contact field (e.g., first_name, last_name, job_title, phone)
Custom field keys you’ve defined in Conversion

SELECT
  id AS userId,
  email,
  first_name,
  last_name,
  title AS job_title,
  phone,
  city
FROM users
WHERE updated_at > TO_TIMESTAMP({{last_sync_time}})

Custom Objects

Sync custom objects to create or update records of a specific custom object type. Objects are matched by their ID. Required columns:

id: Your unique identifier for this object

Optional columns:

Any field defined on the custom object type

SELECT
  id,
  name,
  status,
  plan_tier,
  renewal_date
FROM subscriptions
WHERE updated_at > TO_TIMESTAMP({{last_sync_time}})

You must create the custom object type in Conversion before setting up a sync. The sync will create and update individual objects of that type.

Custom Objects with Relationships

To connect a custom object to a contact, include the contact’s identifier and any relationship-specific fields in your query. This creates both the object and its relationship to the contact in a single sync. Required columns:

id: Your unique identifier for this object
userId or email: The contact to relate this object to (userId is preferred when available)

Optional columns:

Any field defined on the custom object type
relationshipFields: A JSON object containing relationship-specific fields

SELECT
  p.id,
  p.name,
  p.sku,
  p.category,
  u.email,
  JSON_OBJECT(
    'role', up.role,
    'quantity', up.quantity,
    'purchased_at', up.purchased_at
  ) AS relationshipFields
FROM user_products up
JOIN products p ON up.product_id = p.id
JOIN users u ON up.user_id = u.id
WHERE up.updated_at > TO_TIMESTAMP({{last_sync_time}})

In this example:

Object fields (id, name, sku, category) create or update the Product object
userId or email identifies which contact to relate the object to
relationshipFields contains fields specific to this contact’s relationship with the object (their role, quantity purchased, etc.)

If multiple rows have the same object id but different contact identifiers, Conversion creates one object with multiple relationships, one for each contact.

Events

Sync events to record activity on contacts. Events appear on the contact’s timeline and can trigger workflows. Required columns:

userId or email: Identifies the contact the event belongs to (same rules as contacts: userId is preferred when available)
event_name: The name of the event
timestamp: When the event occurred

Optional columns:

Any event property
eventId: A unique identifier for the event, used for deduplication

SELECT
  u.id AS userId,
  u.email,
  'purchase_completed' AS event_name,
  o.created_at AS timestamp,
  o.id AS eventId,
  o.total_amount,
  o.product_name,
  o.currency
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.created_at > TO_TIMESTAMP({{last_sync_time}})

eventId is optional but strongly recommended. We use it to deduplicate events. If the same eventId is synced twice (for example, when using a time buffer), we’ll ignore the duplicate.

Setting Up a Sync

1. Connect Your Warehouse

In Conversion, go to Settings → CRM & Syncing → Connections and add a new connection for your warehouse type. You’ll need to provide connection credentials; the specific requirements vary by warehouse. See the warehouse-specific guides for the exact steps.

We recommend creating a dedicated read-only user for Conversion. This limits access to only the tables you need and makes it easy to revoke access if needed.

2. Create a Sync

After your warehouse connection is verified, open it and go to the Syncs tab to create a sync:

Click Add Sync
Select the destination type (Contacts, Custom Objects, or Events)
For Custom Objects, select the specific object type
Enter your SQL query
Set the sync schedule
Enable the sync

3. Set the Schedule

Choose how often your sync runs:

Frequency	Best For
Every 15 minutes	High-frequency data like events or rapidly changing attributes
Every hour	Most contact syncs
Every 6 hours	Large datasets or slowly changing data
Daily	Historical data or very large syncs

If a sync is still running when the next scheduled run begins, we’ll skip that run. Choose a frequency that gives your query enough time to complete.

Incremental Sync Strategies

Using last_sync_time

The simplest approach is to filter on an updated_at column. Convert {{last_sync_time}} (a Unix timestamp in seconds) to your warehouse’s timestamp type so the comparison works:

SELECT id, email, first_name, last_name
FROM contacts
WHERE updated_at > TO_TIMESTAMP({{last_sync_time}})

The exact conversion function depends on your warehouse; see each warehouse guide for the right syntax.

Time-Buffered Queries

If your updated_at timestamps aren’t perfectly reliable, you can add a buffer to catch rows that might have been missed:

SELECT id, email, first_name, last_name
FROM contacts
WHERE updated_at > TO_TIMESTAMP({{last_sync_time}} - 300)  -- 5 minute buffer

When using time-buffered queries, include an eventId column (for events) or rely on the unique key to handle duplicates. Conversion will update existing records rather than creating duplicates.

Full Syncs

For small tables or when you need to ensure complete accuracy, you can omit the WHERE clause to sync all rows on every run. This is less efficient but guarantees no rows are missed.

Monitoring and Troubleshooting

Sync History

View the history of each sync in the Syncs tab. Each run shows:

Start and end time
Number of rows processed
Number of successful and failed operations
Error details (if any)

Common Errors

Error	Cause	Solution
Missing required field	A row is missing a required column (e.g., neither `userId` nor `email` for contacts)	Update your query to ensure required columns are never null
Invalid field value	A value doesn’t match the expected type (e.g., text in a number field)	Cast values to the correct type in your query
Object type not found	Syncing to a custom object type that doesn’t exist	Create the custom object type in Conversion first
Connection failed	Conversion can’t reach your warehouse	Check your network settings and credentials

Frequently Asked Questions

What happens if a sync fails?

If a sync fails, we’ll retry on the next scheduled interval. Partial failures (where some rows succeed and others fail) are logged, and you can download error details from the sync history. Failed rows won’t affect successful rows.

Can I sync the same data to multiple destinations?

Yes. You can create multiple syncs from the same warehouse, each with its own query and destination. This is common when you need to sync contacts and their related custom objects from different tables.

Can I transform data in my query?

Yes. Use SQL functions to transform data before it reaches Conversion. For example, you can concatenate fields, convert types, or apply conditional logic.

What timezone are timestamps in?

last_sync_time is a Unix timestamp in seconds (UTC). Your query should compare against UTC timestamps. If your data uses a different timezone, convert it in your query.

Can I sync objects without relating them to contacts?

Yes. If your query doesn’t include a userId or email column, the objects are created without any relationships. You can add relationships later through a separate sync or via the API.

What if a contact doesn't exist yet?

If you sync a custom object with a userId or email that doesn’t exist in Conversion, we’ll create the contact automatically with just that identifier. You can enrich the contact’s profile through a separate contacts sync.

​Why Sync from Your Data Warehouse?

​Supported Warehouses

BigQuery

Snowflake

Databricks

Redshift

​Key Concepts

​Syncs

​Field Mapping

​Incremental Syncing

​Unique Keys

​What You Can Sync

​Contacts

​Custom Objects

​Custom Objects with Relationships

​Events

​Setting Up a Sync

​1. Connect Your Warehouse

​2. Create a Sync

​3. Set the Schedule

​Incremental Sync Strategies

​Using last_sync_time

​Time-Buffered Queries

​Full Syncs

​Monitoring and Troubleshooting

​Sync History

​Common Errors

​Frequently Asked Questions

Why Sync from Your Data Warehouse?

Supported Warehouses

Key Concepts

Syncs

Field Mapping

Incremental Syncing

Unique Keys

What You Can Sync

Contacts

Custom Objects

Custom Objects with Relationships

Events

Setting Up a Sync

1. Connect Your Warehouse

2. Create a Sync

3. Set the Schedule

Incremental Sync Strategies

Using last_sync_time

Time-Buffered Queries

Full Syncs

Monitoring and Troubleshooting

Sync History

Common Errors

Frequently Asked Questions