Big Query

Updated by Kelsey Kearns

This page will guide you through enabling the BigQuery integration for the first time.

Requirements

In order to successfully set up your BigQuery integration, you will need:

  • A Google Cloud Platform (GCP) account
  • A project created with BigQuery resource enabled
  • Desired region for your bucket and BigQuery to live in
  • A user with admin access to the project’s BigQuery resource

Steps to Complete the Integration

  1. Create a BigQuery dataset in your desired project in Google Cloud Platform.

Note on multi-region support
While we can support multi-region datasets, GCP recommends colocating your BigQuery dataset within a single region for better performance. Please check Google documentation for further information.
  1. Open the Warehousing setup page. Select BigQuery and click Connect to BigQuery.
  2. Enter your dataset region, project ID, and dataset ID and click SetupBigQuery

  1. After initializing your setup we’ll provide you an email address to a unique service account. Click the icon to the left of the service account name to copy this to your clipboard.
  1. Go back into Google Cloud Platform and grant the Rockerbox service account access to your BigQuery dataset with the DataEditor role roles/bigquery.dataEditor. This enables the Rockerbox service account to create tables in your dataset.
  2. Go back to Rockerbox to grant your additional Google Cloud Platform accounts access to your Rockerbox data in BigQuery. These accounts can be a single user, group email, or a service account.
  3. Notify your Rockerbox Customer Success team that you completed steps #1-6. Due to a Big Query limitation, we have to sync the underlying datasets into BigQuery before you can create the data schemas in the next step. We will let you know when you are unblocked to proceed with setting up your tables.
  4. Select the Data Schemas you want Rockerbox to share with your BigQuery instance.
    • For Platform Data, select the platform(s) you want to sync over. Most clients will select all platforms they’re actively spending on.
    • For Rockerbox Data, first select the Conversion, and then the dataset to share. You can select multiple conversion-dataset combinations.
    • For both Platform Data and Rockerbox Data, you can update the default Table Name to something more descriptive.
    Once you’ve selected each dataset, be sure to click Sync this dataset to begin the share.

Backfilling Historical Data

For Platform Data, no additional steps are required for backfilling. You will be able to see historical data after setting up the share.

For Rockerbox Data and for each Conversion, Rockerbox will sync data back to the First Reporting Date. If you did not set a First Reporting Date for a Conversion, then Rockerbox will sync one day of data.

This backfill may take 24 hours to complete

BigQuery Table Limitations
The MTA and Conversions schemas have a limitation for BigQuery only. The columns with nested data url_parameters, utm_parameters, and additional_attributes will not automatically be updated with new column data. For example, if previously there are 5 url parameters and then a 6th is added, the 6th will not appear without manual intervention by Rockerbox. Please contact Rockerbox if there are missing columns in the url_parameters, utm_parameters, and additional_attributes fields that you need in your warehouse tables.


How did we do?