Warehousing: Getting Started - Redshift - Step 4
Select the Schema(s) to Share
In this step, you’ll select the Data Schemas you want Rockerbox to share with your Redshift instance.
- For Platform Data, select the platform(s) you want to sync over. Most clients will select all platforms they’re actively spending on.
- For Rockerbox Data, first select the Conversion, and then the dataset to share. You can select multiple conversion-dataset combinations.
- For both Platform Data and Rockerbox Data, you can update the default Table Name to something more descriptive.
Once you’ve selected each dataset, be sure to click Sync this dataset to begin the share.
Backfilling Historical Data
For Platform Data, no additional steps are required for backfilling. You will be able to see historical data after setting up the share.
For Rockerbox Data and for each Conversion, Rockerbox will sync data back to the First Reporting Date. If you did not set a First Reporting Date for a Conversion, then Rockerbox will sync one day of data.
Note: this backfill may take 24 hours to complete.
Enable Queries of JSON Data
Run the following to set
json_serialization_enable as true for each user:
ALTER USER [username] SET json_serialization_enable TO true;
When set to TRUE, the nested JSON in columns of type "dictionary" ("dict" in schema documentation) can be accessed with JSON functions.
- See Rockerbox schema documentation for the list of columns in each dataset that are type dictionary ("dict")
- Redshift documentation for more context about this setting.
You will see the following errors if this setting is not set to TRUE:
SELECT * FROM spectrum.customers LIMIT 1;
=> ERROR: Nested tables do not support '*' in the SELECT clause.
SELECT name FROM spectrum.customers LIMIT 1;
=> ERROR: column "name" does not exist in customers