Export Dotdigital events to Snowflake using Data Firehose
Sync Dotdigital engagement events to cloud storage using Data Firehose and ingest them into Snowflake for analytics. This guide covers Firehose setup, storage configuration, Snowflake ingestion options, and validation.
Overview
To continuously export events to cloud object storage, use Data Firehose and load those files into Snowflake.
High-level flow
What you get
- Continuous sync of engagement events, including email, SMS, transactional, push, and some forms-based events. Learn more in Data Firehose event schema.
- Near real-time or scheduled loading into Snowflake (every 15 minutes, hourly, or daily). Learn more in Set up a data firehose.
- Control over security, retention, and transformations.
Before you start
Things you need to know:
- Dotdigital and Snowflake access
You must have access to Dotdigital with the CXDP package and to your Snowflake environment, plus the correct permissions for storage and ingestion.
- Cloud storage
- Amazon S3, Azure Blob Storage, or Google Cloud Storage (GCS).
- A dedicated bucket/container and path, for example: s3://my-dd-events/prod/
- Write permissions to that path.
- Snowflake
- A role with rights to create stages, pipes, file formats, and tables.
- Network/security access to the chosen storage, for example: an IAM role for S3, a SAS token for Azure, or a service account for GCS.
Data Firehose: Configure the sync
- In Dotdigital, go to Connect > Data firehose.
- Choose your destination type:
- Amazon S3 – enter bucket, region, folder path, access key/secret.
- Azure Blob – enter container, folder path, and choose an authentication type: SAS Token or OAuth 2.0 (auth requirements differ by method).
Learn more in Add data Firehose configuration.
- Google Cloud Storage – enter bucket name, folder path, and service account details (JSON key).
- SFTP/FTPS – an alternative when object storage is not available.
- Select event types, for example: Email opens, clicks, sends, bounces, SMS events.
- Choose frequency: every 15 minutes, hourly, or daily.
- Set sync failure notifications: Email, in-app, both, or none.
- Confirm and start the sync.
Check Connect > Data firehose > Report for status and failures.
Data format and schema
- Format – Data Firehose outputs CSV files.
- Schema – See Dotdigital Data Firehose event schema for field names, types, and per-event definitions.
Snowflake: Ingest from object storage
You can load files continuously with Snowpipe (auto-ingest) or in batches with COPY INTO.
For full instructions and the latest syntax examples, see Snowflake’s official documentation:
- Intro to Snowpipe
- Auto-ingest Snowpipe with Amazon S3
- Auto-ingest Snowpipe with Azure Blob Storage
- Auto-ingest Snowpipe with Google Cloud Storage
- Bulk loading using COPY INTO
Connect Snowflake to your storage and define how files are loaded. Choose Snowpipe for automation or COPY INTO for scheduled batch loads.
1. Create the target tables
Data Firehose batches exported events into files of 2,000 rows each.
For example, an email campaign sent to 10,000 contacts will produce five CSV files, each typically ~300 KB in size, and always under 500 KB.
Files are placed under the folder path specified during Firehose setup, organised by event type, for example:
your-example-catalog/emailsends/emailsends_YYYY-MM-DDTHH-MM-SS.sss_0.csv
Batch suffixes (_0, _1, _2) indicate file order; _0 is always present even if only one file is created.
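As a sketch only, a landing table for the emailsends files might look like the following. The column names here are illustrative placeholders; take the real field list from the Data Firehose event schema.

```sql
-- Hypothetical landing table for emailsends events.
-- Columns are placeholders: match them, in order, to the fields
-- documented in the Data Firehose event schema.
CREATE TABLE IF NOT EXISTS dd_events.emailsends_landing (
    event_id        STRING,
    contact_id      NUMBER,
    campaign_id     NUMBER,
    email           STRING,
    event_datetime  TIMESTAMP_NTZ
);
```

Because CSV files load into columns positionally by default, the table's column order must match the field order in the files.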
2. Create a file format
Create a file format in Snowflake that matches your Firehose output format (CSV or JSON).
See Snowflake CREATE FILE FORMAT.
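For example, a minimal CSV file format might look like this. It assumes the exported files include a header row; confirm that against your own files before relying on SKIP_HEADER.

```sql
CREATE FILE FORMAT IF NOT EXISTS dd_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  SKIP_HEADER = 1                       -- assumption: files have a header row
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  EMPTY_FIELD_AS_NULL = TRUE;
```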
3. Create an external stage
Use the storage provider that matches your Firehose destination.
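As an illustration for S3 (the stage, integration, and bucket names are placeholders; Azure and GCS stages follow the same pattern with azure:// or gcs:// URLs and their own integrations):

```sql
-- Assumes a storage integration (dd_s3_integration) has already been
-- created and authorised against the bucket per Snowflake's docs.
CREATE STAGE IF NOT EXISTS dd_events_stage
  URL = 's3://my-dd-events/prod/'
  STORAGE_INTEGRATION = dd_s3_integration
  FILE_FORMAT = dd_csv_format;
```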
4. Choose ingestion method
Snowpipe (auto-ingest, near real-time)
- Create the Pipe to load new files into your landing table.
- Configure the storage to notify Snowflake on new files (S3/SQS, Azure/Event Grid, GCS/PubSub).
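A minimal pipe for the emailsends folder could look like this; all names are placeholders, and the stage and file format are assumed to already exist:

```sql
CREATE PIPE IF NOT EXISTS dd_emailsends_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO dd_events.emailsends_landing
  FROM @dd_events_stage/emailsends/
  FILE_FORMAT = (FORMAT_NAME = dd_csv_format);
```

After creating the pipe, SHOW PIPES returns its notification channel (for example, an SQS ARN on S3), which you wire into your storage provider's event notifications.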
COPY INTO (batch)
- Run on a schedule using Snowflake TASK or your orchestrator.
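One way to schedule this inside Snowflake is a TASK that runs the COPY periodically; the warehouse name, table, stage, and schedule below are placeholders:

```sql
-- Hourly batch load; adjust the warehouse name and schedule to suit.
CREATE TASK IF NOT EXISTS dd_emailsends_load
  WAREHOUSE = load_wh
  SCHEDULE = '60 MINUTE'
AS
  COPY INTO dd_events.emailsends_landing
  FROM @dd_events_stage/emailsends/
  FILE_FORMAT = (FORMAT_NAME = dd_csv_format);

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK dd_emailsends_load RESUME;
```

COPY INTO skips files it has already loaded (Snowflake keeps load metadata for 64 days), so repeated runs are safe.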
Security, networking, and reliability
Dotdigital encrypts all credential secrets in its database. Data files are transferred over secure connections (HTTPS for cloud object storage, SFTP/FTPS for servers).
If encryption at rest is required, configure it on your chosen cloud storage provider, for example: SSE-S3 or SSE-KMS for AWS, Azure Storage encryption, GCS default encryption.
Ensure:
- Dotdigital can write to the target folder and create directories.
- Any IP allow-listing includes Dotdigital outbound IPs.
- Monitoring is configured via Firehose reports and Snowflake's LOAD_HISTORY where applicable.
Test and validate the pipeline
Test the ingestion by running a SELECT query on your target landing table, for example:
SELECT * FROM my_landing_table LIMIT 10;
to confirm rows appear.
