Export Dotdigital events to Snowflake using Data Firehose

Sync Dotdigital engagement events to cloud storage using Data Firehose and ingest them into Snowflake for analytics. This guide covers Firehose setup, storage configuration, Snowflake ingestion options, and validation.

Overview

Data Firehose continuously exports your engagement events as files to cloud object storage; you then load those files into Snowflake.

High-level flow

Dotdigital → Data Firehose → cloud object storage (S3, Azure Blob, or GCS) → Snowflake

What you get

  • Continuous sync of engagement events including email, SMS, transactional, push, and some forms-based events.
    Learn more in Data Firehose event schema.
  • Near real-time or scheduled loading into Snowflake (every 15 minutes, hourly, or daily).
    Learn more in Set up a data firehose.
  • Control over security, retention, and transformations.

Before you start

Make sure you have:

  • Dotdigital and Snowflake access
    You need access to both Dotdigital (with the CXDP package) and Snowflake, with the permissions required to configure storage and ingestion.
  • Cloud storage
    • Amazon S3, Azure Blob Storage, or Google Cloud Storage (GCS).
    • A dedicated bucket/container and path, for example:
      s3://my-dd-events/prod/
    • Write permissions to that path.
  • Snowflake
    • A role with rights to create stages, pipes, file formats, and tables.
    • Network/security access to the chosen storage, for example:
      IAM role for S3, SAS for Azure, service account for GCS.

Data Firehose: Configure the sync

  1. In Dotdigital, go to Connect > Data firehose.
  2. Choose your destination type:
    • Amazon S3 – enter bucket, region, folder path, access key/secret.
    • Azure Blob – enter container, folder path, and authentication details.
    • Google Cloud Storage – enter bucket name, folder path, and service account details (JSON key).
    • SFTP/FTPS – alternative when object storage is not available.
  3. Select event types, for example: Email opens, clicks, sends, bounces, SMS events.
  4. Choose frequency: every 15 minutes, hourly, or daily.
  5. Set sync failure notifications: Email, in-app, both, or none.
  6. Confirm and start the sync.

Check Connect > Data firehose > Report for status and failures.


Data format and schema

  • Format – Data Firehose outputs CSV files.
  • Schema – See Dotdigital Data Firehose event schema for field names, types, and per-event definitions.

Snowflake: Ingest from object storage

You can load files continuously with Snowpipe (auto-ingest) or in batches with COPY INTO.

For full instructions and the latest syntax examples, see Snowflake’s official documentation:

  • Intro to Snowpipe
  • Auto-ingest Snowpipe with Amazon S3
  • Auto-ingest Snowpipe with Azure Blob Storage
  • Auto-ingest Snowpipe with Google Cloud Storage
  • Bulk loading using COPY INTO

1. Create the target tables

Create a landing table in Snowflake for each event type you sync, with columns that match the Data Firehose event schema.

Data Firehose batches exported events into files of 2,000 rows each.
For example, an email campaign sent to 10,000 contacts produces five CSV files of send events (10,000 ÷ 2,000 = 5), each typically ~300 KB in size and always under 500 KB.

Files are placed under the folder path specified during Firehose setup, organised by event type, for example:
your-example-catalog/emailsends/emailsends_YYYY-MM-DDTHH-MM-SS.sss_0.csv

Batch suffixes (_0, _1, _2) indicate file order; _0 is always present even if only one file is created.
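
As a sketch, a landing table for email send events might look like the following. All object and column names here are illustrative; use the actual field names and types from the Data Firehose event schema.

```sql
-- Illustrative landing table for email send events.
-- Replace the columns with the real fields from the
-- Dotdigital Data Firehose event schema.
CREATE TABLE IF NOT EXISTS raw.dd_email_sends (
    event_time   TIMESTAMP_NTZ,
    contact_id   NUMBER,
    campaign_id  NUMBER,
    email        VARCHAR,
    source_file  VARCHAR  -- can be populated from METADATA$FILENAME at load time
);
```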

2. Create a file format

Create a file format in Snowflake that matches the Firehose CSV output.
See Snowflake CREATE FILE FORMAT.
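
For the default CSV output, a matching file format might look like this (the format name is a placeholder):

```sql
-- CSV file format for Data Firehose output.
-- Adjust SKIP_HEADER if your files have no header row.
CREATE FILE FORMAT IF NOT EXISTS raw.dd_csv_format
    TYPE = 'CSV'
    FIELD_DELIMITER = ','
    SKIP_HEADER = 1
    FIELD_OPTIONALLY_ENCLOSED_BY = '"'
    EMPTY_FIELD_AS_NULL = TRUE;
```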

3. Create an external stage

Use the storage provider that matches your Firehose destination.
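
For example, an S3 stage might look like this. The bucket URL and the integration and format names are placeholders; the storage integration must be created beforehand by an account administrator (see Snowflake's documentation for the IAM trust setup):

```sql
-- External stage pointing at the Firehose destination path.
CREATE STAGE IF NOT EXISTS raw.dd_events_stage
    URL = 's3://my-dd-events/prod/'
    STORAGE_INTEGRATION = my_s3_integration
    FILE_FORMAT = raw.dd_csv_format;
```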

4. Choose ingestion method

Snowpipe (auto-ingest, near real-time)

  • Create a pipe that loads new files into your landing table.
  • Configure your storage to notify Snowflake when new files arrive (S3/SQS, Azure Event Grid, GCS Pub/Sub).
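
A pipe for the email sends folder might look like this (all object names and columns are placeholders carried over from the earlier sketches):

```sql
-- Auto-ingest pipe: loads each newly notified file
-- into the landing table.
CREATE PIPE IF NOT EXISTS raw.dd_email_sends_pipe
    AUTO_INGEST = TRUE
AS
COPY INTO raw.dd_email_sends (event_time, contact_id, campaign_id, email, source_file)
FROM (
    SELECT $1, $2, $3, $4, METADATA$FILENAME
    FROM @raw.dd_events_stage/emailsends/
)
FILE_FORMAT = (FORMAT_NAME = 'raw.dd_csv_format');
```

After creating the pipe, `SHOW PIPES` returns the notification channel (for S3, an SQS queue ARN) to use when configuring storage event notifications.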

COPY INTO (batch)

  • Run on a schedule using Snowflake TASK or your orchestrator.
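
For example, an hourly load driven by a Snowflake task (warehouse and object names are placeholders):

```sql
-- Hourly batch load of any new files from the stage.
CREATE TASK IF NOT EXISTS raw.load_dd_email_sends
    WAREHOUSE = my_wh
    SCHEDULE = '60 MINUTE'
AS
COPY INTO raw.dd_email_sends (event_time, contact_id, campaign_id, email, source_file)
FROM (
    SELECT $1, $2, $3, $4, METADATA$FILENAME
    FROM @raw.dd_events_stage/emailsends/
)
FILE_FORMAT = (FORMAT_NAME = 'raw.dd_csv_format');

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK raw.load_dd_email_sends RESUME;
```

COPY INTO tracks load metadata, so files that were already loaded are skipped on subsequent runs.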

Security, networking, and reliability

Dotdigital encrypts all credential secrets in its database. Data files are transferred over secure connections (HTTPS for cloud object storage, SFTP/FTPS for servers).
If encryption at rest is required, configure it on your chosen cloud storage provider, for example: SSE-S3 or SSE-KMS for AWS, Azure Storage encryption, GCS default encryption.

Ensure:

  • Dotdigital can write to the target folder and create directories.
  • Any IP allow-listing includes Dotdigital outbound IPs.
  • Monitoring is configured via Firehose reports and Snowflake’s LOAD_HISTORY where applicable.

Test and validate the pipeline

Test the ingestion by running a SELECT query on your target landing table to confirm that rows appear, for example:

SELECT * FROM my_landing_table LIMIT 10;
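
To go beyond spot-checking rows, you can also review recent load activity for the table (the table name is a placeholder):

```sql
-- Files loaded into the landing table in the last 24 hours.
SELECT file_name, row_count, status, last_load_time
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
    TABLE_NAME => 'raw.dd_email_sends',
    START_TIME => DATEADD('hour', -24, CURRENT_TIMESTAMP())
));
```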