Insight data

What is Insight data?

Insight data is storable information related to your customers’ purchasing histories and habits. It allows you to store collections of data against your contacts or account using our API.

So they’re additional contact data fields, extending the contact data fields we already store?

Yes – but with the prime difference being these are transaction-based ones. Any additional keyed data you have for a contact can be stored as Insight data, with very little restriction on the type of data it can be. As a broad rule, you can store anything that is serializable to JSON.

Just as you can currently segment upon contact data fields such as gender, age and geography, Insight data gives you the ability to segment upon types of items purchased, the regularity of purchases and amount spent on purchases, for instance. As ever, the quality of your Insight data dictates how well you can segment your contacts and personalize your campaigns and offers.

Dataset examples

Here are some examples of datasets that could be stored as Insight data:

For an auction site, a list of all bids a user has made and if they succeeded or not, including the date and time of the bid, the amount of the bid and the category and name of the item bid on.
For a travel agent, a list of destinations a user has visited via bookings on the site, including the number of bookings for the destination (solo, couple, family, etc.,), the amount spent and the type of accommodation.
A list of a user’s likes and dislikes.

The key point is that many different types of Insight data can be stored for each contact, and structured in a way of your choosing.

What can I do with this data?

Once Insight data is stored against your contacts, you can use it to write queries to segment your contacts against this data and create new lists. This enables you to send targeted, personalized campaigns based upon your contacts’ transactional history and habits. Using the above examples, you can run segmentations based on bids made, countries visited and likes and dislikes.

For more on segmenting Insight data, read our article on Insight data segmentation.

How is Insight data stored?

This section provides information and guidelines on storing your Insight data. This is a technical guide, so if you want a non-developers account of Insight data, check out this article.

Record Ids (keys) are used to refer to individual pieces of data

At its heart, Insight data is a key-value pair storage mechanism; you store a piece of data against a contact, using a recordId (key) to refer to it. If you need to update or delete a piece of data at a later date, you use the recordId (key) as a reference to retrieve the piece of data and alter it.

Restrictions on Record Ids (keys)

recordId's must:

Be unique (it is up to the user of the API to ensure uniqueness)

🚧
Keys must be unique across all contacts
Even if you use a contact scoped collection keys must be unique across all contacts. If you try and add a key that already is in use to another contact then the key will be transferred to the new contacts and overwritten with the new record!

recordId's must not:

Exceed 255 characters in length

Data is stored against a contact or as account-scoped data

A contact identifier (email address, mobile number or ContactID) is submitted with each Insight data document uploaded. If the identifier used is "account" rather than an email address, then the record is stored as account-scoped Insight data. This is Insight data not linked to a specific contact record, for example, a product catalog.

JSON data representation

All data to be stored must be serializable as JSON. This means complex object structures can be used to represent data. For example, you could represent the contents of a user’s shopping basket like this:

{  
   "basketId": 4858,  
   "datePurchased": "2013-03-19T17:22:54.8042",  
   "productPurchased": [{  
           "name": "a dvd",  
           "cost": 10.99,  
           "isOnSale": false  
       }, {  
           "name": "a book",  
           "cost": 2.99,  
           "isOnSale": true  
       }  
    ]  
}

There are a few restrictions and extensions to the JSON format as outlined below.

Restrictions on collection names

Valid collection names can only contain the following characters:

Alphanumeric (a-z, A-Z, 0-9)
Hyphen ("-")
Underscore ("_")

Furthermore, collection names can't:

Begin with a number
Exceed 255 characters in length
Have exactly the same name as any other collection, even if the collections are differently scoped (e.g. one collection is contact-scoped and another collection is account-scoped)

Restrictions and extensions on values

String values are restricted to 1000 characters in length.
Date time values are also supported via a subset of the ISO 8601 standard (e.g. 2013-03-19T17:22:55.804Z)
Lists must be of the same type. Lists may only contain either Booleans, strings, numbers, objects, lists, etc.

See the JSON specification for all other value types.

📘
A note on date time formatting
Not all ISO 8601 formats are supported (for example the string "2023" would not be treated as a date). The following are valid for this implementation.
Complete date plus hours, minutes and seconds:  
YYYY-MM-DDThh:mm:ssTZD (e.g. 1997-07-16T19:20:30+01:00)  

Complete date plus hours, minutes, seconds and a decimal fraction of a second  
YYYY-MM-DDThh:mm:ss.sTZD (e.g. 1997-07-16T19:20:30.45+01:00)
where:
 YYYY = four-digit year
 MM   = two-digit month (01=January, etc.)
 DD   = two-digit day of month (01 through 31)
 hh   = two digits of hour (00 through 23) (am/pm NOT allowed)
 mm   = two digits of minute (00 through 59)
 ss   = two digits of second (00 through 59)
 s    = one or more digits representing a decimal fraction of a second
 TZD  = time zone designator (Z or +hh:mm or -hh:mm)
Note: All times without a time zone designator will be treated as UTC.
Retention policy
A retention policy is available for any insight data collection which includes a date type field in its records. The retention policy is a tool that allows you to expire insight data records after a period of time that you set. Expiring insight data means deleting it from your account. This tool can be helpful in managing your insight data allowance.
For a date field to be available to use for the purposes of setting a retention policy, it must be included in the root (top level) of the JSON structure.

JSON stores data in schema-bound collections

While you can store any JSON structure as Insight data, it is not strictly schemaless. Insight data requires that you store similarly structured data in a collection. Each collection may have a different schema, allowing you to store a variety of different data against contacts.

The schema of a collection is defined ‘on the fly’

Schemas are defined implicitly upon uploading data via the API for the first time. Furthermore, schemas are extendable but not editable.

For example, let’s say you’ve just uploaded the following object into a collection you’ve created called preferences:

{  
   "likes": {  
       "animals": [  
          {  
             "animal": "dogs"  
          },  
          {  
             "animal":"cats"  
          }  
       ],  
       "numbers": [  
          {  
             "number": 1,  
          },  
          {  
             "number": 2,  
          },  
          {  
             "number": 7  
          }  
       ]  
   },  
   "dislikes": {  
       "animals": [  
           {  
               "animal": "rats"  
           }  
        ],  
       "numbers": [13]

   }  
}

For all proceeding uploads, the preferences collection will expect an object with two properties, likes and dislikes, each of which refer to an object where each object has a property called animals which is a list of strings and numbers which is a list of numbers.

Types are immutable

Once you’ve defined a named property, its type can’t be changed. For example, if you upload this JSON object to define your schema for a collection:

{  
   "id": "1",  
   "name": "Tom Bloggs"  
}

You couldn’t then upload the following object:

{  
   "id": 2,  
   "name": {  
       "first": "john",  
       "second": "smith"  
   }  
}

Why? Because the id property has been changed from a string to a number and the name property has been changed from a string to an object.

Objects are extendable

The only things that are mutable are objects because they are extendable. Properties can be added to an object at any time (however they are then bound by the above rule). For example, if this object was uploaded to define a collection’s schema:

{  
   "fullName": "John Bloggs"  
}

Then uploading this object would extend the collection's schema to include the two new properties, firstName and lastName:

{  
   "fullName": "John Bloggs"  
   "firstName": "John",  
   "lastName": "Bloggs"  
}

📘
Get the schema for an Insight data collection
If you are having trouble inserting new records into a collection due to schema mismatches you can:

Retrieve the schema using the get insight data collection schema call, and study the schema to see where your new record differs.

Empty the collection using the empty insight data collection call, which empties all records and resets the schema. You can then start again and define the schema using the first record you add to the collection.

Bulk import Insight data

We strongly encourage the use of our bulk calls to add Insight data to Dotdigital, as bulk calls can be many factors more efficient than making individual calls.

Import using the API

The most efficient pattern is to pass your data to us in the fewest calls to bulk import Insight data to add or update insight data to either an account or contacts. To achieve this it is important to batch your data within your system and then send it to us. You can send a request up to 50MB, so many records can be sent in a single call.

You can perform multiple bulk imports at a time for an account. Typically we'd suggest 5 as a maximum, for optimal speed and performance. You should then poll periodically (recommend every 5 seconds) to check on the status of your bulk import using the retrieve Insight data import status call passing the import ID the bulk import Insight data returned. When imports complete you can then add more data using the bulk import Insight data call.

🚧
Ensure contacts already exist to successfully bulk add transactional data to them
You need to ensure that the contacts you're bulk adding transactional data to already exist in the account. Contacts aren't created in this operation.

Import using the app

To import using the app, check out the article Import insight data.

🚧
Plan ahead
Given the nature of Insight data storage, you should spend some time thinking about how your Insight data needs to work for you. A little bit of planning goes a long way: what sort of data do you want to store? How might you want to extend this data as you move forward?
The point to remember is that you don’t want to become backed into a corner by initiating a schema that later proves constrictive. For example, you probably want to avoid entering your product ID as an integer. This won’t give you very easily identifiable information and you won’t be able to change it once it is uploaded. To change it, you would need to start again with a new schema and re-upload all of your data to conform to it. This is definitely something to be avoided!

Data schemas

Set out below are the standard order and product Insight data schemas as commonly used by our integrations. These schemas are still extendable, however. Certain attributes are mandatory for using our product recommendations feature. These are indicated in bold.

Order Insight data schema

The order collection is used to provide sales information on a per contact basis. This information is used to power our product recommendations and ecommerce analytics and persona rating.

📘
This is a contact scoped collection

Collection name and type

When creating the collection using the create Insight data collection call the order collection must be named orders and the type must be orders

Fields

🚧
You must provide the required fields and correct collection name
Bold fields indicate that they are required for our product recommendations feature and other commerce analytics.
If you include any of the other fields listed in the table (that aren't required to use our commerce features), you must match the data type listed in the table to avoid disrupting the commerce features.

Field		Type	Required	Notes
id		string	no
order_total		numerical	yes
payment		string	no
delivery_method		string	no
delivery_total		numerical	no
currency		string	yes	The currency attribute must be one that is supported. You can find a list of supported currency codes and how to format them in the Currency conversion article.
order_status		string	no
email		string	no
quote_id		string	no
purchase_date		date	yes
billing_address		object	no
	billing_address_1	string	no
	billing_address_2	string	no
	billing_city	string	no
	billing_country	string	no
	billing_postcode	string	no
delivery_address		object	no
	delivery_address_1	string	no
	delivery_address_2	string	no
	delivery_city	string	no
	delivery_country	string	no
	delivery_postcode	string	no
products		array	yes
	name	string	yes
	price	numerical	yes	This price should correspond to the item's unit price
	sku	string	yes
	qty	numerical	yes
order_subtotal		numerical	yes
base_subtotal_incl_tax		numerical	no
discount_amount		numerical	no	The discount amount that's deducted from the order total. This should be a positive value.
couponCode		string	no

Wishlist Insight data schema

The Wishlist collection is used to provide the ability to store product wishlists for contacts.

📘
This is a contact scoped collection

Collection name and type

When creating the collection using the create Insight data collection call the order collection must be named wishlist and the type must be wishList

Fields

🚧
You must provide the required fields and correct collection name
Bold fields indicate that they are required

Field		Type	Required	Notes
id		string	yes	Unique id for the wishlist.
customerId		string	yes	The customer id in the ecommerce platform
email		string	yes	The email address of the customer in the ecommerce platform.
updatedAt		string	yes	Last time the wishlist was updated in UTC ISO8601 format
totalWishListValue		numerical	yes	The combined value of all the items on the wishlist.
items		array	yes
	sku	string	yes	The SKU for the product item.
	name	string	yes	Product item display name.
	qty	numerical	yes	Quantity of the items desired.
	price	numerical	yes	This price should correspond to the item's unit price.
	totalValueOfProduct	numerical	yes	price times qty.

Cart Insight data schema

The cart collection is used to provide the information needed to support our abandoned cart features.

📘
This is a contact scoped collection

Collection name and type

When creating the collection using the create Insight data collection call the order collection must be named cartinsight and the type must be cartInsight

Fields

🚧
You must provide the required fields and correct collection name
Bold fields indicate that they are required for our abandoned cart features to operate correctly

Field		Type	Required	Notes
cartid		string	yes

cartphase		string	yes
currency		string	yes	The currency attribute must be one that is supported. You can find a list of supported currency codes and how to format them in the Currency conversion article.
subtotal		numerical	yes
taxamount		numerical	yes	Should be 0 is taxes aren’t available or calculated yet.
grandtotal		numerical	yes	The final total of the order, including tax, discounts, etc., if known
discountAmount		numerical	no	The amount discounted on the entire order. This should be a positive value.
carturl		string	yes
createdat		date time	yes	UTC in ISO8601 format
modifieddate		date time	yes	UTC in ISO8601 format
lineitems		array	yes
	sku	string	yes
	name	string	yes
	quantity	numerical	yes
	unitprice	numerical	yes	This price should correspond to the item's unit price
	totalprice	numerical	yes
	imageurl	string	yes
	producturl	string	yes
programID		numerical	no	System fields used by Dotdigital native abandoned cart engine
cartDelay		string	no	System fields used by Dotdigital native abandoned cart engine

Product Insight data schema

📘
This is an account scoped collection

The product catalog collection can be used to provide Dotdigital with one or more product catalogs that can be correlated with any order data and used to provide product recommendations.

Collection name and type

When creating the collection using the create Insight data collection call the product catalog collections must be named with a prefix of catalog_ (e.g. catalog_snowAndFun) and the type must be catalog

Fields

🚧
You must provide the required fields and correct collection name
Bold fields indicate that they are required for our product recommendations feature and other commerce analytics.

Field		Type	Required	Notes
id		string	yes	A unique id for this product
parent_id		string	no	This field is used to indicate whether the product is linked to a parent (or configurable) product. The value will be the ID of that parent product. If the product is standalone or is itself a parent product, the value can be blank or remain as the parent ID respectively.
name		string	yes
price		numerical	yes
specialPrice		numerical	no
price_incl_tax		numerical	no
specialPrice_incl_tax		numerical	no
url		string	yes
sku		string	yes
created_date		date time	no	This field is required to use New in store product recommendations. If you do not provide a value, this field is automatically populated with a default date value of 01/01/0001.
stock		numerical	no
type		string	no	This field is used to define the kind of product (in terms of hierarchy). Some possible values include 'Configurable', 'Bundle', 'Variant' or 'Virtual'. The possible values will be dictated by your ecommerce platform.
status		string	no
image_path		string	yes
categories		array	no	The categories field provides a way to link a product to a category or a set of categories. Using the categories field requires to synchronize a separate categories__ insight collection, as described further below. The categories field must be an array of objects, including an id field as a string value. These id(s) provided need(s) to match category ids available in the categoriescollection.
	id	object	yes	Example: [ { “id” : ”123” }, { “id”: “456” } ]

Categories Insight data schema

📘
This is an account scoped collection

Categories insight collections can be used to store the categories of your products, as defined in your ecommerce store or ERP system. It allows you to segment contacts based on the product categories they purchase, or apply filters to your catalog or product recommendations.

Collection name and type

When creating the collection using the create Insight data collection call the product categories collection must be named with a prefix of categories_ and have a suffix matching the product catalog it relates to; the type must be productCategories

For example:

For the catalog SnowYo_Retail you should create categories_SnowYo_Retail

Fields

🚧
You must provide the required fields and correct collection name
Bold fields indicate that they are required for our product recommendations feature and other commerce analytics.

Field	Type	Required	Notes
id	string	yes
name	string	yes
parent_id	string	no	parent_id can be used to create parent-child hierachry between categories. For example, if 'Clothing' has the 'id':'123', 'Dresses' ('id':'456') could have 'parent_id' set to '123'. If a category has not got any defined parent, you can use 'parent_id':'0'.

What is Insight data?

So they’re additional contact data fields, extending the contact data fields we already store?

Dataset examples

What can I do with this data?

How is Insight data stored?

Record Ids (keys) are used to refer to individual pieces of data

Restrictions on Record Ids (keys)

🚧Keys must be unique across all contacts

Data is stored against a contact or as account-scoped data

JSON data representation

Restrictions on collection names

Restrictions and extensions on values

📘A note on date time formatting

JSON stores data in schema-bound collections

The schema of a collection is defined ‘on the fly’

Types are immutable

Objects are extendable

📘Get the schema for an Insight data collection

Bulk import Insight data

Import using the API

🚧Ensure contacts already exist to successfully bulk add transactional data to them

Import using the app

🚧Plan ahead

Data schemas

Order Insight data schema

📘This is a contact scoped collection

Collection name and type

Fields

🚧You must provide the required fields and correct collection name

Wishlist Insight data schema

📘This is a contact scoped collection

Collection name and type

Fields

🚧You must provide the required fields and correct collection name

Cart Insight data schema

📘This is a contact scoped collection

Collection name and type

Fields

🚧You must provide the required fields and correct collection name

Product Insight data schema

📘This is an account scoped collection

Collection name and type

Fields

🚧You must provide the required fields and correct collection name

Categories Insight data schema

📘This is an account scoped collection

Collection name and type

Fields

🚧You must provide the required fields and correct collection name

🚧
Keys must be unique across all contacts

📘
A note on date time formatting

📘
Get the schema for an Insight data collection

🚧
Ensure contacts already exist to successfully bulk add transactional data to them

🚧
Plan ahead

📘
This is a contact scoped collection

🚧
You must provide the required fields and correct collection name

📘
This is a contact scoped collection

🚧
You must provide the required fields and correct collection name

📘
This is a contact scoped collection

🚧
You must provide the required fields and correct collection name

📘
This is an account scoped collection

🚧
You must provide the required fields and correct collection name

📘
This is an account scoped collection

🚧
You must provide the required fields and correct collection name