Uncategorized

Salesforce Marketing Cloud — Data Extraction

There is normally two main ways to extract data from Salesforce Marketing Cloud in a programmatic manner. Either rely on its’ built in automation studio framework or rely on its’ SOAP API to query some of its’ datasets.

Automation

Data Extracts

Data extracts are setup within Automation Studio, and can be used to exports certain events such key activity metrics such as Bounces, Clicks, Conversion etc

Setting up a data extract is based on two parts 1) A schedule 2) A data extract Activity

Schedule

To setup the schedule, just drag and drop the “schedule” icon onto the starting source column and press configure.

Once you press configure, you will be prompted with the screen shown above, that lets you setup the starting time for the automation and the frequency at which to run it.

Data Extract Activity

Prompted with the below screen, create a new data extract

The properties need to be configured, with name, file naming pattern and extract types.

The data extract, needs to be setup as a tracking extract in order to allow for the extraction of the key entities. Additional configuration is required:

Overall

Once ready, the automation should look like the one shown above. Once this is setup, it will be necessary to create a file transfer activity as next step to transfer the file out to a FTP, based on the file pattern chosen during the data extract configuration.

Queries

Salesforce Marketing cloud allows to query the data available within its’ data extensions and some of its’ core data, through data views, using SQL queries.

Sample fields available in the _Open data view

This allows the extraction of core metrics, such as sends, clicks, bounce … and the merging of that information with additional data sources such as a customer classification table.

The queries can be defined within the automation studio and need to be in a syntax similar to Ms SQL Server 2005. The query will generate an output to a specific data extension, that needs to be created before the query runs. Marketing Cloud offers different options with respect to handling the data already present in the data extension: Append, Replace or Overwrite.

Once the query has run, the data will there before be stored in a data extension within Salesforce Marketing Cloud. There will still be the need to extract that specific dataset. For this a data extract and a transfer file activity will need to be set up ass additional steps within the automation.

In a similar manner to the normal data extracts, we need to configure the data extract activity, in this case however, it will be necessary to select the extract type as a “Data Extension Extract”. This will output the dataset to SFMC secure location. The file transfer activity that needs to follow will transfer the file from this specific location to your desired FTP location.

API

FuelSDK is the official SDK for Salesforce Marketing Cloud, with it it possible to query the different data extensions available and push or retrieve data. From my experience, pulling small amount of data from the API tends to be fine, but for large amount of information but for larger datasets, it is preferable to rely on the FTP integration.

High-level API call

At a high level, the following piece of code would pull the rows from a given

row = FuelSDK.ET_DataExtension_Row() row.auth_stub = stubObj row.Name = data_extension_name row.props = props row.search_filter = data_extension_filter getResponse = row.get()

In order to setup the API call we need the following four parameters

  • An Auth Stub object
  • A Data Extension Name
  • Props
  • A search filter

While the Data Extension’s name is pretty straight forward to get the other parameters needs to explained in a little bit more details.

Auth Stub

The auth Stub can be setup through the ET_Client object from FuelSDK.

import FuelSDK stubObj = FuelSDK.ET_Client( False, False, { ‘clientid’: ‘‘, ‘clientsecret’: ‘‘, ‘defaultwsdl’: ‘https://.soap.marketingcloudapis.com/etframework.wsdl’, ‘authenticationurl’: ‘https://.auth.marketingcloudapis.com’, ‘baseapiurl’: ‘https://.rest.marketingcloudapis.com’, ‘soapendpoint’: ‘https://.soap.marketingcloudapis.com’, ‘useOAuth2Authentication’: ‘True’, ‘applicationType’: ‘server’ })

Props

When retrieving data the props object refers to the different columns that are intended to be retrieved. The props, in this case, is a list of string variables:

props = ["SubscriberKey", "email", "updated_date"]

Filters

SimpleOperators

updated_from = { ‘Property’ : ‘update_date, ‘SimpleOperator’ : ‘greaterThan’, ‘DateValue’ : start_date }

Simple operators allow for comparison between the value contained in a column (Property) and some constant (Value or DateValue). They accept a range of comparison: equal, greaterThan, lessThan … The full list of SimpleOperators provided by Marketing cloud is available here.

ComplexOperator

updated_from = { ‘Property’ : ‘update_date’, ‘SimpleOperator’ : ‘greaterThan’, ‘DateValue’ : start_date } updated_to = { ‘Property’ : ‘update_date’, ‘SimpleOperator’ : ‘lessThan’, ‘DateValue’ : end_date } updated = { ‘LeftOperand’ : updated_from, ‘LogicalOperator’: ‘AND’, ‘RightOperand’ : updated_to }

The ComplexOperator ie: LogicalOperator, allows to combine different search filters together. Supported Logical Operator are OR and AND.

There is currently an open issue for the Python SDK, with respect to allowing the nesting of search queries using this operators.

Response processing

The response object has two particular properties of interest getResponse.results and getResponse.more_results.

The results properties will return the list of rows fetched within the specific query page, while the more_results properties will return a boolean to indicate whether there are additional records to be fetched.

The following is an output from a results call, a List of DataExtensionObject, each representing a row of information:

An example query output

Within the DataExtensionObject, the column names and values are including in a Properties well … object property, with nested another Property property containing a list of APIProperty. If you got lost, and are only looking to extract that information as a dictionary the following piece of code should help:

output = [] for i in range(0, len(results)): x = results[i]

row.getMoreResults() allows in turn to get the extra pages of the results from Marketing cloud and returns [] when there is no other pages.

Leave a Reply

Your email address will not be published. Required fields are marked *