Marketing Analytics Data Overview and General Information
Marketing Analytics, as the embedded data team for the Marketing Org, relies heavily upon and adheres closely to the guidelines, rules, and best practices of the Central Data Team (CDT). ßBelow are a few brief overviews of core concepts involved in the data work performed by the Marketing Analytics team - for further details and more granular information, please review the CDT/Enterprise Data Team handbook pages.
Data Modeling Best Practices
As one of the core principles of the CDT’s data modeling practices, Marketing Analytics follows the guidelines for creating Trusted Data wherever and whenever possible. Essentially, Trusted Data (TD) means that the models, at every step of the way, adhere to all coding and style guides and the information/results have been fully vetted as accurate and reliable. For you, the consumer, it means that you know the numbers and results you are viewing are correct.
Data modeling is both an artform and a science. There are numerous methodologies and guides to perform this task, and CDT has provided the Marketing Analytics team (and all other embedded data teams) with a number of very helpful tools - ranging from the SQL Style Guide, which outlines the ideal formats and functions for all SQL code in the data warehouse; to the Kimball Methodology, the framework for the ways that the various levels of models should be constructed and fit together to form the final tables ingested into Tableau for analysis.
The Kimball Methodology outlines key “stages” of table creation along the path from the raw source data to the fully ingestible mart/report tables most heavily used in Tableau and other BI tools.
Raw/Source
We start at the source layer, which is the system housing the data that needs to be analyzed (i.e. Salesforce, Marketo, Iterable, etc.)
The raw fields for each object (i.e. Opportunity, Account, Contact, etc.) are ingested, formatted, and named clearly and accurately in the raw/source table.
Prep
The raw/source fields are next pulled into a prep layer. It is here that base-level calculations, transformations, and other logical modeling steps are performed on the raw data to prepare it for the next stage in its data journey.
The prep layer is often the most complex of the layers, as it is the core repository for the primary logic performed on the raw data.
Dim/Fct
The prep models are split into two distinct tables
Fct: This table has a single unique ID as well as numerous other joinable IDs for a given object/data set. It is here that dates, IDs, and measurements (the numbers involved in business decisions) are stored and prepared.
Dim: This table houses a single unique ID for a given object/data set and then all of the descriptive information of a single ID-row of a fct table.
Each ID in a fct will have a single Dim table, and each Dim table will only have a single ID within it.
Dims are joined together by joining to the fct table that they share in common.
Mart
The mart layer combines the dims and fct models together to become a complete whole.
Marts have very little logic or calculations within them as they serve, primarily, to present a queryable final table for a given object or business process.
Report (rpt)
Report tables are built off of marts (usually). They are filtered offshoots of the whole mart data set and serve as a queryable table for a refined business question or process.
Example: You may have a mart of all opportunities in Salesforce, while having a report/rpt of all closed won opportunities - pre-filteering the mart’s results provides a smaller and more agile data set when you know you only want a portion of the mart’s whole.
Tech Stack Overview
CDT’s and thus Marketing Analytics’ data pipeline tech stack is straightforward, efficient, and effective.
It starts with extracting the raw/source data from each source system, as shown in the CDT Sources table. The extraction happens with a variety of tools based on which tool best supports the specific source system. In Marketing we primarily rely on Stitch (SFDC) and FiveTran (Marketo and Iterable and others). The extraction tool grabs all specified columns for each synced table/object and creates a raw-level table of those fields.
These raw tables are extracted into Snowflake, the repo/hub of all data at GitLab.
The next layer is the data build tool (dbt), which is a transformation layer of the modeling process. In dbt, SQL and other code is written to further transform, refine, calculate, and concatenate various fields and logic sets to create the prep layer as mentioned above.
The finished tables that dbt creates are once again pushed to Snowflake, where the dbt-created tables will live, accessible in the CDT data catalog, otherwise referred to as the Data Warehouse.
The last step in the process is to ingest the finished dbt-created, snowflake-housed tables into our BI tool of choice, which is usually Tableau - where dashboards and charts are created from the various tables, to allow easy analyses.
Data Flow
Source System
Source Object
Source Table
Prep Table
Dim Table
Fct Table
Mart Table
SFDC
Account
sfdc_account_source
prep_crm_account
dim_crm_account
fct_crm_account
mart_crm_account
SFDC
Bizible Attribution Touchpoint
sfdc_bizible_attribution_touchpoint_source
prep_crm_attribution_touchpoint
dim_crm_touchpoint
fct_crm_attribution_touchpoint
mart_crm_attribution_touchpoint
SFDC
Bizible Person
sfdc_bizible_person_source
prep_crm_person
dim_crm_person
fct_crm_person
mart_crm_person
SFDC
Bizible Touchpoint
sfdc_bizible_touchpoint_source
prep_crm_touchpoint
dim_crm_touchpoint
fct_crm_touchpoint
mart_crm_touchpoint
SFDC
Campaign
sfdc_campaign_source
prep_campaign
dim_campaign
fct_campaign
N/A
SFDC
Campaign Member
sfdc_campaign_member_source
prep_campaign
dim_campaign
fct_campaign
N/A
SFDC
Contact
sfdc_contact_source
prep_crm_person
dim_crm_person
fct_crm_person
mart_crm_person
SFDC
Event
sfdc_event_source
prep_crm_event
dim_crm_event
fct_crm_event
mart_crm_event
SFDC
Lead
sfdc_lead_source
prep_crm_person
dim_crm_person
fct_crm_person
mart_crm_person
SFDC
Opportunity
sfdc_opportunity_source
prep_crm_opportunity
dim_crm_opportunity
fct_crm_opportunity
mart_crm_opportunity
SFDC
Task
sfdc_task_source
prep_crm_task
dim_crm_task
fct_crm_task
mart_crm_task
SFDC
Users
sfdc_users_source
prep_crm_user
dim_crm_user
N/A
N/A
Marketo
Activitiy-specific source tables
Marketo model source
prep_marketo_activity
dim_marketo_activity
fct_marketo_activity
mart_marketo_activity
Marketo
Lead/Person
marketo_lead_source
prep_marketo_person
N/A
N/A
mart_marketo_person
Iterable
Campaign History
iterable_campaign_history_source
prep_iterable_campaign
WIP
WIP
WIP
Iterable
Campaign List History
iterable_campaign_list_history_source
prep_iterable_campaign
WIP
WIP
WIP
Iterable
Campaign Metrics
iterable_campaign_metrics_source
prep_iterable_campaign_metrics
WIP
WIP
WIP
Iterable
Campaign Suppression List History
iterable_campaign_suppression_list_history_source
prep_iterable_campaign
WIP
WIP
WIP
Iterable
Channel
iterable_channel_source
prep_iterable_user
WIP
WIP
WIP
Iterable
Email Link Param History
iterable_email_link_param_history_source
prep_iterable_email_template
WIP
WIP
WIP
Iterable
Email Template History
iterable_email_template_history_source
prep_iterable_email_template
WIP
WIP
WIP
Iterable
Event
iterable_event_source
prep_iterable_event
WIP
WIP
WIP
Iterable
List
iterable_list_source
prep_iterable_list
WIP
WIP
WIP
Iterable
List User History
iterable_list_user_history_source
prep_iterable_list
WIP
WIP
WIP
Iterable
Message Type
iterable_message_type_source
prep_iterable_message_type
WIP
WIP
WIP
Iterable
Template History
iterable_template_history_source
prep_iterable_email_template
WIP
WIP
WIP
Iterable
User History
iterable_user_history_source
prep_iterable_user
WIP
WIP
WIP
Iterable
User Unsubscribed Channel
iterable_user_unsubscribed_channel_source
prep_iterable_user
WIP
WIP
WIP
Iterable
User Unsubscribed Message
iterable_user_unsubscribed_message_type_source
prep_iterable_message_type
WIP
WIP
WIP
Marketing Data Models
All GitLab data models are grouped and categorized in snowflake/dbt through a series of folders and hierarchy. Below are the primary Marketing data models, there folders, links, and descriptions. We’ve added another lvel of organization, the Grouping field, which is a custom grouping of the data models based on their usage and lineage. The list of groupings below, preceding the data model inventory table, will you give you a brief understanding of what each grouping represents.
Groupings
Grouping
Description
Event Performance
Models to calculate and analyze Event data streams.
Models built off the Lead to Revenue model. L2R is a consolidation of SFDC PErsons (leads/contacts), Accounts, Opportunities, and Bizible Touchpoints.
Marketo
Models built off of Marketo source data. These are often used in conjuction with the L2R and Presentation Layer Model groupings for additional fields and information.
Presentation Layer Model
Models used in active and public Tableau Dashboards. These models are the most commonly used and presented for analysis.
Models built off of the SFDC Source tables. These are often used in conjuction with the L2R and Presentation Layer Model groupings for additional fields and information.
Web
Models built to model out web traffic data (top of funnel).
Tableau-specific table that unions mart_crm_touchpoint with person, campaign data joined on and mart_crm_attribution_touchpoint with person, campaign and oppunitiy data.
Tableau-specific table that joins Zuora, Opportunity, and Lead To Revenue data to hone in on the community-owned opportunities and their related data points.
A refined subset of the rpt_lead_to_revenue, focused exclusively on prospect data. It includes records where touchpoint dates and pipeline creation dates fall within the rolling past 365-day window
A consolidation and combination of the major SFDC Objects: Lead, Contact, Account, Opportunity, and Touchpoints. This model supports full funnel analysis and cohorting of data.