Prepare event data
Before creating an Event Stitching project, make sure your event models contain the fields DinMo needs to process events safely and to explain the resulting event profile graph.
Event Stitching works best when each input model is an event table: one row represents one event that happened at a specific time.
What you need
| Requirement | Description |
| --- | --- |
| Event models | Tables or models where each row is an event, such as page views, sessions, app events, purchases, or conversions. |
| Model primary key | Stable identifier for one event row. DinMo uses the model primary key for idempotency and audit. |
| Timestamp field | Standard timestamp field on the model. DinMo uses it to order observations and evaluate stitching lifetime. |
| Event partition column | Date or timestamp column used to select complete event windows efficiently. |
| Event identifiers | Columns that carry identity evidence, such as user ID, email, anonymous ID, cookie ID, device ID, session ID, click ID, or IP address. |
| Output permissions | Permission to create or replace Event Stitching output tables in the configured output dataset or schema. |
Good input models
Good Event Stitching inputs are behavioral tables:
web events
app events
product usage events
sessions
purchases
conversions
support interactions
campaign interactions
Avoid profile-like tables such as contacts, accounts, users, leads, subscribers, and customers. Event Stitching expects event-grain models.
Model primary key
The model primary key must identify one event row inside a source model.
Good primary keys are:
stable across reruns
non-null
unique inside the event model
not derived from mutable fields
If two selected models can produce the same primary key value, DinMo still keeps them separate by source model, so a key only needs to be unique within its own model, not across models.
Use this query to check duplicate primary keys in one model:
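A minimal sketch of such a check, assuming a hypothetical table named events whose primary key column is event_id. The SQL is wrapped in Python's built-in sqlite3 module only so the example can be run locally; adapt the table name, column name, and SQL dialect to your warehouse.

```python
import sqlite3

# Hypothetical table and column names; substitute your own model and key.
DUPLICATE_KEYS_SQL = """
SELECT event_id, COUNT(*) AS row_count
FROM events
WHERE event_id IS NOT NULL
GROUP BY event_id
HAVING COUNT(*) > 1
ORDER BY row_count DESC, event_id
"""

NULL_KEYS_SQL = "SELECT COUNT(*) FROM events WHERE event_id IS NULL"


def find_duplicate_keys(conn):
    """Return (key, row_count) pairs for keys that appear on more than one row."""
    return conn.execute(DUPLICATE_KEYS_SQL).fetchall()


def count_null_keys(conn):
    """Return the number of rows whose primary key is null."""
    return conn.execute(NULL_KEYS_SQL).fetchone()[0]


def demo_connection():
    """Build a tiny in-memory table so the queries can be tried locally."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (event_id TEXT, event_timestamp TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [
            ("e1", "2024-05-01T10:00:00Z"),
            ("e2", "2024-05-01T10:05:00Z"),
            ("e2", "2024-05-01T10:05:00Z"),  # duplicate key
            (None, "2024-05-01T10:06:00Z"),  # null key
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(find_duplicate_keys(conn))
    print(count_null_keys(conn))
```

An empty result from the first query and a zero from the second suggest the key is unique and non-null in that model. Neither query can show whether the key is stable across reruns.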
Timestamp field
The model timestamp field should represent when the event happened, not when the row was loaded into the warehouse.
Check for:
null timestamps
future timestamps
timestamps far outside the expected backfill range
inconsistent timezone handling
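The first three checks above can be sketched as a single query. This is an illustration under stated assumptions: the table events, the column event_timestamp, and the two bounds passed as parameters are all hypothetical, and timestamps are assumed to be stored as UTC ISO 8601 text so that plain comparison orders them correctly. The SQL runs through Python's built-in sqlite3 module only to keep the example self-contained.

```python
import sqlite3

# Hypothetical table and column names; :now and :earliest are caller-supplied bounds.
TIMESTAMP_QUALITY_SQL = """
SELECT
  COUNT(*) AS total_rows,
  SUM(CASE WHEN event_timestamp IS NULL THEN 1 ELSE 0 END) AS null_timestamps,
  SUM(CASE WHEN event_timestamp > :now THEN 1 ELSE 0 END) AS future_timestamps,
  SUM(CASE WHEN event_timestamp < :earliest THEN 1 ELSE 0 END) AS before_backfill
FROM events
"""


def timestamp_quality(conn, now, earliest):
    """Count null, future, and too-old timestamps in the events table."""
    row = conn.execute(
        TIMESTAMP_QUALITY_SQL, {"now": now, "earliest": earliest}
    ).fetchone()
    keys = ("total_rows", "null_timestamps", "future_timestamps", "before_backfill")
    return dict(zip(keys, row))


def demo_timestamp_connection():
    """Build a tiny in-memory table with one row per kind of problem."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (event_id TEXT, event_timestamp TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [
            ("e1", "2024-05-01T10:00:00Z"),  # plausible
            ("e2", None),                    # null timestamp
            ("e3", "2031-01-01T00:00:00Z"),  # in the future
            ("e4", "1970-01-01T00:00:00Z"),  # far before the backfill range
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_timestamp_connection()
    print(timestamp_quality(conn, "2024-06-01T00:00:00Z", "2023-01-01T00:00:00Z"))
```

This sketch does not detect inconsistent timezone handling. That usually needs a comparison against a trusted reference, such as a load timestamp or a second source for the same events.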
Event partition column
The event partition column is the field DinMo uses to select bounded windows.
Use a column that:
is present on every selected event model
is a date or timestamp field
follows the same calendar as the event timestamp
lets DinMo select complete windows, usually days
matches the physical partitioning or clustering strategy of the source table when possible
The best default is usually the event timestamp itself, or a derived event date that is physically partitioned in the warehouse.
Check daily volume before creating the project:
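A minimal sketch of a daily volume check, assuming a hypothetical table events with an event_date partition column stored as an ISO date (YYYY-MM-DD). The query counts rows per day, and a small helper lists the days inside the observed range that have no rows at all. Python's built-in sqlite3 module is used only to make the example runnable locally.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical table and column names; event_date is assumed to be an ISO date string.
DAILY_VOLUME_SQL = """
SELECT event_date, COUNT(*) AS event_count
FROM events
WHERE event_date IS NOT NULL
GROUP BY event_date
ORDER BY event_date
"""


def daily_volume(conn):
    """Return (event_date, event_count) pairs ordered by day."""
    return conn.execute(DAILY_VOLUME_SQL).fetchall()


def days_without_events(volume):
    """List ISO dates between the first and last observed day that have no rows."""
    if not volume:
        return []
    seen = {day for day, _ in volume}
    current = date.fromisoformat(volume[0][0])
    last = date.fromisoformat(volume[-1][0])
    missing = []
    while current <= last:
        if current.isoformat() not in seen:
            missing.append(current.isoformat())
        current += timedelta(days=1)
    return missing


def demo_volume_connection():
    """Build a tiny in-memory table with a one-day gap."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (event_id TEXT, event_date TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [
            ("e1", "2024-05-01"),
            ("e2", "2024-05-01"),
            ("e3", "2024-05-02"),
            ("e4", "2024-05-04"),  # nothing on 2024-05-03
        ],
    )
    return conn


if __name__ == "__main__":
    volume = daily_volume(demo_volume_connection())
    print(volume)
    print(days_without_events(volume))
```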
If many days have zero events, the project can still run, but the schedule and backfill window should match the real event cadence.
Identifier fields
Identifiers are the values DinMo can use to connect events.
Common identifiers:
| Identifier | Strength | Guidance |
| --- | --- | --- |
| User ID, customer ID, account ID | Strong | Good anchor candidates when they are authenticated and stable. |
| Email, email hash, phone | Strong to medium | Good when standardized and governed. |
| Anonymous ID, cookie ID, device ID, client ID | Medium to weak | Useful for pre-login behavior; protect with lifetime and profile-per-value limits. |
| Session ID | Weak | Use for short windows only. |
| Click IDs | Weak | Useful for attribution windows; avoid long lifetimes. |
| IP address | Weakest | Use only with strict policy or for audit. |
Do not map campaign fields, page URLs, product names, country, channel, utm_* fields, or free-text values as identifiers unless they are intentionally used as identity evidence.
Check identifier coverage before setup:
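A minimal sketch of a coverage check, assuming a hypothetical table events with user_id and anonymous_id columns. It reports the share of rows that carry a non-empty value for each identifier, treating null and blank strings as missing. The SQL is run through Python's built-in sqlite3 module only so the example is self-contained.

```python
import sqlite3

# Hypothetical table and column names. NULLIF(TRIM(x), '') turns blank strings
# into NULL so COUNT() skips them along with real nulls.
COVERAGE_SQL = """
SELECT
  COUNT(*) AS total_rows,
  COUNT(NULLIF(TRIM(user_id), '')) AS rows_with_user_id,
  COUNT(NULLIF(TRIM(anonymous_id), '')) AS rows_with_anonymous_id
FROM events
"""


def identifier_coverage(conn):
    """Return the fraction of rows carrying each identifier."""
    total, with_user_id, with_anonymous_id = conn.execute(COVERAGE_SQL).fetchone()
    if total == 0:
        return {"user_id": 0.0, "anonymous_id": 0.0}
    return {
        "user_id": with_user_id / total,
        "anonymous_id": with_anonymous_id / total,
    }


def demo_coverage_connection():
    """Build a tiny in-memory table with missing and blank identifiers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (event_id TEXT, user_id TEXT, anonymous_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [
            ("e1", "u1", "a1"),
            ("e2", None, "a2"),  # no user ID yet (pre-login)
            ("e3", "", "a3"),    # blank user ID counts as missing
            ("e4", "u2", None),
        ],
    )
    return conn


if __name__ == "__main__":
    print(identifier_coverage(demo_coverage_connection()))
```

What counts as meaningful coverage depends on the model: low user ID coverage on anonymous web events can be normal, while low coverage on a conversions model usually deserves a closer look.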
Placeholder and polluted values
Bad values should be blocked before the first production run.
Common examples:
unknown
undefined
null
none
test
00000000-0000-0000-0000-000000000000
empty strings after standardization
Find common low-quality values:
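A minimal sketch of such a check, assuming a hypothetical table events and an anonymous_id column. It ranks the most frequent values after a simple standardization (trim and lowercase, which may differ from the standardization DinMo applies) and flags the ones that match the placeholder examples listed above. Python's built-in sqlite3 module is used only to keep the example runnable.

```python
import sqlite3

# Hypothetical table and column names; :row_limit caps how many values are returned.
TOP_VALUES_SQL = """
SELECT LOWER(TRIM(anonymous_id)) AS standardized_value, COUNT(*) AS row_count
FROM events
WHERE anonymous_id IS NOT NULL
GROUP BY standardized_value
ORDER BY row_count DESC, standardized_value
LIMIT :row_limit
"""

# Placeholder examples taken from the list above; extend with your own findings.
PLACEHOLDERS = {
    "unknown",
    "undefined",
    "null",
    "none",
    "test",
    "00000000-0000-0000-0000-000000000000",
    "",
}


def top_values(conn, row_limit=20):
    """Return the most frequent standardized values with their row counts."""
    return conn.execute(TOP_VALUES_SQL, {"row_limit": row_limit}).fetchall()


def flag_placeholders(rows):
    """Keep only the (value, row_count) pairs that look like placeholders."""
    return [(value, count) for value, count in rows if value in PLACEHOLDERS]


def demo_values_connection():
    """Build a tiny in-memory table mixing real and placeholder values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (event_id TEXT, anonymous_id TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [
            ("e1", "Unknown"),
            ("e2", "unknown "),
            ("e3", "a1"),
            ("e4", "a2"),
            ("e5", ""),
            ("e6", "00000000-0000-0000-0000-000000000000"),
        ],
    )
    return conn


if __name__ == "__main__":
    rows = top_values(demo_values_connection())
    print(flag_placeholders(rows))
```

Values that rank high without matching a known placeholder are also worth a manual look, since a single value shared by a large share of rows is rarely a real identity.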
Add placeholder values to blocked values in the Identifier policy.
Shared and corrupted weak identifiers
Weak identifiers can create unsafe connections when one value appears across many strong identities.
Before trusting a weak identifier, check whether it behaves like a shared or corrupted value:
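A minimal sketch of such a check, assuming a hypothetical table events where anonymous_id is the weak identifier and user_id is the strong one. It lists weak values that appear with more than a chosen number of distinct strong identities. The max_profiles threshold here is an illustrative parameter, not a DinMo default, and Python's built-in sqlite3 module is used only so the example runs locally.

```python
import sqlite3

# Hypothetical table and column names; :max_profiles is a caller-chosen threshold.
SHARED_WEAK_VALUES_SQL = """
SELECT anonymous_id, COUNT(DISTINCT user_id) AS distinct_user_ids
FROM events
WHERE anonymous_id IS NOT NULL
  AND user_id IS NOT NULL
GROUP BY anonymous_id
HAVING COUNT(DISTINCT user_id) > :max_profiles
ORDER BY distinct_user_ids DESC, anonymous_id
"""


def shared_weak_values(conn, max_profiles):
    """Return weak values linked to more than max_profiles distinct user IDs."""
    return conn.execute(
        SHARED_WEAK_VALUES_SQL, {"max_profiles": max_profiles}
    ).fetchall()


def demo_shared_connection():
    """Build a tiny in-memory table where one anonymous ID spans three users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (event_id TEXT, user_id TEXT, anonymous_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [
            ("e1", "u1", "shared"),
            ("e2", "u2", "shared"),
            ("e3", "u3", "shared"),
            ("e4", "u1", "a1"),
            ("e5", "u1", "a1"),  # same user twice: still one distinct identity
            ("e6", "u2", "a2"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(shared_weak_values(demo_shared_connection(), max_profiles=2))
```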
If this query returns many high-count values, configure a lower Max profiles per value, shorten the stitching lifetime, or block known bad values.
Multiple event models
An Event Stitching project can process several event models from the same source.
For each selected model:
choose an event partition column
map available physical fields to logical identifiers
use the same logical identifier when two models carry the same type of value
leave unrelated fields unmapped
Example:
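| Model | Physical field | Logical identifier |
| --- | --- | --- |
| web_events | user_id | User ID |
| web_events | anonymous_id | Anonymous ID |
| conversions | customer_id | User ID |
| conversions | email_hash | Email hash |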
This lets DinMo evaluate events from several models as one event profile graph.
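To make the mapping concrete, the sketch below flattens the two example models into one list of identifier observations. It illustrates the idea only and is not DinMo's internal implementation: the primary key columns (event_id, conversion_id) and timestamp columns (event_timestamp, converted_at) are hypothetical, and the SQL runs through Python's built-in sqlite3 module so the example is self-contained.

```python
import sqlite3

# Each SELECT maps one physical field to its logical identifier.
# Primary key and timestamp column names are hypothetical.
OBSERVATIONS_SQL = """
SELECT 'web_events' AS source_model, event_id AS primary_key,
       event_timestamp, 'User ID' AS logical_identifier, user_id AS value
FROM web_events WHERE user_id IS NOT NULL
UNION ALL
SELECT 'web_events', event_id, event_timestamp, 'Anonymous ID', anonymous_id
FROM web_events WHERE anonymous_id IS NOT NULL
UNION ALL
SELECT 'conversions', conversion_id, converted_at, 'User ID', customer_id
FROM conversions WHERE customer_id IS NOT NULL
UNION ALL
SELECT 'conversions', conversion_id, converted_at, 'Email hash', email_hash
FROM conversions WHERE email_hash IS NOT NULL
ORDER BY event_timestamp, source_model, logical_identifier
"""


def identifier_observations(conn):
    """Return one row per (event, logical identifier) across both models."""
    return conn.execute(OBSERVATIONS_SQL).fetchall()


def demo_models_connection():
    """Build two tiny models that happen to reuse the same primary key value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE web_events "
        "(event_id TEXT, event_timestamp TEXT, user_id TEXT, anonymous_id TEXT)"
    )
    conn.execute(
        "CREATE TABLE conversions "
        "(conversion_id TEXT, converted_at TEXT, customer_id TEXT, email_hash TEXT)"
    )
    conn.execute(
        "INSERT INTO web_events VALUES ('1', '2024-05-01T10:00:00Z', 'u1', 'a1')"
    )
    conn.execute(
        "INSERT INTO conversions VALUES ('1', '2024-05-01T11:00:00Z', 'u1', 'h1')"
    )
    return conn


if __name__ == "__main__":
    for row in identifier_observations(demo_models_connection()):
        print(row)
```

Because user_id and customer_id share the logical identifier User ID, the value u1 appears in both models under the same name and can connect the web event to the conversion. Both rows use the primary key value 1, but pairing the key with the source model keeps them distinct.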
Readiness checklist
Before creating the project, confirm:
selected models are event-grain tables
each model has a stable primary key
model timestamp fields are populated and plausible
each model has a date or timestamp event partition column
the main identifiers have meaningful coverage
weak identifiers do not obviously create large shared clusters
placeholder values are known and can be blocked
source tables are partitioned or clustered in a way that supports efficient window scans
the output dataset or schema can be written by DinMo
Then continue with Create an Event Stitching project.