Getting started: Segments
A user segment is a collection of customers defined by some combination of behavioral conditions. For example, we may be interested in analyzing a segment of users having completed ArticleRead followed by PhotoUpload.
Let’s begin by writing a query for all users completed both events in the past 30 days, along with when they completed it:
set app_id=12345;
set lookback_period_in_days=-30;
SELECT localytics_device_id,
event_name,
client_time
FROM fact_events
WHERE app_id = $app_id
AND client_date <= dateadd(day ,$lookback_period_in_days, current_date())
AND (event_name = 'ArticleRead' OR event_name = 'PhotoUpload')
This will give:
| localytics_device_id | event_name | client_time |
|---|---|---|
| user2 | PhotoUpload | Sat, 11 Jan 2017 11:30:04 +0000 |
| user1 | ArticleRead | Sat, 14 Jan 2017 09:30:21 +0000 |
| user2 | ArticleRead | Sat, 15 Jan 2017 12:32:14 +0000 |
| user1 | PhotoUpload | Sat, 18 Jan 2017 20:30:12 +0000 |
| … | … |
Note that user1 completes PhotoUpload after ArticleRead, according to client_time. Let’s modify change our query to only retrieve localytics_device_id's:
set app_id=12345;
set lookback_period_in_days=-30;
SELECT localytics_device_id
FROM fact_events
WHERE app_id = $app_id
AND client_date <= dateadd(day ,$lookback_period_in_days, current_date())
AND (event_name = 'ArticleRead' OR event_name = 'PhotoUpload')
GROUP BY localytics_device_id
This query only gives us users who completed both events, but we’re only interested in users who did them in a particular order. That is, for each user, we want the timestamp for the latest PhotoUpload to be greater than the first time the user did ArticleRead. We can retrieve these timestamps using MAX and MIN functions:
set app_id=12345;
set lookback_period_in_days=-30;
SELECT localytics_device_id, MAX(
CASE
WHEN
event_name = 'PhotoUpload' THEN client_time
END
) as maximum_photo_upload_time,
MIN
(CASE
WHEN
event_name = 'ArticleRead' THEN client_time
END
) as minimum_article_read_time
FROM fact_events
WHERE app_id = $app_id
AND client_date <= dateadd(day ,$lookback_period_in_days, current_date())
AND (event_name = 'ArticleRead' OR event_name = 'PhotoUpload')
GROUP BY localytics_device_id
This gives:
| localytics_device_id | minimum_article_read_time | maximum_photo_upload_time |
|---|---|---|
| user1 | Sat, 14 Jan 2017 09:30:21 +0000 | Sat, 18 Jan 2017 20:30:12 +0000 |
| user2 | Sat, 15 Jan 2017 12:32:14 +0000 | Sat, 11 Jan 2017 11:30:04 +0000 |
| … | … |
Finally, we just need to filter the aggregated view only users who completed PhotoUpload after ArticleRead. All we need to is add to the HAVING clause
set app_id=12345;
set lookback_period_in_days=-30;
SELECT localytics_device_id, MAX(
CASE
WHEN
event_name = 'PhotoUpload' THEN client_time
END
) as maximum_photo_upload_time,
MIN
(CASE
WHEN
event_name = 'ArticleRead' THEN client_time
END
) as minimum_article_read_time
FROM fact_events
WHERE app_id = $app_id
AND client_date <= dateadd(day ,$lookback_period_in_days, current_date())
AND (event_name = 'ArticleRead' OR event_name = 'PhotoUpload')
GROUP BY localytics_device_id
HAVING maximum_photo_upload_time > minimum_article_read_time
Adding additional conditions
Adding an additional condition to your segment is easy. In our original example, maybe you only want to consider ArticleRead events wherecountry=us. You can add this condition as follows:
set app_id=12345;
set lookback_period_in_days=-30;
SELECT localytics_device_id, MAX(
CASE
WHEN
event_name = 'PhotoUpload' THEN client_time
END
) as maximum_photo_upload_time,
MIN
(CASE
WHEN
(event_name = 'ArticleRead' AND country = 'us') THEN client_time
END
) as minimum_article_read_time
FROM fact_events
WHERE app_id = 40957
AND client_date <= Dateadd(day ,-30, current_date())
AND (event_name = 'ArticleRead' OR (event_name = 'PhotoUpload' and country='us'))
GROUP BY localytics_device_id
HAVING maximum_photo_upload_time > minimum_article_read_time
Segment Template
Here is a general template to create user segments for a users who completed Event1 followed by Event2 with some conditions/filters <conditions>:
set app_id=12345;
set lookback_period_in_days=-30;
SELECT localytics_device_id,
FROM fact_events
WHERE app_id = $app_id
AND client_date <= dateadd(day ,$lookback_period_in_days, current_date())
AND ((event_name = 'Event1' AND <conditions>) OR (event_name = 'Event2' AND <conditions>))
GROUP BY localytics_device_id
HAVING (
CASE
WHEN
MAX(
CASE
WHEN
(event_name = 'Event2' AND <conditions>) THEN client_time
END
) > MIN(
CASE
WHEN
(event_name = 'Event1' AND <conditions>) THEN client_time
END
) THEN 1
ELSE 0
END
) = 1 -- Change to "0" if you want to answer "Which users complete Event1 and then never complete Event2?"