The Redshift Queries Playbook
SQL queries for understanding user behavior.
What's in the playbook?
The playbook provides SQL queries for answering questions like:
What were the last 3 actions users did before converting
Who are my top paying users
Which users did event x within 24 hours of doing event Y
Table Of Contents
Table of contents
0. Schema description
1. Active Users
Count the active users on a given day
2. New Users
Count the new users on a given day
Show the breakdown of devices for users in a two week period
Show the distribution of session lengths on a specific date
Show the average session length per segment
Show the distribution of event property totals
Count the number of users who did an event more than twice
Count the number of events done by a specific set of users who did another event
Show the distribution of users who have done an event by number of times done
Find out the last three events a user does before churning
Obtain a list of users for each step of a funnel
Adding steps to a funnel
Getting the list of users who did (or did not) reach a step in a funnel
Funnels where users did event X, then Y, with no other events in between
Funnels where users did event Y after event X, within 24 hour hours of event X
Obtain the number of paying users and total revenue
Obtain a list of top paying users
8. User Properties
Obtain the most common values for a given user property
Obtain a list of users who have certain properties
Obtain the most common advertising referral networks for users
Obtain the number of users whose current level is greater than 7 but less than 10
9. Event properties
Obtain the list of items bought and how frequently that item was purchased
Obtain the number of users who placed a bet and wagered between 100-500 credits
First 3 Pages
SQL queries for understanding user behavior
Updated June 23, 2015
This playbook shows you how you can use Amplitude's Amazon Redshift database to
answer common questions about user behavior in your app.
All queries are written in the PostgreSQL syntax used by Redshift and can be executed directly from the Redshift prompt. You can use Amazon’s Redshift documentation to
help you understand the supported functions.
Quick and Essential Tips
1. Data for respective apps will be kept in their own schemas (namespaces/packages in Redshift)
By Default, every Redshift command you run will be under the public
schema. However, you can select which schema you want to work under
instead. You can do this by using the SET search_path command.
SET search_path = app123;
SELECT COUNT(*) FROM events;
Or you can include the schema as a prefix to the table
SELECT COUNT(*) FROM app123.events;
2. Query directly from each app's table instead of the entire events table when possible.
The events from each of your Amplitude apps are stored in their own tables. The table name for each app is 'events###' where the ### is the app number, which you can find in the URL of the Amplitude dashboard. The union of each app's events appears in a table called 'events'. Selecting FROM events### when possible will make your queries faster and more efficient.
3. Custom event properties and custom user properties associated
with an event_type will be pulled into their own columns in the
respective event_type table .
Custom user-properties will be appended by a ‘u_' and custom event-
properties will be appended by a 'e_'.
Note - There is a a limit of 400 user properties and 50 event properties
that will be pulled into their own columns. Anything past the limit will still
require the JSON_EXTRACT_PATH_TEXT Function.