Free ebook: Redshift Queries Playbook

The Redshift Queries Playbook SQL queries for understanding user behavior. What's in the playbook? The playbook provides SQL queries for answering questions like: What were the last 3 actions users did before converting Who are my top paying users Which users did event x within 24 hours of doing event Y

Table Of Contents

Table of contents 0. Schema description 1. Active Users Count the active users on a given day 2. New Users Count the new users on a given day 3. Composition Show the breakdown of devices for users in a two week period 4. Sessions Show the distribution of session lengths on a specific date Show the average session length per segment 5. Events Show the distribution of event property totals Count the number of users who did an event more than twice Count the number of events done by a specific set of users who did another event Show the distribution of users who have done an event by number of times done Find out the last three events a user does before churning 6. Funnels Obtain a list of users for each step of a funnel Adding steps to a funnel Getting the list of users who did (or did not) reach a step in a funnel Funnels where users did event X, then Y, with no other events in between Funnels where users did event Y after event X, within 24 hour hours of event X 7. Revenue Obtain the number of paying users and total revenue Obtain a list of top paying users 8. User Properties Obtain the most common values for a given user property Obtain a list of users who have certain properties Obtain the most common advertising referral networks for users Obtain the number of users whose current level is greater than 7 but less than 10 9. Event properties Obtain the list of items bought and how frequently that item was purchased Obtain the number of users who placed a bet and wagered between 100-500 credits Additional Resources

First 3 Pages

Redshift Queries Playbook SQL queries for understanding user behavior Updated June 23, 2015 This playbook shows you how you can use Amplitude's Amazon Redshift database to answer common questions about user behavior in your app. All queries are written in the PostgreSQL syntax used by Redshift and can be executed directly from the Redshift prompt. You can use Amazon’s Redshift documentation to help you understand the supported functions. Quick and Essential Tips 1. Data for respective apps will be kept in their own schemas (namespaces/packages in Redshift) By Default, every Redshift command you run will be under the public schema. However, you can select which schema you want to work under instead. You can do this by using the SET search_path command. SET search_path = app123; SELECT COUNT(*) FROM events; Or you can include the schema as a prefix to the table SELECT COUNT(*) FROM app123.events; 2. Query directly from each app's table instead of the entire events table when possible. The events from each of your Amplitude apps are stored in their own tables. The table name for each app is 'events###' where the ### is the app number, which you can find in the URL of the Amplitude dashboard. The union of each app's events appears in a table called 'events'. Selecting FROM events### when possible will make your queries faster and more efficient. 3. Custom event properties and custom user properties associated with an event_type will be pulled into their own columns in the respective event_type table . Custom user-properties will be appended by a ‘u_' and custom event- properties will be appended by a 'e_'. Note - There is a a limit of 400 user properties and 50 event properties that will be pulled into their own columns. Anything past the limit will still require the JSON_EXTRACT_PATH_TEXT Function.