SQL Data Explorer
Use SQL Data Explorer to write and execute read-only SQL queries on your data, plot the results into different types of visualizations, and add those to a dashboard. You can gain insights that might not be surfaced by other Analytics products. See our collection of 'recipes' in the SQL cookbook. Queries run against Snowflake, so they use the Snowflake SQL dialect. Note that non-SELECT statements and $SYSTEM commands have limited functionality.
You can toggle between a bar chart, line graph, area graph, pie chart, and stacked bar chart.
SQL is a programming language used to communicate with databases. A basic understanding of writing SQL queries is required to use SQL Data Explorer.
The following code shows the default SQL query when you use SQL Data Explorer (but you can provide your own):
```sql
SELECT EVENT_DATE, count(distinct USER_ID)
FROM EVENTS
WHERE EVENT_DATE > CURRENT_DATE-7
GROUP BY EVENT_DATE
ORDER BY EVENT_DATE DESC
```
This query counts the number of distinct users who have played your game in the last seven days.
After running a query, set up your chart. Currently, two Y axes and one X axis are supported. Choose your column and unit for each. To rename an axis label, use an AS expression in your SQL query; the column alias becomes the axis name.
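For example, aliasing the aggregate column in the default query relabels the Y axis (the alias text here is illustrative):

```sql
SELECT EVENT_DATE,
       count(distinct USER_ID) AS "Daily active users"  -- the alias becomes the Y axis label
FROM EVENTS
WHERE EVENT_DATE > CURRENT_DATE-7
GROUP BY EVENT_DATE
ORDER BY EVENT_DATE DESC
```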
Set up your chart for the X and Y axes.
Choose from three chart styles: line, bar, and area.
Switching graph style: bar chart.
If you run a query without ordering the data, your charts might not accurately represent your data because it isn't sorted. A warning is displayed in this case; it's recommended to order your query with the ORDER BY keyword.
Under your chart is a table of results, showing the columns and result of your query.
Table of results.
The glossary panel provides a list of parameters you can use in your query. Use the clipboard to copy parameters. The glossary terms are pulled from two sources: the database column names (such as EVENT_DATE) and custom parameters parsed from the event json (see Snowflake documentation on how to query JSON data). You can define custom parameters depending on the events you send. Some parameters are fields specific to Snowflake, which can change the syntax in the SQL.
Database column names are more performant for querying than the parameters parsed from JSON and should be preferred when possible.
If a parameter is represented by a column (such as USER_COUNTRY), the query uses it as:
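A minimal sketch, using an illustrative filter value:

```sql
-- USER_COUNTRY is a database column, so it's referenced directly
SELECT USER_COUNTRY, count(distinct USER_ID)
FROM EVENTS
WHERE USER_COUNTRY = 'GB'  -- 'GB' is a hypothetical example value
GROUP BY USER_COUNTRY
```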
If instead the parameter comes from the event itself (one named “missionName”, for example), it’s accessed via:
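Following the Snowflake semi-structured data syntax used elsewhere in this guide:

```sql
-- Parse the parameter out of the EVENT_JSON column and cast it to a SQL type
SELECT EVENT_JSON:missionName::STRING AS missionName
FROM EVENTS
```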
Use Event Manager to create custom events to use in queries as columns.
Use Reports to save your queries. Reports provides a list of all queries so you can reload and use them later.
Use the Environments selector to switch between Unity Environments.
In this example, we look at mission failure rates, making use of custom events that track players' engagement with missions via the missionID parameter.
For this query, we'll be using the default EVENTS table. This table includes granular data for every event recorded in our game.
```sql
SELECT EVENT_JSON:missionID::INTEGER AS missionID
     , count(distinct case when event_name = 'missionStarted' then user_id else null end) AS starts
     , round((count(distinct case when event_name = 'missionFailed' then user_id else null end) / starts)*100) AS "Players failed %"
FROM EVENTS
WHERE missionID <= 10 -- Focus on the first few levels
  AND event_name in ('missionStarted','missionFailed')
  AND event_date > current_date-30 -- Use a date filter for efficiency
GROUP BY missionID
ORDER BY missionID
```
Note that we used a date filter (event_date > current_date-30) here to limit our query; otherwise it would run over the full 365 days of data that's queryable by default in SQL Data Explorer.
Use the results to plot a graph:
Note that the Y axis is the name we defined: "Players failed %".
As your game becomes increasingly successful and your player base grows, you might find that simple queries can take significant time to run.
See the following basic query:
```sql
SELECT count(distinct event_name, user_id)/count(distinct user_id) AS "avg number of event types per user"
FROM EVENTS
WHERE event_date > current_date-7
```
With a large dataset, this might take a while to run. Because user_id values can be hashed into evenly distributed buckets, you can rewrite the query as follows:
```sql
SELECT count(distinct event_name, user_id)/count(distinct user_id) AS "avg number of event types per user"
FROM EVENTS
WHERE event_date > current_date-7
  AND mod(hash(user_id),100) = 63
```
Here, we’re splitting our users into 100 pseudo-randomly assigned and numbered buckets, looking at bucket number 63. As we increase computational complexity, filtering data in this way becomes more useful, saving time.
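The bucket count controls the sample size. A minimal sketch of a 10% sample (the bucket number is arbitrary); note that absolute counts need scaling back up, while ratios like the one above don't:

```sql
SELECT count(distinct user_id) * 10 AS "estimated weekly users"  -- scale the 10% sample back up
FROM EVENTS
WHERE event_date > current_date-7
  AND mod(hash(user_id),10) = 3  -- keep one of ten pseudo-random buckets
```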
A simple way to improve query speed, at the cost of exact accuracy, is Snowflake's APPROX_COUNT_DISTINCT function. Our previous query would become:

```sql
SELECT APPROX_COUNT_DISTINCT(event_name, user_id) / APPROX_COUNT_DISTINCT(user_id) AS "avg number of event types per user"
FROM EVENTS
WHERE event_date > current_date-7
  AND mod(hash(user_id),100) = 63
```
We can also improve query speed by querying smaller, pre-aggregated tables.
Look at the Glossary to see what tables are available to query. These are available out-of-the-box with UGS:
- USERS: holds a single row per player alongside their lifetime metrics in the game.
- FACT_USER_SESSIONS_DAY: includes data on each session for each player.
- FACT_EVENT_TYPE_USERS_DAY: consists of a row for each event that a player has sent each day, along with a total count.
- FACT_WAU_USERS and FACT_MAU_USERS: include profile data for users who have played within the previous week or month, respectively, on a given day.
Between FACT_EVENT_TYPE_USERS_DAY and FACT_USER_SESSIONS_DAY, you can answer 80% or more of common questions using these smaller objects.
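For instance, a daily active users count needs only columns this guide already uses (event_date, user_id), so it can run against the smaller aggregate table rather than EVENTS:

```sql
SELECT event_date,
       count(distinct user_id) AS "Daily active users"
FROM FACT_EVENT_TYPE_USERS_DAY  -- far fewer rows than EVENTS
WHERE event_date > current_date-30
GROUP BY event_date
ORDER BY event_date
```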
For example, in our first query, we were looking at mission fail rates. We could also use FACT_EVENT_TYPE_USERS_DAY to calculate overall failure rates each day, using the NUMBER_OF_EVENTS count stored in this table.
```sql
SELECT event_date
     , round((SUM(case when event_name='missionFailed' then number_of_events else null end)
            / SUM(case when event_name='missionStarted' then number_of_events else null end))*100) AS overallFailRate
FROM FACT_EVENT_TYPE_USERS_DAY
WHERE event_name in ('missionStarted','missionFailed')
  AND event_date > current_date-30
GROUP BY event_date
ORDER BY event_date
```
Use the wizard to pivot, edit, and unpivot data.
Use this query to see the event stream for players that meet specific criteria. It's useful for QA and debugging; because the subquery against the USERS table selects one user without ordering, you'll get a different user each time you run it.
If, for example, you suspect that events aren’t being recorded correctly for players who installed a certain version of your game, you can run the following:
```sql
SELECT event_timestamp
     , event_name
     , EVENT_JSON:sessionID::STRING
FROM EVENTS
WHERE client_version LIKE '0.0%'
  AND user_id IN (
        SELECT user_id
        FROM USERS
        WHERE start_date BETWEEN '2022-07-01' AND '2022-07-08'
        LIMIT 1 -- pull out just one user
  )
ORDER BY event_timestamp
```
SQL Data Explorer queryable tables
Through SQL Data Explorer, you have access to the following tables.
EVENTS: This table lists all of the events that have been sent in from your application. Event-specific parameters are stored in the EVENT_JSON column as a JSON object, so you'll need to parse the content in order to query it. Examples of parsing your parameters can be seen in the samples included in this repository. You can also retrieve the syntax to parse a parameter by typing in the parameter name and pressing TAB to auto-complete the syntax. This table is updated every 1-2 hours.
USERS: This table lists all the users who have sent an event in the past. For each user, the columns contain a variety of useful metrics. The metrics column, specifically, contains metrics calculated using SQL analytics functions. This table is updated every 3-4 hours.
FACT_USER_SESSIONS_DAY: This table has a record for each user session. Each record contains a variety of user-level aggregate KPIs for that session. If any of the non-aggregate dimensions recorded in this table change during a session (for example, AGE_GROUP or GENDER), a new record is created for that session. This table is updated every 1-2 hours.
FACT_EVENT_TYPE_USERS_DAY: This table lists all of the events a user has recorded in one day and how many times they have sent each of those events. This table is updated every 1-2 hours.
FACT_WAU_USERS: This table lists all of the users that have sent an event in the last seven days. This table is updated every 1-2 hours.
FACT_MAU_USERS: This table lists all of the users that have sent an event in the last 30 days. This table is updated every 1-2 hours.
For examples of SQL queries that can be written to access data via the above tables, see the SQL Data Explorer Cookbook.