Documentation

Support

Analytics

Analytics

Optimize SQL Data Explorer queries

Improve your SQL Data Explorer queries to optimize performance and reduce costs.
Read time 1 minuteLast updated 19 hours ago

As your game becomes increasingly successful and your player base grows, you might find that even simple SQL Data Explorer queries can take significant time to run. See the following basic query:
SELECT count(distinct event_name, user_id)/count(distinct user_id) as "avg number of event types per user"FROM EVENTSWHERE event_date>current_date-7

Trade query accuracy for speed by using (mod)hash

With a large dataset, this might take a while to run. However, since
user_id
s are stored as a hash, you can rewrite your code as follows:
SELECT count(distinct event_name, user_id)/count(distinct user_id) as "avg number of event types per user"FROM EVENTSWHERE event_date>current_date-7AND mod(hash(user_id),100) = 63
Here, we’re splitting our users into 100 pseudo-randomly assigned and numbered buckets, looking at bucket number 63. As we increase computational complexity, filtering data in this way becomes more useful, saving time.

Trade query accuracy for speed by rounding

A simple way to improve query speed without 100% accuracy is to use
approximate_count_distinct
. Our previous query would become:
SELECT approximate_count_distinct(event_name, user_id)/ approximate_count_distinct(user_id) as "avg number of event types per user"FROM EVENTSWHERE mod(hash(user_id),100) = 63

Use smaller objects

We can improve many queries by using smaller objects. Look at the SQL Data Explorer tables page to see what tables are available to query. These are available out-of-the-box with UGS:
  • EVENTS
  • USERS
    : holds a single row per player alongside their lifetime metrics in the game.
  • FACT_USER_SESSIONS_DAY
    : includes data on each session for each player.
  • FACT_EVENT_TYPE_USERS_DAY
    : consists of a row for each event that a player has sent each day, along with a total count.
  • FACT_WAU_USERS
    and
    FACT_MAU_USERS
    : include profile data for users who have played within the previous week or month on a given day.
Between
FACT_EVENT_TYPE_USERS_DAY
and
FACT_USER_SESSIONS_DAY
, you can probably answer 80%+ of most queries on smaller objects.
For example, to track mission fail rates, we could use the
FACT_EVENT_TYPE_USERS_DAY
to calculate overall failure rates each day, with the
NUMBER_OF_EVENTS
count stored in this table.
SELECT event_date, round((SUM(case when event_name='missionFailed' then number_of_events else null end) / SUM(case when event_name='missionStarted' then number_of_events else null end))*100) as overallFailRateFROM FACT_EVENT_TYPE_USERS_DAYWHERE event_name in ('missionStarted','missionFailed')AND event_date>current_date-30GROUP BY event_dateORDER BY event_date