Optimize SQL Data Explorer queries
Improve your SQL Data Explorer queries to optimize performance and reduce costs.
Read time 1 minuteLast updated 19 hours ago
As your game becomes increasingly successful and your player base grows, you might find that even simple SQL Data Explorer queries can take significant time to run. See the following basic query:
SELECT count(distinct event_name, user_id)/count(distinct user_id) as "avg number of event types per user"FROM EVENTSWHERE event_date>current_date-7
Trade query accuracy for speed by using (mod)hash
With a large dataset, this might take a while to run. However, sinceuser_idHere, we’re splitting our users into 100 pseudo-randomly assigned and numbered buckets, looking at bucket number 63. As we increase computational complexity, filtering data in this way becomes more useful, saving time.SELECT count(distinct event_name, user_id)/count(distinct user_id) as "avg number of event types per user"FROM EVENTSWHERE event_date>current_date-7AND mod(hash(user_id),100) = 63
Trade query accuracy for speed by rounding
A simple way to improve query speed without 100% accuracy is to useapproximate_count_distinctSELECT approximate_count_distinct(event_name, user_id)/ approximate_count_distinct(user_id) as "avg number of event types per user"FROM EVENTSWHERE mod(hash(user_id),100) = 63
Use smaller objects
We can improve many queries by using smaller objects. Look at the SQL Data Explorer tables page to see what tables are available to query. These are available out-of-the-box with UGS:EVENTS- : holds a single row per player alongside their lifetime metrics in the game.
USERS - : includes data on each session for each player.
FACT_USER_SESSIONS_DAY - : consists of a row for each event that a player has sent each day, along with a total count.
FACT_EVENT_TYPE_USERS_DAY - and
FACT_WAU_USERS: include profile data for users who have played within the previous week or month on a given day.FACT_MAU_USERS
FACT_EVENT_TYPE_USERS_DAYFACT_USER_SESSIONS_DAYFACT_EVENT_TYPE_USERS_DAYNUMBER_OF_EVENTSSELECT event_date, round((SUM(case when event_name='missionFailed' then number_of_events else null end) / SUM(case when event_name='missionStarted' then number_of_events else null end))*100) as overallFailRateFROM FACT_EVENT_TYPE_USERS_DAYWHERE event_name in ('missionStarted','missionFailed')AND event_date>current_date-30GROUP BY event_dateORDER BY event_date