QoS protocol
Documentation for QoS protocol
Multiplay Hosting provides a quality of service (QoS) protocol that lets a client determine dynamically which available regions are likely to give the best connection quality for its online session. This document provides an overview of the QoS protocol components, how to use each component, and the best practices to consider when writing a custom implementation of the client.
The QoS protocol consists of the following main components:
- The Discovery service allows the client to determine at runtime the regions that are currently active to test for connection quality.
- The QoS server allows the client to test for connection quality to each of the available regions.
Discovery service
The Discovery service provides a way for the client to determine at runtime the regions that are currently available to host or join an online session. It does this by exposing a REST endpoint that returns the currently available QoS servers for a given Multiplay Hosting fleet. The request endpoint supports a GET request at the following URL:
Environment | URL |
---|---|
Production |
|
[fleet_id]
Response types
The possible HTTP response types from the request are listed in the following sections.
200 OK
The request was successful, but the generated ETag did not match the value in the If-None-Match header.
- Response Content Type: application/JSON
- Response Body: List of all QoS servers in fleet regions associated with the provided fleet in a JSON array called “servers”
Property | Type | Value |
---|---|---|
location_id | Integer | Multiplay Hosting location ID |
region_id | String | Multiplay Hosting region ID |
ipv4 | String | QoS server IPv4 address in dotted-quad format (empty string if not present) |
ipv6 | String | QoS server IPv6 address in RFC 4291 colon-delimited format (empty string if not present) |
port | Integer | Port that the IPv4 and/or IPv6 server is listening to for requests. |
{
  "servers": [
    {
      "location_id": 123,
      "region_id": "4f7d1d1a-a565-40b4-955a-ff0257d7ed3b",
      "ipv4": "1.1.1.1",
      "ipv6": "",
      "port": 9000
    },
    {
      "location_id": 456,
      "region_id": "22bf10c2-2565-4e75-848f-d2df25210896",
      "ipv4": "",
      "ipv6": "2606:4700:4700::1001",
      "port": 9000
    }
  ]
}
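A 200 OK body like the one above can be turned into typed records before any QoS checks are made. The following is a minimal Python sketch; the `QosServer` class and `parse_servers` function are illustrative names, not part of the Multiplay Hosting API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class QosServer:
    """One entry of the "servers" array (class name is hypothetical)."""
    location_id: int
    region_id: str
    ipv4: str  # empty string if the server has no IPv4 address
    ipv6: str  # empty string if the server has no IPv6 address
    port: int


def parse_servers(body: str) -> list[QosServer]:
    """Turn a 200 OK Discovery response body into QosServer records."""
    payload = json.loads(body)
    return [
        QosServer(
            location_id=int(entry["location_id"]),
            region_id=str(entry["region_id"]),
            ipv4=entry.get("ipv4", ""),
            ipv6=entry.get("ipv6", ""),
            port=int(entry["port"]),
        )
        for entry in payload.get("servers", [])
    ]
```

Keeping the empty-string convention for a missing address makes it easy to filter by IP version later.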
304 Not Modified
The request was successful, and the generated or cached ETag matches the If-None-Match header. Refer to the ETag support documentation.
- Response Type: N/A
- Response Body: None
403 Forbidden
The request was not authenticated.
- Response Type: text/plain
- Response Body: “access denied for REMOTE_ADDR” (where REMOTE_ADDR is the network address of the caller)
404 Not Found
The given fleet_id does not exist.
- Response Content Type: application/JSON
- Response Body: JSON object with details about the error
Property | Type | Value |
---|---|---|
success | Boolean | Legacy: indicates whether the operation succeeded (always false) |
error | Boolean | Legacy: indicates whether the operation had an error (always true) |
error_code | Integer | Multiplay Hosting-defined error code (generally -1) |
error_message | String | Information on the reason for the error |
messages | Array | Multiplay Hosting-defined array of status messages (none currently defined) |
{ "success": false, "error": true, "error_code": -1, "error_message": "fleet does not exist", "messages": [] }
500 Internal Server Error
A server-side error has occurred.
- Response Content Type: application/JSON
- Response Body: Similar to the response for 404 Not Found, but with an error_message that contains information about the reason for the internal error.
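The error responses above differ in content type: 403 is plain text, while 404 and 500 carry a JSON object. A minimal Python sketch of a helper that reduces any of them to the status code plus a human-readable reason might look like this (`describe_error` is a hypothetical name):

```python
import json


def describe_error(status: int, body: str) -> str:
    """Build a log-friendly message from a 4xx/5xx Discovery response."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        # 403 responses are text/plain, so the body is used verbatim.
        parsed = None
    if isinstance(parsed, dict) and isinstance(parsed.get("error_message"), str):
        # Only error_message is read; the legacy fields are ignored.
        detail = parsed["error_message"]
    else:
        detail = body.strip()
    return f"HTTP {status}: {detail}"
```

The legacy `success`, `error`, `error_code`, and `messages` fields are deliberately not inspected.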
ETag support
A successful (2xx) Discovery service response includes a standard HTTP entity tag in the ETag response header. If this tag is provided in a subsequent request in the If-None-Match header and it still matches, the service returns 304 Not Modified with no response body.
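A conditional request can be sketched in Python with the standard library as follows. The URL is passed in by the caller because the endpoint is not reproduced here, and the `cache` dictionary with its `etag` and `body` keys is an assumption made for illustration.

```python
import urllib.error
import urllib.request
from typing import Optional


def build_discovery_request(url: str, etag: Optional[str]) -> urllib.request.Request:
    """Create a GET request, adding If-None-Match when an ETag is cached."""
    request = urllib.request.Request(url, method="GET")
    if etag:
        request.add_header("If-None-Match", etag)
    return request


def fetch_servers(url: str, cache: dict) -> str:
    """Return the server list body, reusing the cached copy on 304."""
    request = build_discovery_request(url, cache.get("etag"))
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            # 200 OK: remember the new ETag and body for the next call.
            cache["etag"] = response.headers.get("ETag")
            cache["body"] = response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        if error.code != 304:
            raise
        # 304 Not Modified: the cached body is still current.
    return cache["body"]
```

`urllib` reports 304 as an `HTTPError`, which is why the cached body is returned from the exception path rather than from the normal one.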
Discovery best practices
- Contact the Discovery service once after the game starts, before the first QoS check. Cache the results locally, and use the provided ETag in subsequent requests whenever possible. This process can improve response times, reduce bytes over the wire, and reduce the load on the server when no changes are made to the servers list.
- Leave at least 20 minutes between Discovery requests.
- There are no duplicate entries in the “servers” array in the response. However, note that the same QoS server IP might serve multiple overlapping regions. Accordingly, you can isolate each unique server to be contacted only once, and then use the results for each region that uses that server. This process can reduce the amount of time that is spent sending requests and waiting for responses.
- If the game client only supports IPv4 connections, contact the IPv4 QoS servers. Conversely, if the game client only supports IPv6, contact the IPv6 QoS servers. If the game client is IP version-agnostic, then you can contact either or both versions of the QoS servers.
- The 4xx and 5xx responses contain legacy properties in the JSON object that should not be relied upon. Instead, only focus on the HTTP status code and the error_message field.
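The deduplication and IP-version advice in the list above can be sketched in Python as follows. The function name is hypothetical, and each server is assumed to be a dictionary shaped like an entry of the "servers" array.

```python
from collections import defaultdict


def group_regions_by_endpoint(servers, use_ipv6=False):
    """Map each unique (address, port) to the region IDs it serves."""
    field = "ipv6" if use_ipv6 else "ipv4"
    endpoints = defaultdict(list)
    for server in servers:
        address = server[field]
        if not address:
            continue  # this server has no address for the chosen IP version
        endpoints[(address, server["port"])].append(server["region_id"])
    return dict(endpoints)
```

Each key is then contacted once, and the measured result is applied to every region ID in its list.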
QoS server
Use the QoS server to determine connection quality to a specific dedicated server region. Connection quality is defined as a combination of network latency and packet loss.
The QoS server uses a very simple UDP protocol. Developers define almost all of the data that is included with the request, and the response includes an exact copy of the data that was sent. For example, you can include timestamps, sequence counters, and unique identifiers in the request so that you can compute latency and packet loss and detect duplicate packets from the responses.
Send the QoS request directly to a QoS server identified by the Discovery service, at the provided IP address and port. No authentication is currently required to send QoS request packets. Any valid request should generate a response, assuming that the server has capacity, that the client is not banned, and that neither the request nor the response is lost in transit.
QoS Request
The QoS Request packet is sent from the client to the QoS server. The payload (after the IP and UDP headers) is defined as shown in the following table:
Name | Size | Value | Notes |
---|---|---|---|
Type | 1 byte | 0x59 | Magic value that identifies the packet as a valid QoS Request Packet. |
VerAndFlow | 1 byte | [0x00-0xF0] | Upper 4 bits are reserved for the Version. Version starts at zero and increments by one for each version of the packet format (allows for a maximum of 16 versions). The packet format documented here is version
The lower 4 bits are reserved for the flow control. For a QoS Request, the flow control must always be set to
|
Title | varies | varies | The title of the game requesting QoS. The first byte of the title is the length of the title block, including the length byte. So the title “A” would have a length byte of 2: one for the length, and one for the letter A. The title itself is an array of UTF-8-encoded bytes with no NULL termination. For example, the title ワオ would be encoded as [0x07, 0xe3, 0x83, 0xaf, 0xe3, 0x82, 0xaa], where 0x07 is the length. |
Custom | varies | varies | This is custom data that is echoed back to the client in a response. |
Name | Size | Value | Notes |
---|---|---|---|
Sequence | 1 byte | [0x00-0xFF] | This value monotonically increases from 0 for each QoS request packet that is generated in a particular QoS check session. A session covers all of the packets that are issued for a single QoS (latency + packet loss) check with the same identifier. |
Identifier | 2 bytes | varies | A unique value that is used for the duration of the QoS check. Each request packet in a session uses the same identifier. |
Timestamp | 8 bytes | varies | The number of milliseconds since the epoch on the client at the time the packet was crafted. It is echoed in the response so the client can determine end-to-end latency to roughly millisecond accuracy. |
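Putting the two tables together, a request payload could be assembled as in the following Python sketch. Two values are assumptions: the version is taken to be 0, and the request flow control is taken to be 0 (inferred from the [0x00-0xF0] value range). The big-endian layout of the custom fields is a client-side choice, since the server only echoes those bytes.

```python
import struct
import time

REQUEST_MAGIC = 0x59
ASSUMED_VERSION = 0  # assumption: the documented packet format is version 0
CUSTOM_LAYOUT = ">BHQ"  # sequence (1 byte), identifier (2), timestamp in ms (8)


def encode_title(title: str) -> bytes:
    """Prefix the UTF-8 title with a length byte that counts itself."""
    data = title.encode("utf-8")
    if len(data) > 254:
        raise ValueError("title must fit in 254 UTF-8 bytes")
    return bytes([len(data) + 1]) + data


def build_request(title, sequence, identifier, timestamp_ms=None, padding=b""):
    """Assemble a QoS Request payload; padding is optional extra custom data."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    # Upper 4 bits: version. Lower 4 bits: flow control, assumed 0 in requests.
    ver_and_flow = (ASSUMED_VERSION << 4) | 0x0
    custom = struct.pack(CUSTOM_LAYOUT, sequence & 0xFF, identifier & 0xFFFF, timestamp_ms)
    return bytes([REQUEST_MAGIC, ver_and_flow]) + encode_title(title) + custom + padding
```

For example, `encode_title("ワオ")` yields the seven bytes given in the Title row, and `build_request("A", 0, 0x1234)` produces a 15-byte payload.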
QoS Response
The QoS Response packet is a mostly byte-for-byte copy of the payload in the QoS Request packet, with the magic value set to the QoS Response packet type, the title removed, and any flow control data set. Because the data in the QoS packet is only useful to the client that sent the request, echoing the received data back allows the client to compute the overall QoS benchmark. The payload (not including the IP and UDP headers) is between 2 and 1500 bytes long.
Name | Size | Value | Notes |
---|---|---|---|
Type | 1 byte | 0x95 | Magic value that identifies the packet as a valid QoS Response Packet. |
VerAndFlow | 1 byte | [0x00-0xFF] | Upper 4 bits are reserved for the Version. Version starts at zero and increments by one for each version of the packet format (allows for a maximum of 16 versions). The packet format documented here is version
The lower 4 bits are reserved for the flow control.
|
Custom | varies | varies | Custom data from the request that is echoed back to the client in the response. |
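A response can be decoded with the same layout the client used for its custom data. The following Python sketch assumes the client sent a 1-byte sequence, a 2-byte identifier, and an 8-byte timestamp in big-endian order; that layout is a client-side choice, not something the server dictates.

```python
import struct

RESPONSE_MAGIC = 0x95
CUSTOM_LAYOUT = ">BHQ"  # must match whatever the client packed into the request
CUSTOM_SIZE = struct.calcsize(CUSTOM_LAYOUT)


def parse_response(payload: bytes):
    """Return (version, flow_control, sequence, identifier, timestamp_ms)."""
    if len(payload) < 2 or payload[0] != RESPONSE_MAGIC:
        raise ValueError("not a QoS response packet")
    version = payload[1] >> 4         # upper 4 bits
    flow_control = payload[1] & 0x0F  # lower 4 bits
    custom = payload[2:]              # the title is not echoed back
    if len(custom) < CUSTOM_SIZE:
        raise ValueError("custom data is shorter than the layout that was sent")
    sequence, identifier, timestamp_ms = struct.unpack(CUSTOM_LAYOUT, custom[:CUSTOM_SIZE])
    return version, flow_control, sequence, identifier, timestamp_ms
```

Any padding appended after the three fields is ignored by the slice.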
Flow control
Flow control is defined as instructions from the server to the client to voluntarily back off from sending requests, or to inform the client that it has temporarily been restricted from receiving responses.
The QoS protocol was designed to never require the developer to manage the byte order of data. Header data is all byte-based, and custom data is echoed in the same byte order in which it was sent. Accordingly, instead of using a potentially multi-byte value for the amount of time to back off, the contract is that each unit of flow control represents 2 minutes of time, and the server indicates how many units to apply. The client should pad this amount with a reasonable buffer of time (for example, 15-30 seconds) to account for latency and processing time on the server.
A voluntary back-off is the server asking the client to stop sending requests for a certain amount of time. The server still responds to the client during a voluntary back-off. However, continuing to send requests to a server that has asked for a voluntary back-off might result in the client being temporarily banned. Note that the voluntary back-off is currently unused, so all non-zero flow control is of the 1nnnb variety, which indicates a temporary ban for the client.
During a ban, all QoS requests from the banned client go unanswered. The ban could be enforced because the server thinks the client is sending too many packets in a short amount of time (refer to QoS check best practices), or because the server has exceeded capacity and is enforcing an algorithm to reduce usage to get back under capacity. In the latter scenario, the client was chosen at random to be turned away for a time. Note that the ban is server-specific, and being temporarily banned from one QoS server does not prevent the client from contacting other QoS servers.
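The flow control nibble can be converted into a wait time as in the following Python sketch. It rests on an interpretation of the 1nnnb notation that should be verified against a real server: the high bit of the nibble is read as the ban flag, and the lower three bits as the number of 2-minute units.

```python
SECONDS_PER_UNIT = 120  # each unit of flow control represents 2 minutes


def decode_flow_control(ver_and_flow: int, pad_seconds: int = 30) -> tuple[bool, int]:
    """Return (banned, seconds_to_wait) for a response's VerAndFlow byte."""
    flow = ver_and_flow & 0x0F  # lower 4 bits carry the flow control
    if flow == 0:
        return False, 0  # no back-off requested
    banned = bool(flow & 0x8)  # assumption: 1nnnb means the high bit marks a ban
    units = flow & 0x7         # assumption: nnn is the unit count
    return banned, units * SECONDS_PER_UNIT + pad_seconds
```

With these assumptions, a VerAndFlow byte of 0x0A decodes to a ban of two units, so the client would wait 270 seconds including the default 30-second pad before contacting that server again.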
QoS check best practices
- A single QoS check should involve sending between 10 and 20 requests to a QoS server and then waiting for responses. The requests can be batched and sent in succession without first waiting for a response. The time to wait for responses depends on how latency-tolerant the game is and how much time is dedicated to checking QoS in the game flow.
- Instead of checking one server at a time, developers can send requests to several servers at once and then wait for all responses from all servers. Note that contacting too many servers simultaneously might introduce latency when waiting for the responses to be read off of the sockets.
- While the maximum packet length of a request is 1500 bytes (plus the IP and UDP headers), it is generally not advisable to send such large UDP packets. A UDP frame that is subjected to fragmentation and reassembly has a much higher chance of being discarded by intermediate routers along the path.
- Consider padding the QoS request with enough data to approximate the size of the game data packets to help identify servers on routes that are discarding larger UDP packets.
- QoS can be refreshed at periodic intervals when outside of an online game session, but there should be a minimum of 3 minutes between automatic checks. When in an online game session, QoS checks must be stopped to avoid unnecessary network traffic on the client and load on the QoS servers.
- Refrain from sending other network traffic when checking QoS. The time spent processing other network traffic might skew results or add latency.
- Rechecking QoS inside of the recommended minimum interval of 3 minutes is acceptable under specific circumstances, such as changing networks from cellular to Wi-Fi or Ethernet. If this is happening so frequently that most or all QoS checks are happening inside of the minimum interval, consider waiting until the underlying network becomes more stable before you resume performing QoS checks to avoid getting temporarily banned by the server.
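The timing rules in the list above can be captured in a small gate that the game consults before starting a check. This Python sketch is illustrative only; the `QosScheduler` class and its method names are hypothetical.

```python
import time

MIN_INTERVAL_SECONDS = 180  # 3-minute minimum between automatic checks


class QosScheduler:
    """Decides whether an automatic QoS check may start right now."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_check = None
        self.in_session = False  # set to True while in an online game session

    def may_check(self, network_changed: bool = False) -> bool:
        if self.in_session:
            return False  # QoS checks must stop during a session
        if self._last_check is None or network_changed:
            return True  # first check, or a network switch such as cellular to Wi-Fi
        return self._clock() - self._last_check >= MIN_INTERVAL_SECONDS

    def record_check(self) -> None:
        self._last_check = self._clock()
```

Injecting the clock keeps the gate testable. A production client would also want to track how often `network_changed` is used, so that an unstable network does not turn every check into an early one.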
Identifying packet loss
To identify packet loss, send a fixed number of requests and then count the number of responses. If fewer responses arrive than the number of sent requests, some packets were lost.
Computing latency
To compute latency, include a timestamp in your request packets, and then read that timestamp in each response packet and compare it against the current time. The difference is the computed latency.
Identifying duplicate responses
To identify duplicate responses, include a unique value (for example, a sequence byte) in each of your request packets. When recording responses, check the unique values against those that are used in the requests. If a response contains a sequence that has already been accounted for, it is a duplicate and can be discarded without counting against the response totals or latency.
Identifying stale responses
To identify stale responses, include a static value in each set of requests that are sent to a single server. When reading responses, if that static value is different, then it is a stale response from another QoS check. Rotate the value each time you initiate a new check with a new set of requests.
Example QoS flow
A simplified QoS flow might look like the following example:
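As a minimal sketch (not the official client), the Python code below sends a batch of requests to one server, collects responses until a deadline, and then applies the stale, duplicate, latency, and packet loss checks. It assumes version 0, a request flow control of 0, and a client-chosen big-endian layout for the custom data; all function names are hypothetical.

```python
import socket
import struct
import time

CUSTOM_LAYOUT = ">BHQ"  # sequence, identifier, timestamp in ms (client-chosen)
CUSTOM_SIZE = struct.calcsize(CUSTOM_LAYOUT)


def build_request(title: bytes, sequence: int, identifier: int, now_ms: int) -> bytes:
    header = bytes([0x59, 0x00])  # type, then assumed version 0 with flow control 0
    title_block = bytes([len(title) + 1]) + title  # length byte counts itself
    return header + title_block + struct.pack(CUSTOM_LAYOUT, sequence, identifier, now_ms)


def summarize(responses, identifier, sent_count):
    """responses is an iterable of (payload, received_ms) pairs."""
    latencies = {}
    flow_control = 0
    for payload, received_ms in responses:
        if len(payload) < 2 + CUSTOM_SIZE or payload[0] != 0x95:
            continue  # not a well-formed QoS response
        sequence, echoed_id, sent_ms = struct.unpack_from(CUSTOM_LAYOUT, payload, 2)
        if echoed_id != identifier:
            continue  # stale response from an earlier check
        flow_control = max(flow_control, payload[1] & 0x0F)
        if sequence in latencies:
            continue  # duplicate response
        latencies[sequence] = received_ms - sent_ms
    received = len(latencies)
    return {
        "packet_loss": 1 - received / sent_count,
        "average_latency_ms": sum(latencies.values()) / received if received else None,
        "flow_control": flow_control,
    }


def run_check(address, port, title, identifier, count=10, wait_seconds=1.0):
    """Send count requests to one QoS server and summarize the responses."""
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    responses = []
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        for sequence in range(count):
            now_ms = int(time.time() * 1000)
            sock.sendto(build_request(title, sequence, identifier, now_ms), (address, port))
        deadline = time.monotonic() + wait_seconds
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                payload, _ = sock.recvfrom(2048)
            except OSError:
                break  # timed out or the socket reported an error
            responses.append((payload, int(time.time() * 1000)))
    return summarize(responses, identifier, count)
```

A caller would run `run_check` once per unique server returned by the Discovery service, rotate `identifier` for each new check, and skip any server whose last summary reported a non-zero `flow_control` until the back-off period has passed.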