Vivox Unity SDK

Real-time recording

Enable real-time recording in your Vivox project.

Overview

Real-time recording streams voice data in real time from the Vivox backend to external services. Developers can use it to capture voice data for purposes such as analytics, moderation, or storage.

Get started

To get started with Vivox real-time recording, reach out to your technical account manager or create a ticket to request the feature for your project.

Websocket specification

To accept real-time recording data, your service must implement a websocket server that can handle incoming connections from the Vivox real-time recording service.

Set up the websocket connection

When establishing a websocket connection, your server must validate the authentication token sent in the Authorization header. Once you have a token and a WSS endpoint, provide both to your technical account manager so that the Vivox real-time recording service can be configured to connect to your websocket server.
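The token check can be sketched as follows. This is a minimal illustration, not the Vivox-prescribed method: the function name `is_authorized` and the `headers` mapping are hypothetical, and accepting an optional `Bearer ` prefix is an assumption, since the header scheme is not specified here.

```python
import hmac


def is_authorized(headers: dict, expected_token: str) -> bool:
    """Return True if the Authorization header carries the expected token."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    presented = normalized.get("authorization", "").strip()

    # Assumption: tolerate an optional "Bearer " prefix; the scheme is not
    # stated in the documentation above.
    if presented.lower().startswith("bearer "):
        presented = presented[len("bearer "):].strip()

    if not presented:
        return False

    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

Rejecting the handshake when this returns False keeps unauthenticated clients from opening a connection at all.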

Data flow

All audio is delivered in 15-second chunks. You will receive two types of messages over the websocket connection:
  1. Metadata messages: These messages contain information about the file, such as the user URI, channel URI, timestamp, and other relevant metadata.
  2. Audio data messages: These messages contain the actual audio data in OGG format, with 15 seconds of audio per message.
Immediately following each metadata message, you will receive a binary audio data message containing the corresponding audio data. The following diagram shows the sequence of events and information flow between the real-time recording service and your service:
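The metadata-then-audio ordering described above can be handled with a small pairing helper. This is a sketch under two assumptions: your websocket library hands text frames to you as `str` and binary frames as `bytes`, and a binary frame that arrives with no preceding metadata can be dropped. The class name `ChunkAssembler` is hypothetical.

```python
import json
from typing import Optional, Tuple, Union


class ChunkAssembler:
    """Pairs each metadata message with the binary audio message after it."""

    def __init__(self) -> None:
        self._pending: Optional[dict] = None

    def feed(self, message: Union[str, bytes]) -> Optional[Tuple[dict, bytes]]:
        """Consume one websocket message; return (metadata, audio) when a pair completes."""
        if isinstance(message, str):
            event = json.loads(message)
            # Only metadata messages announce an audio chunk; other JSON
            # messages are ignored by this helper.
            if event.get("type") == "metadata":
                self._pending = event
            return None

        if self._pending is None:
            # Assumption: binary data without preceding metadata is dropped.
            return None

        metadata, self._pending = self._pending, None
        return metadata, bytes(message)
```

Feeding every incoming message through `feed` and acting only on non-None results keeps the pairing logic separate from storage or analysis code.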

JSON metadata message format

The metadata message is sent as a JSON object with the following structure:
    {
        "type": "metadata",
        "timestamp": 1705270000, // Unix timestamp indicating the start of the 15-second interval the audio belongs to
        "speaker": "sip:.issuer.alice.@domain.vivox.com", // The user the audio belongs to
        "listeners": ["sip:.issuer.bob.@domain.vivox.com", "sip:.issuer.charlie.@domain.vivox.com"], // List of users who heard the speaker
        "channel": "sip:confctl-g-issuer.general-chat@domain.vivox.com", // Channel the audio was spoken into
        "muted": false, // Indicates if the speaker was muted during this interval
        "kicked": false // Indicates if the speaker was kicked from the channel during this interval
    }
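One way to consume this structure is to parse it into a typed record. The sketch below assumes that the `//` comments in the sample are documentation annotations and are not present in real payloads (standard JSON parsers reject them). The names `AudioMetadata` and `parse_metadata` are hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class AudioMetadata:
    interval_start: datetime  # start of the 15-second interval, in UTC
    speaker: str
    listeners: Tuple[str, ...]
    channel: str
    muted: bool
    kicked: bool


def parse_metadata(raw: str) -> AudioMetadata:
    """Parse a metadata message; raise ValueError for any other message type."""
    data = json.loads(raw)
    if data.get("type") != "metadata":
        raise ValueError("not a metadata message")
    return AudioMetadata(
        # The timestamp field is a Unix timestamp in seconds.
        interval_start=datetime.fromtimestamp(data["timestamp"], tz=timezone.utc),
        speaker=data["speaker"],
        listeners=tuple(data["listeners"]),
        channel=data["channel"],
        muted=bool(data["muted"]),
        kicked=bool(data["kicked"]),
    )
```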

OGG audio format specification

All audio data is delivered in OGG Opus format with the following specifications:

| Property | Value |
| --- | --- |
| Container Format | Ogg |
| Audio Codec | Opus |
| Channels | 1 (Mono) |
| Channel Layout | Mono |
| Sample Rate | 48 kHz |
| Compression | Lossy |
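A received chunk can be sanity-checked against this table by inspecting its first Ogg page. The sketch below assumes each audio message is a self-contained Ogg stream whose first page carries the Opus identification header (`OpusHead`), as laid out in RFC 7845. The function name `read_opus_channel_count` is hypothetical. The sample rate field of that header is not checked because it only records the original input rate.

```python
def read_opus_channel_count(chunk: bytes) -> int:
    """Return the channel count from the Opus identification header of an Ogg stream."""
    # An Ogg page header is 27 bytes and begins with the capture pattern "OggS".
    if len(chunk) < 27 or chunk[:4] != b"OggS":
        raise ValueError("not an Ogg stream")

    # Byte 26 holds the number of entries in the segment table that follows
    # the fixed header; the page payload starts right after that table.
    segment_count = chunk[26]
    payload_start = 27 + segment_count

    # The identification header is at least 19 bytes: the "OpusHead" magic,
    # a version byte, then the channel count at offset 9.
    packet = chunk[payload_start:payload_start + 19]
    if len(packet) < 19 or packet[:8] != b"OpusHead":
        raise ValueError("first Ogg page does not carry an Opus identification header")

    return packet[9]
```

For audio matching the table above, the returned channel count would be 1.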

Join and leave events

Join and leave events are sent as JSON objects with the following structure:
    {
        "type": "join",
        "timestamp": 1677610000, // The time the user joined the channel, NOT the start of an audio interval as in metadata messages
        "speaker": "sip:.alice.my-issuer.@domain.com",
        "channel": "sip:confctl-g-my-issuer.mychannel1@domain.vivox.com"
    }

    {
        "type": "leave",
        "timestamp": 1677610000, // The time the user left the channel, NOT the start of an audio interval as in metadata messages
        "speaker": "sip:.alice.my-issuer.@domain.com",
        "channel": "sip:confctl-g-my-issuer.mychannel1@domain.vivox.com"
    }
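Join and leave events can be used to keep a running picture of who is in each channel. The following is a minimal sketch, assuming events arrive as JSON text without the `//` annotations shown in the samples. The class name `ChannelRoster` is hypothetical.

```python
import json
from collections import defaultdict


class ChannelRoster:
    """Tracks channel membership from join and leave events."""

    def __init__(self) -> None:
        self._members = defaultdict(set)

    def apply(self, raw: str) -> None:
        """Update the roster from one JSON message; other message types are ignored."""
        event = json.loads(raw)
        kind = event.get("type")
        if kind == "join":
            self._members[event["channel"]].add(event["speaker"])
        elif kind == "leave":
            # discard() tolerates a leave for a user the roster never saw join.
            self._members[event["channel"]].discard(event["speaker"])

    def members(self, channel: str) -> frozenset:
        """Return the users currently tracked in the given channel."""
        return frozenset(self._members.get(channel, ()))
```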