Audio Connector Server SDK

Overview

This guide explains how to install and configure the Vonage Audio Connector Server SDK to build a server-side WebSocket endpoint that receives real-time PCM audio from a Vonage Video session and sends audio back to it.

Use this SDK to connect live session audio to third-party AI services, for example to build a conversational AI assistant, run live transcription, or perform real-time audio analysis.

Before You Start

Before you install the SDK, ensure you have:

  • Python installed on your server
  • A Vonage account with Video API access
  • A Vonage Video session ID and a valid session token
  • (Optional) An SSL certificate for production WebSocket deployments

Install the SDK

Install the package from PyPI:

pip install vonage-audio-connector-server

Configure and Start the Server

  1. Create an AudioConnectorServerConfig with your host, port, SSL context, and lifecycle callbacks:

    video = Video()
    
    config = AudioConnectorServerConfig(
        host="localhost",
        port=8765,
        ssl=ssl_context,
        on_start=on_start,
        on_stop=on_stop,
        on_connect=on_connect
    )
    
  2. Start the server:

    server_handle = await video.start_audio_connector_server(config)
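
The configuration in step 1 references an ssl_context that the snippet does not define. One possible way to build it with Python's standard ssl module is sketched below. The helper name build_ssl_context and its arguments are hypothetical, and whether the SDK accepts ssl=None for unencrypted local testing is an assumption to check against the SDK reference.

```python
import ssl
from typing import Optional


def build_ssl_context(certfile: Optional[str] = None,
                      keyfile: Optional[str] = None) -> Optional[ssl.SSLContext]:
    """Return a server-side TLS context, or None when no certificate is given."""
    if certfile is None:
        # No certificate supplied: assume a plain ws:// endpoint for local testing.
        return None
    # PROTOCOL_TLS_SERVER selects settings appropriate for a TLS server socket.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Load the certificate chain and private key from PEM files.
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context


# Local testing: no certificate. In production, pass your own PEM file paths.
ssl_context = build_ssl_context()
```

In production, call the helper with the paths to your certificate and key files so that clients connect over wss://.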
    

Handle Connection Events

Inside your on_connect callback, register message, disconnect, and error handlers for each client connection:

async def on_connect(client):
    client.set_handler(
        on_message=on_message,
        on_disconnect=on_disconnect,
        on_error=on_error
    )

async def on_message(message):
    print("Received Message")

async def on_disconnect():
    print("Client Disconnected.")

async def on_error(error):
    print(f"Error Occurred: {error}")
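
A real on_message handler usually needs to tell audio frames apart from control messages. The sketch below assumes that audio arrives as bytes and control messages arrive as JSON text; the SDK may wrap messages in its own type, so treat classify_message as a hypothetical helper rather than part of the SDK.

```python
import json
from typing import Any, Dict, Union


def classify_message(message: Union[bytes, bytearray, str]) -> Dict[str, Any]:
    """Label a raw WebSocket message as audio, JSON, or unknown."""
    if isinstance(message, (bytes, bytearray)):
        # Assumption: binary frames carry raw PCM audio.
        return {"kind": "audio", "size": len(message)}
    try:
        # Assumption: text frames carry JSON control messages.
        return {"kind": "json", "payload": json.loads(message)}
    except (TypeError, ValueError):
        # Not valid JSON (json.JSONDecodeError is a ValueError subclass).
        return {"kind": "unknown", "payload": message}
```

Calling this helper at the top of on_message lets you route audio to your processing pipeline and handle control messages separately.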

Send Audio Back to the Session

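Use the client object passed to on_connect to return data to the Video session. The handlers shown earlier do not receive it as an argument, so keep a reference to it somewhere your handlers can reach.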
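
One way to give handlers access to the client is to define them as closures inside on_connect. The sketch below is runnable on its own because it uses FakeClient, a hypothetical stand-in that mimics only the set_handler and send_json_packet calls shown in this guide; it is not the SDK's client class, and the reply payload is illustrative.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


class FakeClient:
    """Hypothetical stand-in for the SDK client, used only to run this sketch."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[..., Awaitable[None]]] = {}
        self.sent: List[str] = []

    def set_handler(self, **handlers: Callable[..., Awaitable[None]]) -> None:
        self.handlers.update(handlers)

    async def send_json_packet(self, string_json: str) -> None:
        self.sent.append(string_json)


async def on_connect(client: Any) -> None:
    # Handlers are defined here so each one captures this connection's client.
    async def on_message(message: Any) -> None:
        await client.send_json_packet('{"status": "received"}')

    async def on_disconnect() -> None:
        print("Client Disconnected.")

    async def on_error(error: Exception) -> None:
        print(f"Error Occurred: {error}")

    client.set_handler(on_message=on_message,
                       on_disconnect=on_disconnect,
                       on_error=on_error)


async def demo() -> List[str]:
    """Simulate one connection and one incoming message."""
    client = FakeClient()
    await on_connect(client)
    await client.handlers["on_message"](b"\x00\x00")
    return client.sent
```

Because each connection gets its own set of closures, per-connection state never leaks between clients.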

To send a JSON control message:

await client.send_json_packet(string_json)
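
The string_json argument is a JSON document serialized to a string. A minimal sketch of building one with the standard json module follows; the keys and values in control_message are hypothetical, since the payload schema depends on your application.

```python
import json

# Hypothetical payload: the field names are illustrative, not defined by the SDK.
control_message = {"event": "transcript", "text": "hello"}

# Serialize the dictionary to the string passed to send_json_packet.
string_json = json.dumps(control_message)
```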

To send PCM audio back into the session, call send_audio_buffer. The SDK manages frame timing and buffering based on bytes_per_sample and frames_ms. Set flush_buffer=True to discard any buffered audio before appending the new data:

await client.send_audio_buffer(
    audio_data,
    bytes_per_sample=640,
    frames_ms=20,
    pad_last_frame=True,
    flush_buffer=False
)
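
To make the framing parameters concrete, the sketch below works through the arithmetic for 16 kHz, 16-bit, mono PCM, which is an assumed format. Under that assumption a 20 ms frame is 640 bytes, which matches the value passed in the call above; confirm the exact meaning of bytes_per_sample in the SDK reference. The helper functions are hypothetical and only illustrate the idea of framing and padding. The SDK performs its own buffering, so you do not need to split audio yourself.

```python
import math
import struct
from typing import List


def frame_size_bytes(sample_rate_hz: int, frame_ms: int,
                     channels: int = 1, sample_width_bytes: int = 2) -> int:
    """Bytes in one frame of uncompressed PCM audio."""
    samples_per_frame = sample_rate_hz * frame_ms // 1000
    return samples_per_frame * channels * sample_width_bytes


def sine_pcm16(freq_hz: float, duration_ms: int,
               sample_rate_hz: int = 16000) -> bytes:
    """Generate a mono 16-bit little-endian sine tone for testing."""
    n = sample_rate_hz * duration_ms // 1000
    samples = (
        int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz))
        for i in range(n)
    )
    return struct.pack(f"<{n}h", *samples)


def split_frames(pcm: bytes, frame_bytes: int,
                 pad_last_frame: bool = True) -> List[bytes]:
    """Split PCM into fixed-size frames, optionally zero-padding the last one."""
    frames = [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]
    if frames and pad_last_frame and len(frames[-1]) < frame_bytes:
        # Zero bytes are silence in signed 16-bit PCM.
        frames[-1] = frames[-1].ljust(frame_bytes, b"\x00")
    return frames


audio_data = sine_pcm16(440, duration_ms=50)           # 800 samples, 1600 bytes
frames = split_frames(audio_data, frame_size_bytes(16000, 20))
```

A 50 ms tone does not divide evenly into 20 ms frames, so the last frame is padded with silence. This is the situation that the pad_last_frame option appears intended to cover.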

Other available client methods:

client.info()         # Retrieve connection metadata from request headers
client.flush_buffer() # Flush buffered packets to the WebSocket client
client.disconnect()   # Close the WebSocket connection

See Also