Live Captions API Is Now GA!

Published on September 4, 2023

The Vonage Live Captions API is now generally available. This feature within the Video API helps improve accessibility, enhances user experience in noisy environments, and enables real-time captioning for better communication. In this article, we’ll cover the key benefits, how the API works, and how to get started.

Why Offer Live Captions?

Accessibility: You cannot assume that everyone participating in a call can hear.

Noisy Environments: Even with the best noise-canceling headphones or earbuds, it can be hard to follow a call in a loud area.

Regain context: Missed what someone just said? You can most likely still see the caption in the feed.

User Preference: According to a poll from YouGov, a sizeable number of people prefer to have captions/subtitles on. I know I do.

(Bonus) Translation: Once you have a caption as text, translating it into the viewer's language is just one more step.


Figure: YouGov poll results showing subtitle preferences when watching TV or movies in one’s native language, broken down by age group. Among all UK adults, 28% prefer subtitles on and 65% prefer them off. In the 18–24 age group, 61% prefer subtitles on, while in the 50–64 group only 13% do. Preferences shift toward subtitles off with increasing age.

How the Live Captions API Works

The Live Captions API receives the audio streams (from both Video and SIP dial-in participants) that pass through the Media Router and forwards them to a transcription service.

Figure: Architecture diagram showing WebRTC clients A and B sending audio to a Video Media Router, which forwards each client's audio stream to AWS Transcribe. AWS Transcribe returns the transcribed text of each stream separately to the router.
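To make the flow concrete, here is a toy model of it in Python. It is not Vonage code: route_and_transcribe and fake_transcribe are hypothetical names, and the "transcription" just decodes bytes as text. The one thing it illustrates is that each stream is transcribed on its own, so every caption stays tied to the stream it came from.

```python
from typing import Callable, Dict, List


def route_and_transcribe(
    audio_by_stream: Dict[str, List[bytes]],
    transcribe: Callable[[bytes], str],
) -> Dict[str, List[str]]:
    """Forward each stream's audio chunks to `transcribe`, keeping results per stream."""
    captions: Dict[str, List[str]] = {}
    for stream_id, chunks in audio_by_stream.items():
        # Streams are handled separately, so captions remain attributable to a speaker.
        captions[stream_id] = [transcribe(chunk) for chunk in chunks]
    return captions


def fake_transcribe(chunk: bytes) -> str:
    # Stand-in for a real transcription service: treat the audio bytes as UTF-8 text.
    return chunk.decode("utf-8")


if __name__ == "__main__":
    result = route_and_transcribe(
        {"stream-A": [b"hello", b"everyone"], "stream-B": [b"hi there"]},
        fake_transcribe,
    )
    print(result)
```

In the real service this work happens on the Media Router side, which is why your clients do not need to send a second stream anywhere.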

Advantages for Developers

  • Live Captions are enabled by default for all projects.

  • Your application is already sending media streams to the Media Router.

  • No need to further strain your users' computers and/or mobile devices by sending another stream to be transcribed.

  • No third-party transcription library/service to learn and implement.

Enabling Live Captions in Your Application

A more detailed description can be found in the Live Captions documentation.

First, make a POST request to the Live Captions API endpoint with your Vonage credentials. Then you can use any of our Client SDKs to start or stop sending and receiving captions.
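As a rough sketch of that first step, the Python below builds the POST request without sending it. Treat the details as assumptions to check against the Live Captions documentation: the base URL and /captions path, the X-OPENTOK-AUTH header carrying a JWT, and the sessionId, token, and languageCode body fields. The helper name build_start_captions_request is made up for this example.

```python
import json
import urllib.request

# Assumed base URL and path; confirm both in the Live Captions documentation.
API_BASE = "https://api.opentok.com/v2/project"


def build_start_captions_request(api_key, jwt, session_id, token, language_code="en-US"):
    """Build (but do not send) the POST request that starts captions for a session."""
    # Assumed body fields; the documentation lists the full set of options.
    body = {
        "sessionId": session_id,
        "token": token,
        "languageCode": language_code,
    }
    return urllib.request.Request(
        url=f"{API_BASE}/{api_key}/captions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth header: a JWT generated from your Vonage credentials.
            "X-OPENTOK-AUTH": jwt,
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_start_captions_request("API_KEY", "JWT", "SESSION_ID", "TOKEN")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Keeping the request construction separate from the network call makes it easy to inspect what you are about to send. The client-side step (starting or stopping captions on publishers and subscribers) is specific to each Client SDK and is not shown here.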

Give It a Try

Instantly deploy a Basic Live Captions API demo to StackBlitz, then set the URL of a running server in config.js. The source code can be found in the GitHub repository.

Conclusion

The Live Captions API is a powerful tool to enhance accessibility and improve user experience in your video applications. Start building with it today, and don’t hesitate to share your feedback or questions.

We would love to hear from you. Please reach out to us on our Community Slack Channel. If you are on X, follow the VonageDev account to receive the latest updates.

Zachary Powell, Sr Android Developer Advocate