
Live Captions API Is Now GA!
The Vonage Live Captions API is now generally available. This feature within the Video API helps improve accessibility, enhances user experience in noisy environments, and enables real-time captioning for better communication. In this article, we’ll cover the key benefits, how the API works, and how to get started.
Why Offer Live Captions?
Accessibility: It cannot be assumed that everyone participating in a call can hear.
Noisy Environments: Even with the best noise-canceling headphones or earbuds, speech can be hard to follow in a loud area.
Regain context: Missed what someone just said? You can most likely still see the caption in the feed.
User Preference: According to a poll from YouGov, a sizeable number of people prefer to have captions/subtitles on. I know I do.
(Bonus) Translation: Once speech has been turned into text, translating a caption into the viewer's language is just one more step.
YouGov poll results showing subtitle preferences when watching TV or movies in one’s native language, broken down by age group.
How the Live Captions API Works
The Live Captions API receives the audio streams (from both Video and SIP dial-in participants) that pass through the Media Router and forwards them to a transcription service.
Architecture diagram showing audio streams from two WebRTC clients routed through a Video Media Router to AWS Transcribe, which returns individual transcribed text streams.
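Because each participant's audio comes back as its own text stream, a client typically keeps a separate caption feed per stream. The following is a minimal sketch of that bookkeeping, not the SDK's own implementation. The `CaptionEvent` shape (`streamId`, `caption`, `isFinal`) and the idea that a phrase may arrive first as a partial result and later as a final one are assumptions made for illustration; check the Live Captions documentation for the actual event your Client SDK delivers.

```typescript
// Hypothetical shape of one caption update (field names are assumptions).
interface CaptionEvent {
  streamId: string; // the audio stream (participant) the text belongs to
  caption: string;  // transcribed text for the current phrase
  isFinal: boolean; // false while the phrase may still be revised
}

// Caption lines per stream, keyed by stream ID.
type CaptionFeed = Map<string, string[]>;

// Apply one caption update to the feed.
// `partials` tracks which streams currently end with a non-final line,
// so that the next update for that stream replaces it instead of appending.
function applyCaption(
  feed: CaptionFeed,
  partials: Set<string>,
  event: CaptionEvent
): void {
  const lines = feed.get(event.streamId) ?? [];

  if (partials.has(event.streamId)) {
    lines.pop(); // drop the previous in-progress line for this stream
  }
  lines.push(event.caption);

  if (event.isFinal) {
    partials.delete(event.streamId);
  } else {
    partials.add(event.streamId);
  }
  feed.set(event.streamId, lines);
}
```

Keeping the feeds separate per stream makes it straightforward to label each caption with the speaker's name and to let a viewer scroll back through what one person said.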
Advantages for Developers
Live Captions are enabled by default for all projects.
Your application is already sending media streams to the Media Router.
No need to further strain your users' computers and/or mobile devices by sending another stream to be transcribed.
No third-party transcription library/service to learn and implement.
Enabling Live Captions in Your Application
A more detailed description can be found in the Live Captions documentation.
First, make a POST request to the Live Captions API endpoint with your Vonage credentials. Then you can use any of our Client SDKs to start or stop sending and receiving captions.
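As a rough illustration of the first step, the sketch below only assembles the POST request; it does not send it. The URL path, the `Authorization: Bearer` header, the body fields (`sessionId`, `token`, `languageCode`), and the `en-us` default are all assumptions made for this example, and `buildStartCaptionsRequest` is a hypothetical helper, not part of any Vonage SDK. Confirm the exact endpoint and payload in the Live Captions documentation.

```typescript
// Options for starting captions (field names are assumptions for this sketch).
interface StartCaptionsOptions {
  sessionId: string;     // the video session to caption
  token: string;         // a token valid for that session
  languageCode?: string; // assumed default below: "en-us"
}

// A plain description of the HTTP request to send.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the request that starts live captions.
// `baseUrl`, the path layout, and the auth header are hypothetical.
function buildStartCaptionsRequest(
  baseUrl: string,
  projectId: string,
  jwt: string,
  options: StartCaptionsOptions
): HttpRequest {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${root}/v2/project/${encodeURIComponent(projectId)}/captions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${jwt}`, // JWT generated from your credentials
    },
    body: JSON.stringify({
      sessionId: options.sessionId,
      token: options.token,
      languageCode: options.languageCode ?? "en-us",
    }),
  };
}
```

The returned object can be handed to an HTTP client on your server, for example `fetch(request.url, request)`. Building the request on the server keeps your credentials out of client code.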
Give It a Try
Deploy the Basic Live Captions API demo to StackBlitz in one step, then set the URL of a running server in config.js. The source code can be found in the GitHub repository.
Conclusion
The Live Captions API is a powerful tool to enhance accessibility and improve user experience in your video applications. Start building with it today, and don’t hesitate to share your feedback or questions.
We would love to hear from you. Please reach out to us on our Community Slack Channel. If you are on X, follow the VonageDev account to receive the latest updates.