Handle user input with ASR
This code snippet shows how to handle user input with Automatic Speech Recognition (ASR). The caller speaks at the prompt, ASR converts the speech to text, and the recognized text is read back to the caller with text-to-speech.
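Before the per-language examples, the overall exchange can be sketched without any web framework. This is a minimal illustration, not the SDK's API: the function names `answer_ncco` and `reply_ncco` and the fallback wording are hypothetical, while the NCCO fields and the `speech.results[0].text` payload path are the ones used by the examples on this page.

```python
def answer_ncco(event_url, call_uuid):
    """NCCO for the answer webhook: prompt the caller, then listen for speech."""
    return [
        {"action": "talk", "text": "Please say something"},
        {
            "action": "input",
            "type": ["speech"],
            "eventUrl": [event_url],
            "speech": {
                "endOnSilence": 1,
                "language": "en-US",
                "uuid": [call_uuid],
            },
        },
    ]


def reply_ncco(payload):
    """NCCO for the input webhook: repeat the first recognized phrase."""
    # Tolerate a missing body, a missing "speech" object, or an empty results list.
    results = ((payload or {}).get("speech") or {}).get("results") or []
    if results and results[0].get("text"):
        text = "You said {}".format(results[0]["text"])
    else:
        # Hypothetical fallback wording for when nothing was recognized.
        text = "Sorry, I did not catch that"
    return [{"action": "talk", "text": text}]
```

Keeping the two NCCO builders as plain functions makes them easy to check in isolation; the web framework in each example below only routes requests to logic of this shape.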
Prerequisites
Create an Application
You can install the CLI with the following command:
Before you can start working with your apps, you need to register your configuration: API Key and Secret. You can find them via the Dashboard, in API Settings. Once set, initialize your account using the following command:
As soon as the CLI is both installed and configured, use it to create a Vonage application using the following command:
The command starts an interactive prompt to ask for the application name, and the capabilities you want to enable - make sure to enable Voice.
When finished, it creates the vonage_app.json file in the current directory, containing the Application ID, Application name and private key. It also creates a second file, app_name.key, which contains the private key.
Go to the Application's page on the Dashboard, and define a Name for your Application.

Make sure to click on the Generate public and private key button, and keep the file private.key around.
Then, enable the Voice capability. For the moment, leave all other settings at their defaults.

Finally, click Save at the bottom of the page.
Rent a Number
You can rent a number using the Vonage CLI. The following command purchases an available number in the United States:
Specify an alternative two-character country code to purchase a number in another country.
In the Dashboard, go to the Buy Numbers page. Make sure to tick Voice in the search filter, and select the country you want to buy a number in.

You can then click the Buy button next to the number you want, and validate your purchase.
Congratulations! Your virtual number is now listed in Your Numbers.
Link a Number
Now that you have both an application and a number, you need to link them together.
Replace YOUR_VONAGE_NUMBER with the number you bought and APPLICATION_ID with your application id, then run the following command:
Alternatively, link the number from the Dashboard.
Go to the Application page, and click on the application you created earlier.

In the Voice section, click on the Link button next to the number you want to link.
Example
Prerequisites
npm install express body-parser
Write the code
Add the following to asr.js:
const Express = require('express');
const bodyParser = require('body-parser');

const app = new Express();
app.use(bodyParser.json());

// Answer webhook: prompt the caller, then collect speech input.
const onInboundCall = (request, response) => {
  const ncco = [
    {
      action: 'talk',
      text: 'Please say something',
    },
    {
      action: 'input',
      type: ['speech'],
      eventUrl: [`${request.protocol}://${request.get('host')}/webhooks/asr`],
      speech: {
        endOnSilence: 1,
        language: 'en-US',
        uuid: [request.query.uuid],
      },
    },
  ];

  response.json(ncco);
};

// Input webhook: read back the first recognized phrase.
const onInput = (request, response) => {
  // results?.[0] avoids a TypeError when no results list is present.
  const speech = request.body?.speech?.results?.[0]?.text;

  const ncco = [
    {
      action: 'talk',
      text: `You said ${speech}`,
    },
  ];

  response.json(ncco);
};

app
  .get('/webhooks/answer', onInboundCall)
  .post('/webhooks/asr', onInput);

app.listen(3000);
Run your code
Save this file to your machine and run it:
Prerequisites
Add the following to build.gradle:
implementation 'com.vonage:server-sdk-kotlin:1.1.2'
implementation 'io.ktor:ktor-server-netty'
implementation 'io.ktor:ktor-serialization-jackson'
Write the code
Add the following to the main method of the HandleUserInputAsr file:
embeddedServer(Netty, port = 8000) {
    routing {
        get("/webhooks/answer") {
            call.response.header("Content-Type", "application/json")
            call.respond(
                Ncco(
                    talkAction("Please say something."),
                    inputAction {
                        eventUrl(call.request.path().replace("answer", "asr"))
                        speech {
                            language(SpeechSettings.Language.ENGLISH_UNITED_STATES)
                        }
                    }
                ).toJson()
            )
        }
        post("/webhooks/asr") {
            val event = EventWebhook.fromJson(call.receive())
            call.response.header("Content-Type", "application/json")
            call.respond(
                Ncco(
                    talkAction("You said: " + event.speech.results.first().text),
                ).toJson()
            )
        }
    }
}.start(wait = true)
Run your code
We can use the application plugin for Gradle to simplify the running of our application. Update your build.gradle with the following:
apply plugin: 'application'
mainClassName = project.hasProperty('main') ? project.getProperty('main') : ''
Run the following Gradle command to execute your application, replacing com.vonage.quickstart.kt.voice with the package containing HandleUserInputAsr:
Prerequisites
Add the following to build.gradle:
implementation 'com.vonage:server-sdk:8.15.1'
implementation 'com.sparkjava:spark-core:2.9.4'
Write the code
Add the following to the main method of the AsrInput file:
/*
 * Route to answer incoming calls.
 */
Route answerCallRoute = (req, res) -> {
    TalkAction intro = TalkAction
            .builder("Please say something")
            .build();

    SpeechSettings speechSettings = SpeechSettings.builder()
            .language(SpeechSettings.Language.ENGLISH_UNITED_STATES).build();

    InputAction input = InputAction.builder()
            .type(Collections.singletonList("speech"))
            .eventUrl(String.format("%s://%s/webhooks/asr", req.scheme(), req.host()))
            .speech(speechSettings)
            .build();

    res.type("application/json");
    return new Ncco(intro, input).toJson();
};

/*
 * Route which returns NCCO saying which word was recognized.
 */
Route speechInputRoute = (req, res) -> {
    EventWebhook event = EventWebhook.fromJson(req.body());

    TalkAction response = TalkAction.builder(String.format("You said %s, Goodbye.",
            event.getSpeech().getResults().get(0).getText()
    )).build();

    res.type("application/json");
    return new Ncco(response).toJson();
};

Spark.port(3000);
Spark.get("/webhooks/answer", answerCallRoute);
Spark.post("/webhooks/asr", speechInputRoute);
Run your code
We can use the application plugin for Gradle to simplify the running of our application. Update your build.gradle with the following:
apply plugin: 'application'
mainClassName = project.hasProperty('main') ? project.getProperty('main') : ''
Run the following Gradle command to execute your application, replacing com.vonage.quickstart.voice with the package containing AsrInput:
Prerequisites
Install-Package Vonage
Write the code
Add the following to AsrController.cs:
[HttpGet("[controller]/webhooks/answer")]
public IActionResult Answer()
{
    var host = Request.Host.ToString();
    // Uncomment the next line if using ngrok with the --host-header option
    // host = Request.Headers["X-Original-Host"];
    var request = WebhookParser.ParseQuery<Answer>(Request.Query);
    var eventUrl = $"{Request.Scheme}://{host}/webhooks/asr";
    var speechSettings = new SpeechSettings { Language = "en-US", EndOnSilence = 1, Uuid = new[] { request.Uuid } };
    var inputAction = new MultiInputAction { Speech = speechSettings, EventUrl = new[] { eventUrl } };
    var talkAction = new TalkAction { Text = "Please speak now" };
    var ncco = new Ncco(talkAction, inputAction);
    return Ok(ncco.ToString());
}

[HttpPost("/webhooks/asr")]
public async Task<IActionResult> OnInput()
{
    var input = await WebhookParser.ParseWebhookAsync<MultiInput>(Request.Body, Request.ContentType);
    var talkAction = new TalkAction();
    talkAction.Text = input.Speech.SpeechResults[0].Text;
    var ncco = new Ncco(talkAction);
    return Ok(ncco.ToString());
}
Prerequisites
composer require slim/slim:^3.8
Write the code
Add the following to index.php:
<?php

use Vonage\Voice\NCCO\NCCO;
use Vonage\Voice\NCCO\Action\Talk;
use Vonage\Voice\NCCO\Action\Input;
use Vonage\Voice\Webhook;
use Vonage\Voice\Webhook\Factory;
use Vonage\Voice\Webhook\Input as InputWebhook;
use \Psr\Http\Message\ResponseInterface as Response;
use \Psr\Http\Message\ServerRequestInterface as Request;

require 'vendor/autoload.php';

$app = new \Slim\App;

$app->get('/webhooks/answer', function (Request $request, Response $response) {
    $uri = $request->getUri();
    // getPort() can return null for the scheme's default port, so append it only when set.
    $port = $uri->getPort() ? ':' . $uri->getPort() : '';
    $url = $uri->getScheme() . '://' . $uri->getHost() . $port . '/webhooks/asr';

    $inputAction = new Input();
    $inputAction
        ->setSpeechEndOnSilence(1)
        ->setSpeechLanguage('en-US')
        ->setEventWebhook(new Webhook($url))
    ;

    $ncco = new NCCO();
    $ncco
        ->addAction(new Talk('Please say something'))
        ->addAction($inputAction)
    ;

    return $response->withJson($ncco->toArray());
});

$app->map(['GET', 'POST'], '/webhooks/asr', function (Request $request, Response $response) {
    /** @var InputWebhook $input */
    $input = Factory::createFromRequest($request);

    $ncco = new NCCO();
    $ncco->addAction(new Talk('You said ' . $input->getSpeech()['results'][0]['text']));

    return $response->withJson($ncco->toArray());
});

$app->run();
Run your code
Save this file to your machine and run it:
Prerequisites
pip install Flask
Write the code
Add the following to handle-user-input-with-asr.py:
#!/usr/bin/env python3
from flask import Flask, request, jsonify

app = Flask(__name__)


@app.route("/webhooks/answer", methods=["POST", "GET"])
def answer_call():
    ncco = [
        {"action": "talk", "text": "Please, tell me something"},
        {
            "action": "input",
            "type": ["speech"],
            "eventUrl": [
                "{host}{endpoint}".format(
                    host=request.host_url, endpoint="webhooks/asr"
                )
            ],
            "speech": {
                "endOnSilence": 1,
                "language": "en-US",
                # Change to request.json.get("uuid") if using POST-JSON webhook format
                "uuid": [request.args.get("uuid")],
            },
        },
    ]
    return jsonify(ncco)


@app.route("/webhooks/asr", methods=["POST", "GET"])
def answer_asr():
    # silent=True returns None instead of raising when the body is not JSON.
    body = request.get_json(silent=True)
    if body is not None and "speech" in body:
        speech = body["speech"]["results"][0]["text"]
        ncco = [
            {"action": "talk", "text": "Hello, you said {speech}".format(speech=speech)}
        ]
    else:
        ncco = [{"action": "talk", "text": "Sorry, I don't understand. Bye"}]
    return jsonify(ncco)


if __name__ == "__main__":
    app.run(port=3000)
Run your code
Save this file to your machine and run it:
Prerequisites
gem install sinatra sinatra-contrib rack-contrib
Write the code
Add the following to answer-inbound-call-with-asr.rb:
require 'sinatra'
require 'sinatra/multi_route'
require 'rack/contrib'

use Rack::JSONBodyParser

before do
  content_type :json
end

route :get, :post, '/webhooks/answer' do
  [
    {
      action: 'talk',
      text: 'Please say something'
    },
    {
      action: 'input',
      type: ['speech'],
      eventUrl: ["#{request.base_url}/webhooks/asr"],
      speech: {
        endOnSilence: 1,
        uuid: [params[:uuid]],
        language: 'en-US'
      }
    }
  ].to_json
end

route :post, '/webhooks/asr' do
  [{
    action: 'talk',
    text: "You said #{params["speech"]["results"][0]["text"]}"
  }].to_json
end

route :post, '/webhooks/event' do
  puts params
end

set :port, 3000
Run your code
Save this file to your machine and run it:
Try it out
Call your Vonage number. When the call is answered, you are asked to say a message. When you finish speaking, the recognized text is read back to you with text-to-speech.
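If you want to check the /webhooks/asr endpoint before placing a real call, one option is to POST a hand-written payload to it. The sketch below is an assumption-laden illustration using only the Python standard library: the helper names are hypothetical, the payload is trimmed to the `speech.results[0].text` path the examples read (a real webhook carries more fields, and SDK-based parsers may expect them), and the URL assumes a server from this page listening locally on port 3000 (the Kotlin example uses 8000).

```python
import json
import urllib.request

# Assumption: one of the example servers above is running locally on port 3000.
ASR_WEBHOOK_URL = "http://localhost:3000/webhooks/asr"


def build_sample_request(text, url=ASR_WEBHOOK_URL):
    """Build a POST request carrying a trimmed, hand-written ASR payload."""
    payload = {"speech": {"results": [{"text": text}]}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_sample(text, url=ASR_WEBHOOK_URL):
    """Send the sample payload and return the NCCO the server replies with."""
    request = build_sample_request(text, url)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, calling `send_sample("hello world")` should return a one-element NCCO whose `talk` action contains the text you sent.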
Further Reading
- Interactive Voice Response (IVR) - Build an automated phone system for users to input information with the keypad and hear a spoken response.