Boosting Voice Calls with Vonage API Human/Machine Detection and Flask
Published on December 7, 2023

Introduction

Some businesses need to reach customers by voice, and time becomes a critical constraint when the available workforce is limited. When an agent dials a phone number, they must wait until the call is answered, whether by a human or by a voicemail system. This waiting consumes valuable time and resources.

The Vonage Voice API, with its human, voicemail, and beep detection capabilities, makes it possible to automate the entire process with minimal code. Beep detection also helps ensure that the intended message is recorded in full. This tutorial guides you through automating voice calls, including identifying whether a human or a voicemail system answers. The source code for this application can be found on GitHub.

Table of Contents

  1. Prerequisites

  2. Place a Voice Call and Follow Events

  3. Adding Human/Voicemail Detection to the Call

  4. Altering Call Progression Based on Human/Voicemail Detection

  5. Wrapping Up with Real World Example

  6. Conclusion

1. Prerequisites

Before using the Vonage Voice API, you need to create a Vonage Developer account, obtain a virtual number, and set up a Python environment with a Flask application and the ngrok ingress application. The following sections describe these prerequisites in detail.

Create a Vonage Developer Account

Vonage is a leading communications provider that gives developers access to APIs for voice, video, and messaging. By signing up for a free account, you get free credit and straightforward documentation for the Vonage APIs. Once you have obtained the API key and secret, follow the documentation to create a voice application using the Dashboard and generate a JWT, which will be required in upcoming steps. You can read more about generating a JWT here.

Vonage API Account

To complete this tutorial, you will need a Vonage API account. If you don’t have one already, you can sign up today and start building with free credit. Once you have an account, you can find your API Key and API Secret at the top of the Vonage API Dashboard.

Obtain a Virtual Phone Number

Once sign-up is complete, obtain a virtual phone number from the Vonage website. Some countries require a valid caller number and do not allow calls placed with an ‘Unknown’ Caller ID. Hence, obtaining a virtual phone number, which is available at a low price, is highly recommended.

Setup Flask Server and ngrok Ingress Application

A web server with a publicly reachable endpoint is required to receive call progress events and respond with actions. Flask is a small-footprint framework for building web applications in Python. Set up the Flask server in your environment using the official documentation.

To access a locally-run Flask application, you need a public URL accessible through the Internet. Here is the minimal set of instructions to check if Python is installed and set up Flask on your machine:

python --version
pip install Flask
flask --version

You can use ngrok to host the web server on your local machine and send/receive requests from the Web. It is recommended that the latest version of ngrok is installed. Here is a minimal set of instructions to set up ngrok on your Mac and check its version:

brew install ngrok/ngrok/ngrok
ngrok version

To get started, here is the minimum Python code to start the Flask server and send a ‘Hello World’ greeting:

from flask import Flask, request

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def callback_listener():
    print(f"event received --> {request.data}")
    return ("Hello world!", 200)

Save the above code in callback.py and run it using the following command:

flask --app callback run --host=0.0.0.0 --debug

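If you are using ngrok, open a tunnel to the Flask port (5000 by default), for example by running ngrok http 5000 in a second terminal. You can now point your browser to the public IP address or the ngrok ingress URL to see if the application is accessible through the Internet and ready for the next step.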

2. Place a Voice Call and Follow Events

Now that you have completed the prerequisites, it is time to place the first voice call and follow its progress. There are several ways to place a call, including the SDKs for Node.js, Python, or .NET. The easiest way is the curl command, which we will demonstrate here.

The example is taken from here and tweaked for ‘action: talk’. To place a call, either use the curl command or use Postman by using the import option:

curl --location 'https://api-us.vonage.com/v1/calls' \
--header 'Authorization: Bearer YOUR_JWT' \
--header 'Content-Type: application/json' \
--data '{
    "to": [ { "type": "phone", "number": "YOUR_MOBILE_NUMBER" } ],
    "from": { "type": "phone", "number": "YOUR_VIRTUAL_NUMBER" },
    "event_url": ["YOUR_CALLBACK_URL"],
    "ncco": [ { "action": "talk", "text": "This is a test call." } ]
}'

Replace the following placeholders in the command:

  • YOUR_JWT = JWT generated in the previous step

  • YOUR_MOBILE_NUMBER = Mobile number to receive the voice call

  • YOUR_VIRTUAL_NUMBER = Virtual number obtained in the previous step

  • YOUR_CALLBACK_URL = ngrok ingress URL or public IP address of Flask server application
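If you prefer to stay in Python rather than curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not the Vonage SDK: the helper names build_call_payload and build_call_request are hypothetical, and the code only builds the request so that you can inspect it before sending.

```python
import json
import urllib.request

# Same endpoint as the curl example in this tutorial.
VONAGE_CALLS_URL = "https://api-us.vonage.com/v1/calls"


def build_call_payload(to_number, from_number, callback_url, text):
    """Return the JSON body of the curl example as a Python dict."""
    return {
        "to": [{"type": "phone", "number": to_number}],
        "from": {"type": "phone", "number": from_number},
        "event_url": [callback_url],
        "ncco": [{"action": "talk", "text": text}],
    }


def build_call_request(jwt, payload):
    """Build (but do not send) the POST request that places the call."""
    return urllib.request.Request(
        VONAGE_CALLS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually place the call, pass the request to urllib.request.urlopen:
#   payload = build_call_payload("YOUR_MOBILE_NUMBER", "YOUR_VIRTUAL_NUMBER",
#                                "YOUR_CALLBACK_URL", "This is a test call.")
#   urllib.request.urlopen(build_call_request("YOUR_JWT", payload))
```

Keeping the payload builder separate from the sending step makes it easy to print or log the body while troubleshooting.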

If you look at the Flask application logs, you will see the details of call progression from started and ringing through answered and completed.

event received --> b'{"headers":{},"from":"YOUR_VIRTUAL_NUMBER","to":"YOUR_MOBILE_NUMBER","uuid":"2dd98ea6-2518-4e54-8f3a-a8c532f131d6","conversation_uuid":"CON-51982b08-5337-480f-be09-81c3cda3f885","status":"started","direction":"outbound","timestamp":"2023-10-20T13:57:42.895Z"}'
event received --> b'{"headers":{},"from":"YOUR_VIRTUAL_NUMBER","to":"YOUR_MOBILE_NUMBER","uuid":"2dd98ea6-2518-4e54-8f3a-a8c532f131d6","conversation_uuid":"CON-51982b08-5337-480f-be09-81c3cda3f885","status":"ringing","direction":"outbound","timestamp":"2023-10-20T13:57:42.895Z"}'
event received --> b'{"start_time":null,"headers":{},"rate":null,"from":"YOUR_VIRTUAL_NUMBER","to":"YOUR_MOBILE_NUMBER","uuid":"2dd98ea6-2518-4e54-8f3a-a8c532f131d6","conversation_uuid":"CON-51982b08-5337-480f-be09-81c3cda3f885","status":"answered","direction":"outbound","network":null,"timestamp":"2023-10-20T13:57:47.191Z"}'
event received --> b'{"headers":{},"end_time":"2023-10-20T13:57:50.000Z","uuid":"2dd98ea6-2518-4e54-8f3a-a8c532f131d6","network":"23410","duration":"3","start_time":"2023-10-20T13:57:47.000Z","rate":"0.10000000","price":"0.00500000","from":"YOUR_VIRTUAL_NUMBER","to":"YOUR_MOBILE_NUMBER","conversation_uuid":"CON-51982b08-5337-480f-be09-81c3cda3f885","status":"completed","direction":"outbound","timestamp":"2023-10-20T13:57:49.611Z"}'
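The raw event bodies are hard to scan. As a small sketch, the helper below (summarize_event is a hypothetical name, not part of Flask or the Vonage API) parses one event body and returns a short line with the timestamp and status. It relies only on the fields visible in the log above.

```python
import json


def summarize_event(raw_body):
    """Turn one webhook body (bytes or str) into a short 'timestamp status' line."""
    event = json.loads(raw_body)
    line = f'{event["timestamp"]} {event["status"]}'
    # Only the final "completed" event in the log above carries a duration.
    if event.get("duration") is not None:
        line += f' (duration {event["duration"]}s)'
    return line
```

Inside the Flask callback you could print summarize_event(request.data) instead of the raw body whenever the request carries JSON.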

Congratulations on placing your first automated voice call. If the call does not reach your mobile phone, there can be several reasons: the request was not authenticated, the JWT was not generated correctly, or the ‘from’ field was not set correctly. You can troubleshoot call issues using the response returned by the curl command and the Voice Inspector tool.

3. Adding Human/Voicemail Detection to the Call

Now that you can place a call, it is time to add advanced features to detect who is at the other end of the call. There are several possible scenarios for picking up the call:

  1. A human

  2. A voicemail greeting

  3. A voicemail greeting followed by a beep

With the Human/Voicemail detection feature, each case can be handled specifically to properly deliver the message in voice calls.

To add human/voicemail detection, update the curl command with the ‘advanced_machine_detection’ option (details about this parameter can be found here):

curl --location 'https://api-us.vonage.com/v1/calls' \
--header 'Authorization: Bearer YOUR_JWT' \
--header 'Content-Type: application/json' \
--data '{
    "to": [ { "type": "phone", "number": "YOUR_MOBILE_NUMBER" } ],
    "from": { "type": "phone", "number": "YOUR_VIRTUAL_NUMBER" },
    "advanced_machine_detection": { "behavior": "continue", "beep_timeout": "45" },
    "event_url": ["YOUR_CALLBACK_URL"],
    "ncco": [ { "action": "talk", "text": "This is a test call." } ]
}'
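If you build the request body in Python, the detection option is just one more key in the payload. The sketch below uses a hypothetical helper, with_machine_detection, that returns a copy of an existing payload with the option added. The defaults mirror the values in the curl command; the "hangup" behavior is an assumption based on my reading of the Vonage API reference and is not used in this tutorial.

```python
import copy

# "continue" is the value used in this tutorial; "hangup" is an assumption
# taken from the API reference and should be checked before use.
ALLOWED_BEHAVIORS = {"continue", "hangup"}


def with_machine_detection(payload, behavior="continue", beep_timeout="45"):
    """Return a copy of payload with the advanced_machine_detection block added."""
    if behavior not in ALLOWED_BEHAVIORS:
        raise ValueError(f"unsupported behavior: {behavior!r}")
    updated = copy.deepcopy(payload)  # leave the caller's dict untouched
    updated["advanced_machine_detection"] = {
        "behavior": behavior,
        "beep_timeout": beep_timeout,
    }
    return updated
```

Returning a copy rather than mutating the input lets you reuse one base payload for calls with and without detection.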

If you pick up the call and say ‘hello’, the voice will be interpreted as a human voice, and an event ‘status: human’ will be generated and sent in the callback of call progress, which you can see in the Flask server application logs:

event received --> b'{"call_uuid":"47f22bfd-ddb3-4a02-88be-9cb9673e0ae9","from":"YOUR_VIRTUAL_NUMBER","to":"YOUR_MOBILE_NUMBER","status":"human","conversation_uuid":"CON-d8ae51d1-6b93-426d-bbd0-2b1c059036ca","timestamp":"2023-10-20T14:34:43.014Z"}'

Voicemail greetings are interpreted as ‘status: machine’, and if a beep is detected after the greeting, an additional parameter ‘sub_state: beep_start’ will be sent in the callback. These events can be summarized as follows:

  1. If the call is picked up by a person -> "status":"human"

  2. If the call goes to a voicemail that has no beep afterwards -> first "status":"machine" and then "sub_state":"beep_timeout" when the beep timeout occurs

  3. If the call goes to a voicemail with a beep -> first "status":"machine" and then "sub_state":"beep_start" when the beep is detected
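The three cases above can be sketched as a small classifier over the parsed event. The function name classify_answer and the labels it returns are hypothetical; only the status and sub_state values come from the events described in this section.

```python
def classify_answer(event):
    """Map a parsed callback event (dict) to a label for who or what answered."""
    status = event.get("status")
    sub_state = event.get("sub_state")
    if status == "human":
        return "human"
    if status == "machine":
        if sub_state == "beep_start":
            return "voicemail_beep"      # beep detected, recording has started
        if sub_state == "beep_timeout":
            return "voicemail_no_beep"   # no beep arrived before the timeout
        return "voicemail_greeting"      # machine detected, beep still pending
    return "other"                       # started, ringing, answered, completed, ...
```

Treating every other status as "other" keeps the ordinary call progress events from being mistaken for a detection result.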

Beep detection is a helpful feature because it tells you precisely when the voicemail recording starts. Without beep detection, the message left in the recording will be truncated and might be unclear to the listener, as it will miss the part of the message played before the beep.

4. Altering Call Progression Based on Human/Voicemail Detection

Let’s use voicemail recognition capabilities to alter the course of call progression. For a simple scenario, let’s assume you want to play specific messages according to human/voicemail detection. For example:

  1. Play a message “I am talking to a human” when a person is detected

  2. Play a message “I am talking to voicemail before the beep” when voicemail is detected

  3. Play a message “I am talking to voicemail after the beep” when the beep is detected

Note that you already receive voicemail detection results in call progression via callbacks; it is just a matter of processing the information and adding the decision-making logic into your Flask code. Here is the updated Flask code that will be able to process voicemail detection and generate the expected response:

from flask import Flask, jsonify, request

app = Flask(__name__)

play_to_human = "I am talking to human."
play_to_voicemail = "I am talking to voicemail before the beep."
play_after_beep = "I am talking to voicemail after the beep."

@app.route("/", methods=["GET", "POST"])
def callback_listener():
    # silent=True returns None instead of raising when the body is not JSON
    req_json = request.get_json(silent=True)
    if not isinstance(req_json, dict) or "status" not in req_json:
        print(f"event received --> {request.data}")
        return ("", 200)

    status = req_json["status"]
    sub_state = req_json.get("sub_state")
    print(f"status received --> {status}")

    message_to_play = ""
    if status == "human":
        message_to_play = play_to_human
    elif status == "machine" and sub_state is None:
        # voicemail greeting detected, beep not yet heard
        message_to_play = play_to_voicemail
    elif status == "machine" and sub_state == "beep_start":
        # beep detected, the recording has started
        message_to_play = play_after_beep

    if not message_to_play:
        # other events (ringing, completed, beep_timeout, ...) need no new NCCO
        return ("", 200)

    ncco = [{"action": "talk", "text": message_to_play, "loop": 3}]
    print(f"response to send --> {ncco}")
    # jsonify serializes the NCCO list and sets the JSON content type
    return jsonify(ncco)

Place the call again using the curl command with the advanced_machine_detection option:

curl --location 'https://api-us.vonage.com/v1/calls' \
--header 'Authorization: Bearer YOUR_JWT' \
--header 'Content-Type: application/json' \
--data '{
    "to": [ { "type": "phone", "number": "YOUR_MOBILE_NUMBER" } ],
    "from": { "type": "phone", "number": "YOUR_VIRTUAL_NUMBER" },
    "advanced_machine_detection": { "behavior": "continue", "beep_timeout": "45" },
    "event_url": ["YOUR_CALLBACK_URL"],
    "ncco": [ { "action": "talk", "text": "This is a test call." } ]
}'

In the above example, the call will be started with a simple talk action:

{ "action": "talk", "text": "This is a test call." }

As the call progresses, the talk action will be replaced by another talk action, depending on the human/voicemail detection:

  • If the call is picked up by a human (i.e. status: human): { "action": "talk", "text": "I am talking to human.", "loop": 3 }

  • If the call goes to voicemail and the beep has not yet been detected (i.e. status: machine): { "action": "talk", "text": "I am talking to voicemail before the beep.", "loop": 3 }

  • If the call goes to voicemail and the beep is detected (i.e. status: machine and sub_state: beep_start): { "action": "talk", "text": "I am talking to voicemail after the beep.", "loop": 3 }
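One way to check this decision logic without placing a real call is to keep it in a plain function that the Flask route can call. The sketch below is an illustration under that assumption: ncco_for_event is a hypothetical name, and the messages are the ones used in this section.

```python
MESSAGES = {
    "human": "I am talking to human.",
    "before_beep": "I am talking to voicemail before the beep.",
    "after_beep": "I am talking to voicemail after the beep.",
}


def ncco_for_event(event):
    """Return the replacement NCCO (a list) for a callback event, or None."""
    status = event.get("status")
    sub_state = event.get("sub_state")
    if status == "human":
        text = MESSAGES["human"]
    elif status == "machine" and sub_state is None:
        text = MESSAGES["before_beep"]
    elif status == "machine" and sub_state == "beep_start":
        text = MESSAGES["after_beep"]
    else:
        return None  # no change to the call for other events
    return [{"action": "talk", "text": text, "loop": 3}]
```

Because the function takes a dict and returns a list, you can feed it the sample events from the logs and compare the result with the three NCCOs listed above.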

5. Wrapping Up with Real World Example

Finally, let’s apply this enhanced voice-calling capability to a real-world scenario. Imagine you are running a voice-calling campaign and only want to talk to real people. If a person picks up the call, it is connected to your phone number. If the call goes to voicemail, a message is left after the beep. To summarize the use case:

  1. If a person answers, connect the campaigner's phone to the call

  2. Otherwise, leave a voicemail message after the beep: “We tried to reach you; please contact us immediately.”

Here is the updated Flask code that will be able to process voicemail detection and generate the expected response:

from flask import Flask, jsonify, request

app = Flask(__name__)

# Replace the two placeholders below with your own numbers.
action_when_human = {
    "action": "connect",
    "from": "YOUR_VIRTUAL_NUMBER",
    "endpoint": [{"type": "phone", "number": "CALL_CAMPAIGNER_PHONE_NUMBER"}],
}
action_when_beep = {
    "action": "talk",
    "text": "We tried to reach you, please contact us immediately.",
}

@app.route("/", methods=["GET", "POST"])
def callback_listener():
    # silent=True returns None instead of raising when the body is not JSON
    req_json = request.get_json(silent=True)
    if not isinstance(req_json, dict) or "status" not in req_json:
        print(f"event received --> {request.data}")
        return ("", 200)

    status = req_json["status"]
    sub_state = req_json.get("sub_state")
    if status == "machine" and sub_state == "beep_start":
        # voicemail recording has started: leave the message
        ncco = [action_when_beep]
    elif status == "human":
        # a person answered: connect the campaigner
        ncco = [action_when_human]
    else:
        # nothing to change for other events
        return ("", 200)

    print(f"status received --> {status}" + (f":{sub_state}" if sub_state else ""))
    print(f"response to send --> {ncco}")
    # jsonify serializes the NCCO list and sets the JSON content type
    return jsonify(ncco)

In the code, replace YOUR_VIRTUAL_NUMBER with the virtual number obtained previously and CALL_CAMPAIGNER_PHONE_NUMBER with the number you wish to be connected to when a person picks up the call.
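Instead of editing the numbers into the source file, you could read them from environment variables. The sketch below assumes two hypothetical variable names, VONAGE_VIRTUAL_NUMBER and CAMPAIGNER_PHONE_NUMBER, which are not defined by Vonage or Flask; it builds the same connect action used in the code above.

```python
import os


def build_connect_action(env=os.environ):
    """Build the connect action from environment variables (names are assumptions)."""
    virtual_number = env.get("VONAGE_VIRTUAL_NUMBER")
    campaigner_number = env.get("CAMPAIGNER_PHONE_NUMBER")
    missing = [
        name
        for name, value in (
            ("VONAGE_VIRTUAL_NUMBER", virtual_number),
            ("CAMPAIGNER_PHONE_NUMBER", campaigner_number),
        )
        if not value
    ]
    if missing:
        # Fail early instead of sending an NCCO with an empty number.
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {
        "action": "connect",
        "from": virtual_number,
        "endpoint": [{"type": "phone", "number": campaigner_number}],
    }
```

Calling build_connect_action() once at startup means a missing number is reported when the server starts rather than in the middle of a call.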

Let’s test the code by placing a call using the curl command:

curl --location 'https://api-us.vonage.com/v1/calls' \
--header 'Authorization: Bearer YOUR_JWT' \
--header 'Content-Type: application/json' \
--data '{
    "to": [ { "type": "phone", "number": "YOUR_MOBILE_NUMBER" } ],
    "from": { "type": "phone", "number": "YOUR_VIRTUAL_NUMBER" },
    "advanced_machine_detection": { "behavior": "continue", "beep_timeout": "45" },
    "event_url": ["YOUR_CALLBACK_URL"],
    "ncco": [ { "action": "talk", "text": "Please wait while we connect you to an agent." } ]
}'

6. Conclusion

By following the steps in this tutorial, you’ve built an enhanced voice-calling capability that detects whether a call is picked up by a human, a voicemail greeting, or a voicemail with a beep. You’ve also used the Flask application to decide the course of action in the call dynamically. Using these simple code snippets, you can quickly drive call campaigns, make bulk calls, and deliver voice messages correctly without human interaction. Again, the source code for this application can be found on GitHub.

Additional concepts and documentation related to Machine Detection API can be found at Advanced Machine Detection.

Have questions or feedback about this tutorial? Share your thoughts with us on Twitter or our Vonage Community Slack channel, quoting this article for a quick response. You can also connect with me on Twitter. Good luck and happy coding!

Atique Khan

Atique is a computer graduate and proficient Python developer with a passion for exploring new technologies. With a strong background in programming and system engineering, he holds over 10 years of experience in automation, testing, and integration. His interests span single-board computers, software-defined radios, and continuous experimentation with generative AI tools.
