Build a Video Conference App With React-Native and Vonage Video API
Published on November 7, 2023

Video calling has become a must-have feature in many applications. However, building a native application from scratch and handling the different capabilities of Android and iOS can be a long and expensive process.

In this tutorial, we’ll show you how to build a video chat app in React Native. To build this app, we’ll be leveraging the OpenTok React Native library with the Vonage Video API.

Want to jump ahead? You can find the code for this tutorial on GitHub.

Prerequisites

To build your React Native video calling app, you’ll need the following:

How to Build a React Native Video Call App (5 Steps)

To build a video calling app with React Native and the Vonage API, follow the steps below:

  1. Set up Your Project

  2. Build the Video Chat Interface

  3. Add the Participant Video Streams

  4. Build the Video Call Toolbar

  5. Optimize the App for Multiple Streams

1. Set up Your Project

To get started, we’ll build the framework for the video chat app in React Native.

First, clone the following repo: https://github.com/enricop89/multiparty-video-react-native.git

Next, install the required node modules: npm install

For iOS, install the Podfile's dependencies: cd ios/ && pod install

The project structure is composed of the following main files:

  1. Android and iOS folders: where the native code is compiled and executed.

  2. App.js: the main file, where the React state, props, event handlers and methods are defined.

  3. config.js: where the credentials are defined.
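
As a rough sketch, config.js could hold an object with the three fields that App.js reads (API_KEY, SESSION_ID and TOKEN). The empty placeholder values and the missingCredentials helper are assumptions added for illustration; they are not part of the repo.

```javascript
// Hypothetical shape of the credentials object: the field names match what
// App.js reads (credentials.API_KEY, credentials.SESSION_ID, credentials.TOKEN).
const credentials = {
  API_KEY: '',    // your Vonage Video API key
  SESSION_ID: '', // the session to join
  TOKEN: '',      // a token generated for that session
};

// Illustrative helper (not in the repo): lists the fields that are still
// empty, so the app can warn early instead of failing to connect.
function missingCredentials(config) {
  return ['API_KEY', 'SESSION_ID', 'TOKEN'].filter(
    (field) => typeof config[field] !== 'string' || config[field].trim() === ''
  );
}
```

Calling missingCredentials(credentials) before rendering the session is one way to surface an unfilled config file during development.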

The App component is the landing view shown when the app launches. Its render function conditionally shows either the join screen or the video call.

class App extends Component {
  constructor(props) {
    super(props);
    this.apiKey = credentials.API_KEY;
    this.sessionId = credentials.SESSION_ID;
    this.token = credentials.TOKEN;
    this.state = {
      subscriberIds: [], // Array for storing subscribers
      localPublishAudio: true, // Local Audio state
      localPublishVideo: true, // Local Video state
      joinCall: false, // State variable used to start the call
      streamProperties: {}, // Handle individual stream properties,
      mainSubscriberStreamId: null
    }; 
  }
  
  joinCall = () => {
    const { joinCall } = this.state;
    if (!joinCall) {
      this.setState({ joinCall: true });
    }
  };
  
  endCall = () => {
    const { joinCall } = this.state;
    if (joinCall) {
      this.setState({ joinCall: !joinCall });
    }
  };
  
  joinVideoCall = () => {
    return (
      <View style={styles.fullView}>
        {/* React Native's Button takes its label from the title prop */}
        <Button
          onPress={this.joinCall}
          title="Join Call"
          color="#841584"
          accessibilityLabel="Join call"
        />
      </View>
    );
  };
  
  render() {
    return this.state.joinCall ? this.videoPartyView() : this.joinVideoCall();
  }
}

The joinCall property in the React state determines which view is rendered. When the app launches, a simple View with a "Join Call" button is displayed to the user. Pressing the button sets joinCall to true, which shows the more complex Video Call View.

2. Build the Video Chat Interface

Now that we’ve built a framework for our project, we’ll build the interface for the app itself. We’ll refer to this interface as the Video Call View.

Before digging into the Video Call View, it's worth spending some time exploring the opentok-react-native library.

The library is composed of three main components: OTSession, OTPublisher and OTSubscriber. Each of them will interact with the native layer (iOS and Android), calling the native methods to connect, publish and subscribe.

We will also need to listen to the events fired by those components, especially the session events, such as sessionConnected, sessionDisconnected, streamCreated and streamDestroyed (Documentation: Session Events).
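
As a minimal sketch of what such an event handlers object can look like, the function below builds handlers for the four session events named above. The log callback and the function name are hypothetical; the only payload field used is event.streamId, which the handlers later in this tutorial also read.

```javascript
// Builds an object keyed by session event name, in the shape passed to
// OTSession as eventHandlers. `log` is a hypothetical callback standing in
// for whatever the app does with each event.
function createSessionEventHandlers(log) {
  return {
    sessionConnected: () => log('sessionConnected'),
    sessionDisconnected: () => log('sessionDisconnected'),
    // Stream events carry the id of the stream that was created or destroyed.
    streamCreated: (event) => log(`streamCreated ${event.streamId}`),
    streamDestroyed: (event) => log(`streamDestroyed ${event.streamId}`),
  };
}
```

Passing console.log as the callback is a quick way to watch the order in which events arrive while testing with two devices.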

The Video Call View is composed of the following components:

  • Publisher. The publisher view displays the primary user’s video stream.

  • Subscribers. The subscriber views display the video streams of other participants.

  • Toolbar. The toolbar contains buttons for microphone access, camera access, and ending the call.

An example of the multiparty video view being built in this tutorial

3. Add the Participant Video Streams

Next, we’ll add video streams from chat participants (the subscribers and the publisher).

To add video streams to the interface, we need to keep track of the subscribers' streams, the primary subscriber, and the local microphone and camera publishing state. The perfect place to store this information is the React State.

constructor(props) {
    super(props);
    this.apiKey = credentials.API_KEY;
    this.sessionId = credentials.SESSION_ID;
    this.token = credentials.TOKEN;
    this.state = {
      subscriberIds: [], // Array for storing subscribers
      localPublishAudio: true, // Local Audio state
      localPublishVideo: true, // Local Video state
      joinCall: false, // State variable used to start the call
      streamProperties: {}, // Handle individual stream properties,
      mainSubscriberStreamId: null
    }; 
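    // The session event handlers are defined inside the constructor
    this.sessionEventHandlers = {
      streamCreated: (event) => {
        // Subscribe to the audio and video of every new stream by default
        this.setState((prevState) => ({
          streamProperties: {
            ...prevState.streamProperties,
            [event.streamId]: {
              subscribeToAudio: true,
              subscribeToVideo: true,
            },
          },
          subscriberIds: [...prevState.subscriberIds, event.streamId],
        }));
      },
      streamDestroyed: (event) => {
        this.setState((prevState) => {
          if (!prevState.subscriberIds.includes(event.streamId)) {
            return null; // Unknown stream: nothing to update
          }
          // Copy before deleting so the previous state is never mutated
          const streamProperties = { ...prevState.streamProperties };
          delete streamProperties[event.streamId];
          return {
            streamProperties,
            subscriberIds: prevState.subscriberIds.filter((id) => id !== event.streamId),
            // Forget the main subscriber if its stream is the one that ended
            mainSubscriberStreamId:
              prevState.mainSubscriberStreamId === event.streamId
                ? null
                : prevState.mainSubscriberStreamId,
          };
        });
      },
    };
  }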

The subscriberIds array stores the subscribers within a session. Each time we receive a streamCreated event, it means that someone has joined the session and published a stream, so we need to add their streamId to the subscriberIds array.

On the other hand, when we receive the streamDestroyed event, we need to remove the streamId from the subscribers' array.
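
The add and remove steps can also be written as pure functions that take the current state and return the next one, which makes them easy to test outside React. This is a sketch only: addStream and removeStream are hypothetical names, and the state shape is the one from the constructor above.

```javascript
// Returns a new state with the stream added and subscribed to audio and video.
function addStream(state, streamId) {
  if (state.subscriberIds.includes(streamId)) return state; // ignore duplicates
  return {
    ...state,
    subscriberIds: [...state.subscriberIds, streamId],
    streamProperties: {
      ...state.streamProperties,
      [streamId]: { subscribeToAudio: true, subscribeToVideo: true },
    },
  };
}

// Returns a new state without the stream and without its stream properties.
function removeStream(state, streamId) {
  if (!state.subscriberIds.includes(streamId)) return state; // nothing to remove
  // Pull the removed entry out and keep the rest, leaving the input untouched.
  const { [streamId]: removed, ...streamProperties } = state.streamProperties;
  return {
    ...state,
    subscriberIds: state.subscriberIds.filter((id) => id !== streamId),
    streamProperties,
  };
}
```

Inside the component, these could be wired in as this.setState((prev) => addStream(prev, event.streamId)), so that each update is computed from the latest state rather than from a possibly stale this.state.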

<View style={styles.fullView}>
      <OTSession
        apiKey={this.apiKey}
        sessionId={this.sessionId}
        token={this.token}
        eventHandlers={this.sessionEventHandlers}>
        <OTPublisher
          properties={this.publisherProperties}
          eventHandlers={this.publisherEventHandlers}
          style={styles.publisherStyle}
        />
        <OTSubscriber
          style={{ height: dimensions.height, width: dimensions.width }}
          eventHandlers={this.subscriberEventHandlers}
          streamProperties={this.state.streamProperties}>
          {this.renderSubscribers}
        </OTSubscriber>
      </OTSession>
    </View>

In the video view's render function, we add OTSession, OTPublisher and OTSubscriber from the opentok-react-native library. On OTSession, we pass the credentials and the sessionEventHandlers object as props.

4. Build the Video Call Toolbar

Now that the video streams are working, we’ll enable the features on our video call toolbar.

The OTPublisher component will initialize a publisher and publish to the specified session upon mounting. It's possible to specify different properties, such as camera position, resolution, and others (Publisher options list: Publisher Options). In this example App, we will only set this.publisherProperties = { cameraPosition: 'front'};.

Ensure you have enabled both camera and microphone usage by adding the following entries to your Info.plist file (iOS Project):

<key>NSCameraUsageDescription</key>
<string>Your message to user when the camera is accessed for the first time</string>
<key>NSMicrophoneUsageDescription</key>
<string>Your message to user when the microphone is accessed for the first time</string>

For Android, add the following to your AndroidManifest.xml file (newer versions of Android, from API Level 23 (Android 6.0), have a different permissions model that is already handled by this library):

<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-feature android:name="android.hardware.camera" android:required="true" />
<uses-feature android:name="android.hardware.camera.autofocus" android:required="false" />
<uses-feature android:name="android.hardware.microphone" android:required="true" />

The OTPublisher component also has a properties prop, which passes the publisher properties to the native instance. By updating this.publisherProperties and then calling setState to re-render, we can push changes to the Publisher instance. We use this approach to implement the Toolbar's mute/unmute functions for the microphone and camera. The implementation is straightforward: each function toggles publishAudio or publishVideo on this.publisherProperties, and updates localPublishAudio or localPublishVideo in the state so that the button icon reflects the current value.

The End Call button has a very similar approach. The endCall function toggles the joinCall value on the State and resets the View to the initial one.

toggleAudio = () => {
    let publishAudio = this.state.localPublishAudio;
    this.publisherProperties = { ...this.publisherProperties, publishAudio: !publishAudio };
    this.setState({
      localPublishAudio: !publishAudio,
    });
  };

  toggleVideo = () => {
    let publishVideo = this.state.localPublishVideo;
    this.publisherProperties = { ...this.publisherProperties, publishVideo: !publishVideo };
    this.setState({
      localPublishVideo: !publishVideo,
    });
  };

  endCall = () => {
    const { joinCall } = this.state;
    if (joinCall) {
      this.setState({ joinCall: !joinCall });
    }
  };

<View style={styles.buttonView}>
  <Icon.Button
    style={styles.iconStyle}
    backgroundColor="#131415"
    name={this.state.localPublishAudio ? 'mic' : 'mic-off'}
    onPress={this.toggleAudio}
  />
  <Icon.Button
    style={styles.iconStyle}
    backgroundColor="#131415"
    name="call-end"
    onPress={this.endCall}
  />
  <Icon.Button
    style={styles.iconStyle}
    backgroundColor="#131415"
    name={this.state.localPublishVideo ? 'videocam' : 'videocam-off'}
    onPress={this.toggleVideo}
  />
</View>
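
Since the audio and video toggles differ only in the property they flip, they could share one helper. The sketch below is an assumption about how that might look, not code from the repo: togglePublish and toolbarIcons are hypothetical names, while the property and icon names are the ones used in this tutorial.

```javascript
// Flips publishAudio or publishVideo and the matching local state flag.
// `kind` is 'Audio' or 'Video'. Returns new objects; inputs are not mutated.
function togglePublish(publisherProperties, localState, kind) {
  const stateKey = `localPublish${kind}`; // e.g. localPublishAudio
  const propKey = `publish${kind}`;       // e.g. publishAudio
  const enabled = !localState[stateKey];
  return {
    publisherProperties: { ...publisherProperties, [propKey]: enabled },
    localState: { ...localState, [stateKey]: enabled },
  };
}

// Maps the local publishing state to the icon names shown on the toolbar.
function toolbarIcons(localState) {
  return {
    audio: localState.localPublishAudio ? 'mic' : 'mic-off',
    video: localState.localPublishVideo ? 'videocam' : 'videocam-off',
  };
}
```

For example, muting the microphone from the initial state turns the audio icon from 'mic' to 'mic-off' and leaves the video icon as 'videocam'.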

5. Optimize the App for Multiple Streams

If you have reached this point, you have implemented the Join Call View, the Session and Publisher components, and the Toolbar.

Next, we will define the View for the different possible number of subscribers. After that, we’ll add features to help optimize video performance in our React Native app.

If there are no subscribers, we display a simple informative text.

If there is only one subscriber, we display their stream in full-screen mode.

Finally, if there is more than one subscriber, we show the primary subscriber in the big View (as shown in the mock-up) and the others in a ScrollView component, which can hold a varying number of subscribers.

As the number could grow and challenge our device CPU and network bandwidth, we will implement optimizations on each of the subscribers, such as lowering the resolution and disabling the video for the subscribers that are not visible.

Let's explore the OTSubscriber component to handle the cases described above. First of all, as we want to have control over each subscriber, we need to implement a render function for the subscribers (see custom rendering of streams in the library documentation).

renderSubscribers = (subscribers) => {
    if (this.state.mainSubscriberStreamId) {
      subscribers = subscribers.filter(sub => sub !== this.state.mainSubscriberStreamId);
      subscribers.unshift(this.state.mainSubscriberStreamId);
    }
    return subscribers.length > 1 ? (
      <>
        <View style={styles.mainSubscriberStyle}>
          <TouchableOpacity
            onPress={() => this.handleSubscriberSelection(subscribers, subscribers[0])}
            key={subscribers[0]}>
            <OTSubscriberView streamId={subscribers[0]} style={{ width: '100%', height: '100%' }} />
          </TouchableOpacity>
        </View>

        <View style={styles.secondarySubscribers}>
          <ScrollView
            horizontal={true}
            decelerationRate={0}
            snapToInterval={dimensions.width / 2}
            snapToAlignment={'center'}
            onScrollEndDrag={(e) => this.handleScrollEnd(e, subscribers.slice(1))}
            style={{
              width: dimensions.width,
              height: dimensions.height / 4,
            }}>
            {subscribers.slice(1).map((streamId) => (
              <TouchableOpacity
                onPress={() => this.handleSubscriberSelection(subscribers, streamId)}
                style={{
                  width: dimensions.width / 2,
                  height: dimensions.height / 4,
                }}
                key={streamId}
              >
                <OTSubscriberView style={{ width: '100%', height: '100%' }} key={streamId} streamId={streamId} />
              </TouchableOpacity>
            ))}
          </ScrollView>
        </View>
      </>
    ) : subscribers.length > 0 ? (
      <TouchableOpacity style={styles.fullView}>
        <OTSubscriberView streamId={subscribers[0]} key={subscribers[0]} style={{ width: '100%', height: '100%' }} />
      </TouchableOpacity>
    ) : (<Text>No one connected</Text>)
  };

We use the conditional rendering in React (https://reactjs.org/docs/conditional-rendering.html) to handle the different cases with zero, one or N subscribers.

Firstly, if there are no subscribers, we fall into the last case, and we display a Text component.

Secondly, if there is one subscriber, we display the subscriber in a full view mode.

Lastly, the most interesting case is when the subscribers are more than one: We have a main subscriber view and a ScrollView component in which we will feed the other subscribers. The first step is to check if we have a mainSubscriberStreamId. If so, we will sort the array to have the primary subscriber as the first element. The remaining subscribers will be displayed in the ScrollView horizontally. The ScrollView component is ideal for our use case, as we can show a relatively high number of subscribers without the need to change the layout, and we can detect how many subscribers are in the scroll view and how many of them are visible.
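
The ordering and the choice between the three cases can be sketched as two small pure functions. The names orderSubscribers and layoutFor, and the layout labels, are hypothetical. Unlike the render function in this tutorial, this sketch ignores a main stream id that is no longer in the list, which is an extra safety check of our own.

```javascript
// Puts the main subscriber first and keeps the others in their original order.
// If the main id is unset or no longer present, the order is left unchanged.
function orderSubscribers(streamIds, mainStreamId) {
  if (!mainStreamId || !streamIds.includes(mainStreamId)) return [...streamIds];
  return [mainStreamId, ...streamIds.filter((id) => id !== mainStreamId)];
}

// Picks one of the three layouts described above from the subscriber count.
function layoutFor(streamIds) {
  if (streamIds.length === 0) return 'empty';       // "No one connected" text
  if (streamIds.length === 1) return 'fullscreen';  // single full-view subscriber
  return 'main-plus-thumbnails';                    // main view plus ScrollView
}
```

With streams a, b and c and c selected as the main subscriber, the ordered list is c, a, b: c goes to the main view, and a and b go to the ScrollView.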

Group calls on mobile devices can be very challenging, from both a hardware and a network point of view. To deliver a good result to the end user, an app should follow best practices for the different use cases and layouts. In our case, we have a main subscriber view, which needs the best resolution possible, and a ScrollView component with the remaining subscribers in smaller thumbnails, which can be optimized by lowering the received resolution. The OpenTok SDKs let the developer set the preferred resolution and frame rate for each subscriber (setPreferredFrameRate and setPreferredResolution).

We implement the handleSubscriberSelection method to set the main subscriber and the preferred resolutions. It is attached as the onPress handler of the TouchableOpacity component that wraps each subscriber.

const mainSubscribersResolution = { width: 1280, height: 720 };
const secondarySubscribersResolution = { width: 352, height: 288 };

  handleSubscriberSelection = (subscribers, streamId) => {
    let subscriberToSwap = subscribers.indexOf(streamId);
    let currentSubscribers = subscribers;
    let temp = currentSubscribers[subscriberToSwap];
    currentSubscribers[subscriberToSwap] = currentSubscribers[0];
    currentSubscribers[0] = temp;
    this.setState(prevState => {
      const newStreamProps = { ...prevState.streamProperties };
      for (let i = 0; i < currentSubscribers.length; i += 1) {
        if (i === 0) {
          newStreamProps[currentSubscribers[i]] = { ...prevState.streamProperties[currentSubscribers[i]] }
          newStreamProps[currentSubscribers[i]].preferredResolution = mainSubscribersResolution;
        } else {
          newStreamProps[currentSubscribers[i]] = { ...prevState.streamProperties[currentSubscribers[i]] }
          newStreamProps[currentSubscribers[i]].preferredResolution = secondarySubscribersResolution;
        }
      }
      return { mainSubscriberStreamId: streamId, streamProperties: newStreamProps };
    })
  }

Based on the subscriber selected, the function moves the selected subscriber to the head of the subscribers' array. As mentioned before, the first element on the subscriber array will be displayed in the main View. After that, we need to update the streamProperties of the OTSubscriber component to set the different preferred resolution. We set the maximum resolution (width: 1280, height: 720) for the primary subscriber and a lower resolution for the others ({ width: 352, height: 288 }). If we also want to change the preferred frame rate, based on the layout or use case, we would only need to add the preferredFrameRate property on the streamProperties object.
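
To illustrate the frame rate variant, the sketch below assigns both preferredResolution and preferredFrameRate by position in the ordered subscriber list. The resolutions are the ones from this tutorial; the frame rate values (30 and 15) and the names MAIN_QUALITY, THUMBNAIL_QUALITY and withPreferredQuality are assumptions chosen for illustration.

```javascript
// Quality for the main view: resolution from the tutorial, frame rate assumed.
const MAIN_QUALITY = {
  preferredResolution: { width: 1280, height: 720 },
  preferredFrameRate: 30,
};

// Quality for thumbnails: resolution from the tutorial, frame rate assumed.
const THUMBNAIL_QUALITY = {
  preferredResolution: { width: 352, height: 288 },
  preferredFrameRate: 15,
};

// Returns new stream properties where the first id gets the main quality and
// every other id gets the thumbnail quality. Existing flags are preserved.
function withPreferredQuality(streamProperties, orderedStreamIds) {
  const next = { ...streamProperties };
  orderedStreamIds.forEach((streamId, index) => {
    const quality = index === 0 ? MAIN_QUALITY : THUMBNAIL_QUALITY;
    next[streamId] = { ...streamProperties[streamId], ...quality };
  });
  return next;
}
```

The result can be returned from a setState updater as the new streamProperties, in the same way handleSubscriberSelection does for the resolution alone.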

Finally, we want to optimize the ScrollView component. The ScrollView component could have a high number of subscribers, but can only show two simultaneously. For example, if we have five subscribers, one will be on the main subscriber view; the remaining four will be on the ScrollView. Only two of them are visible in the View, and the remaining ones will be visible only if we scroll horizontally.

The ScrollView component has an event listener called onScrollEndDrag, which is called when the user stops dragging the scroll view and it either stops or begins to glide. We can use this event to understand which subscribers are visible and mute the video of the remaining ones. Muting the video of the non-visible stream will improve the performance of the App, and save CPU consumption and network bandwidth.

handleScrollEnd = (event, subscribers) => {
    let firstVisibleIndex;
    if (event && event.nativeEvent && !isNaN(event.nativeEvent.contentOffset.x)) {
      firstVisibleIndex = parseInt(event.nativeEvent.contentOffset.x / (dimensions.width / 2), 10);
    }
    this.setState(prevState => {
      const newStreamProps = { ...prevState.streamProperties };
      if (firstVisibleIndex !== undefined && !isNaN(firstVisibleIndex)) {
        for (let i = 0; i < subscribers.length; i += 1) {
          if (i === firstVisibleIndex || i === (firstVisibleIndex + 1)) {
            newStreamProps[subscribers[i]] = { ...prevState.streamProperties[subscribers[i]] }
            newStreamProps[subscribers[i]].subscribeToVideo = true;
          } else {
            newStreamProps[subscribers[i]] = { ...prevState.streamProperties[subscribers[i]] }
            newStreamProps[subscribers[i]].subscribeToVideo = false;
          }
        }
      }
      return { streamProperties: newStreamProps }
    })
  }

On the onScrollEndDrag event, we have the information about the contentOffset coordinates, which is the point at which the origin of the content view is offset from the origin of the scroll view. We will use this value to understand which streams are currently visible, dividing the content offset by half of the width of the screen (event.nativeEvent.contentOffset.x / (dimensions.width / 2)).

The integer part of the result is the index of the first visible subscriber. At this point, we know that the visible streams are the streams in positions firstVisibleIndex and firstVisibleIndex + 1. The last step is to loop over the subscribers' array and mute the video of the non-visible subscribers.
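
As a worked example of this arithmetic: on a screen 400 points wide, each thumbnail is 200 points wide, so a contentOffset.x of 400 gives 400 / 200 = 2, and the thumbnails at indexes 2 and 3 are the visible ones. The sketch below applies the same calculation as a pure function. The function name is hypothetical, and clamping negative offsets to zero is an extra guard of our own.

```javascript
// Returns new stream properties where only the two thumbnails currently in
// view keep subscribeToVideo enabled. thumbnailIds excludes the main subscriber.
function withVisibleVideoOnly(streamProperties, thumbnailIds, contentOffsetX, screenWidth) {
  const thumbnailWidth = screenWidth / 2; // two thumbnails fit on screen
  // Clamp negative offsets (an assumption for overscroll) before dividing.
  const firstVisibleIndex = Math.floor(Math.max(0, contentOffsetX) / thumbnailWidth);
  const next = { ...streamProperties };
  thumbnailIds.forEach((streamId, index) => {
    const visible = index === firstVisibleIndex || index === firstVisibleIndex + 1;
    next[streamId] = { ...streamProperties[streamId], subscribeToVideo: visible };
  });
  return next;
}
```

Audio is left untouched, so participants who are scrolled out of view can still be heard.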

Conclusion

Congratulations! You’ve just finished building your first video calling app in React Native.

This tutorial shows just how easy it is to add video chat to React Native apps with the help of the Vonage Video API. As video calling becomes more and more popular, video chat apps like this one offer more and more value to brands and businesses.

If you have any questions or comments, please reach out to us on Twitter or join our Community Slack Channel. You can access the code for this tutorial on GitHub.

Additional Reading

How to Make a Video Chat App With Python and Flask

How to Make a Video Chat App With ASP.NET and Angular

How to Add Video Chat to Next.js Apps

How to Add Video Chat to Firebase Apps

How to Make Phone Calls in Android Apps With React Native

Enrico Portolan, Guest Author

Enrico is a former Vonage team member. He worked as a Solutions Engineer, helping the sales team with his technical expertise. He is passionate about the cloud, startups, and new technologies. He is the Co-Founder of a WebRTC Startup in Italy. Out of work, he likes to travel and taste as many weird foods as possible.
