Swift

Capturing video frames

This example shows how to implement a custom video capturer that uses the device camera as the video source. To use the camera implementation, instantiate the camera capturer in your VonageVideoManager.

Initializing and configuring the video capturer

The initializer calls size(from:) to determine the capture resolution and creates a serial dispatch queue for capturing images, so that frame handling does not block the main (UI) queue.
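As an illustration of those two steps, here is a minimal sketch. The body of size(from:) is not shown in this guide, so the preset-to-dimensions mapping, the VGA fallback, and the queue label below are assumptions rather than the sample's actual code.

```swift
import AVFoundation

// Hypothetical mapping from a session preset to pixel dimensions.
// Presets not listed here fall back to VGA; the real sample may differ.
func size(from preset: AVCaptureSession.Preset) -> CGSize {
    switch preset {
    case .cif352x288: return CGSize(width: 352, height: 288)
    case .vga640x480: return CGSize(width: 640, height: 480)
    case .hd1280x720: return CGSize(width: 1280, height: 720)
    default:          return CGSize(width: 640, height: 480)
    }
}

// A DispatchQueue is serial unless created with the .concurrent attribute,
// so frames handled on this queue are processed one at a time, off the main queue.
// The label is a placeholder.
let captureQueue = DispatchQueue(label: "com.example.video-capture")
```

Keeping capture work on its own serial queue preserves frame order while leaving the main queue free for UI updates.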

The implementation of initCapture uses the AVFoundation framework to configure the camera for capture. It creates an AVCaptureSession, adds the camera as the input device, and configures an AVCaptureVideoDataOutput:

func initCapture() {
    let session = AVCaptureSession()
    session.beginConfiguration()
    
    // Apply the capture preset, which determines the output resolution
    session.sessionPreset = sessionPreset
    
    // Use the default camera as the input; return early if it is unavailable
    guard let videoDevice = AVCaptureDevice.default(for: .video),
          let deviceInput = try? AVCaptureDeviceInput(device: videoDevice) else { return }
    
    self.inputDevice = deviceInput
    if session.canAddInput(deviceInput) {
        session.addInput(deviceInput)
    }
    
    // Request NV12 frames (bi-planar YCbCr 4:2:0, video range) and drop late ones
    let outputDevice = AVCaptureVideoDataOutput()
    outputDevice.alwaysDiscardsLateVideoFrames = true
    outputDevice.videoSettings = [
        kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange)
    ]
    
    // Deliver sample buffers to this object on the serial capture queue
    outputDevice.setSampleBufferDelegate(self, queue: captureQueue)
    if session.canAddOutput(outputDevice) {
        session.addOutput(outputDevice)
    }
    
    // ... Frame rate configuration (see below)
}

Frames captured this way are delivered through the AVCaptureVideoDataOutputSampleBufferDelegate protocol, which the capturer adopts (it passes itself to setSampleBufferDelegate).

The second part of initCapture configures the frame rate.
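The frame rate code itself is not reproduced here. A common AVFoundation approach, sketched below, is to lock the device and set its minimum and maximum frame durations to the same value. The helper names frameDuration(for:) and setFrameRate(_:on:) are hypothetical, and the caller is assumed to have picked a rate that the device's active format supports.

```swift
import AVFoundation

// Hypothetical helper: the duration of one frame at the given rate (1/fps seconds).
func frameDuration(for fps: Int32) -> CMTime {
    return CMTime(value: 1, timescale: fps)
}

// Hypothetical helper: pins the device to a fixed frame rate.
// Assumption: `fps` lies within a range reported by
// device.activeFormat.videoSupportedFrameRateRanges. AVFoundation raises
// an exception if an unsupported duration is set.
func setFrameRate(_ fps: Int32, on device: AVCaptureDevice) {
    let duration = frameDuration(for: fps)
    do {
        try device.lockForConfiguration()
        device.activeVideoMinFrameDuration = duration
        device.activeVideoMaxFrameDuration = duration
        device.unlockForConfiguration()
    } catch {
        print("Could not lock \(device.localizedName) for configuration: \(error)")
    }
}
```

Setting both durations to the same value requests a constant frame rate. Leaving the maximum duration longer would let the device slow down in low light.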

The bestFrameRate(for:) method returns the best frame rate for the capturing device.
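What counts as "best" is not spelled out above. One plausible reading, sketched here, is the highest rate the device's active format supports, optionally capped at a desired rate. The `notExceeding` parameter and the fallback value of 0 are assumptions added for illustration.

```swift
import AVFoundation

// Sketch: returns the highest supported frame rate of the device's active
// format that does not exceed `limit`. Returns 0 if no range qualifies.
// `notExceeding` is a hypothetical parameter, not part of the original sample.
func bestFrameRate(for device: AVCaptureDevice,
                   notExceeding limit: Double = .infinity) -> Double {
    return device.activeFormat.videoSupportedFrameRateRanges
        .map { $0.maxFrameRate }   // upper bound of each supported range
        .filter { $0 <= limit }    // drop rates above the requested cap
        .max() ?? 0
}
```

For example, `bestFrameRate(for: videoDevice, notExceeding: 30)` would pick the fastest rate at or below 30 fps among the ranges the device reports.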

Capturing frames for the publisher's video

The start method starts the AVCaptureSession.
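A rough sketch of such a start method follows. The CaptureRunner class and its captureStarted flag are hypothetical stand-ins for the capturer, and the Int32 return value (0 for success) is assumed to match the SDK's capture protocol rather than taken from this guide.

```swift
import AVFoundation

// Hypothetical wrapper used only to illustrate starting a session.
final class CaptureRunner {
    let session = AVCaptureSession()
    let captureQueue = DispatchQueue(label: "com.example.video-capture")
    private(set) var captureStarted = false

    // Assumption: returns 0 on success, following the convention of the SDK's capture protocol.
    func start() -> Int32 {
        captureStarted = true
        // startRunning() blocks until the session is running,
        // so call it on the capture queue rather than the main queue.
        captureQueue.async { [session] in
            session.startRunning()
        }
        return 0
    }
}
```

Dispatching startRunning() to the capture queue keeps the UI responsive while the camera hardware spins up.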

The delegate method captureOutput(_:didOutput:from:) is called when a new video frame is available.

This method performs the following:

  1. Creates an OTVideoFrame instance.
  2. Allocates a memory buffer.
  3. Copies image data from the CVImageBuffer (NV12 format) into the allocated buffer. NV12 has two planes, a full-resolution Y (luma) plane and an interleaved CbCr (chroma) plane at half resolution, which are copied one after the other.
  4. Tags the frame with a timestamp and orientation.
  5. Calls consumeFrame, passing the frame to the Vonage SDK.
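
The plane copy in step 3 can be sketched with Core Video alone. This is an illustration under stated assumptions, not the sample's code: copyNV12Planes(from:) is a hypothetical helper, it packs the planes into a Swift array instead of the frame's own buffer, and the SDK-specific steps (creating the OTVideoFrame, tagging it, and calling consumeFrame) are left out.

```swift
import CoreVideo
import Foundation

// Hypothetical helper: copies the Y plane, then the CbCr plane, of an NV12
// pixel buffer into one contiguous byte array, dropping any per-row padding.
func copyNV12Planes(from pixelBuffer: CVPixelBuffer) -> [UInt8] {
    // The base address is only valid while the buffer is locked.
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    var packed = [UInt8]()
    for plane in 0..<CVPixelBufferGetPlaneCount(pixelBuffer) {
        guard let base = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, plane) else { continue }
        let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, plane)
        let rows = CVPixelBufferGetHeightOfPlane(pixelBuffer, plane)
        // Plane 0 (Y) stores 1 byte per sample; plane 1 stores interleaved Cb and Cr, 2 bytes per sample.
        let bytesPerSample = plane == 0 ? 1 : 2
        let rowBytes = CVPixelBufferGetWidthOfPlane(pixelBuffer, plane) * bytesPerSample

        // Copy row by row, because bytesPerRow may include padding beyond rowBytes.
        for row in 0..<rows {
            let source = base.advanced(by: row * bytesPerRow).assumingMemoryBound(to: UInt8.self)
            packed.append(contentsOf: UnsafeBufferPointer(start: source, count: rowBytes))
        }
    }
    return packed
}
```

For a frame of width w and height h (both even), the result holds w × h luma bytes followed by w × h / 2 chroma bytes.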