
The DepthAI SDK is a powerful tool for building computer vision applications using Luxonis devices. This quickstart guide will help you get started with the SDK.


Installation

DepthAI SDK is available on PyPI. You can install it with the following command:

# Linux and macOS
python3 -m pip install depthai-sdk

# Windows
py -m pip install depthai-sdk
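
After installing, it can be useful to confirm that the package is importable before connecting a device. The following is a minimal sketch that uses only the Python standard library. The import name depthai_sdk is taken from the examples in this guide, and the helper name is_installed is made up for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name is on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("depthai_sdk"):
        print("depthai_sdk is importable")
    else:
        print("depthai_sdk not found; re-run the pip command for your platform")
```

The check does not import the package itself, so it runs without an OAK device attached.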

Working with camera

The OakCamera class is a fundamental part of the DepthAI SDK, providing a high-level interface for accessing the features of the OAK device. This class simplifies the creation of pipelines that capture video from the OAK camera, run neural networks on the video stream, and visualize the results.

With OakCamera, you can easily create color and depth streams using the create_camera() and create_stereo() methods respectively, and add pre-trained neural networks using the create_nn() method. Additionally, you can add custom callbacks to the pipeline using the callback() method and record the outputs using the record() method.

Blocking behavior

When you start an OakCamera object, the blocking argument of start() controls whether the call blocks the main thread. By default, start() does not block, so you need to poll the camera manually using the oak.poll() method.

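from depthai_sdk import OakCamera

with OakCamera() as oak:
    color = oak.create_camera('color', resolution='1080p')
    oak.visualize([color])
    oak.start(blocking=False)  # returns immediately; polling is up to the caller

    while oak.running():
        oak.poll()
        # this code is executed while the pipeline is running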

Alternatively, setting the blocking argument to True will loop and continuously poll the camera, blocking the rest of the code.

from depthai_sdk import OakCamera

with OakCamera() as oak:
    color = oak.create_camera('color', resolution='1080p')
    oak.visualize([color])
    oak.start(blocking=True)  # loops and polls the camera internally
    # this code doesn't execute until the pipeline is stopped
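
In non-blocking mode the polling loop belongs to your code, so you can bound it however you like. The sketch below shows one possible pattern: polling for a fixed amount of time. It relies only on the running() and poll() methods mentioned above. The helper name poll_for is hypothetical and is not part of the SDK.

```python
import time


def poll_for(oak, seconds: float) -> int:
    """Poll `oak` until it stops running or `seconds` have elapsed.

    `oak` only needs to expose running() and poll(), as OakCamera does
    in the examples above. Returns the number of poll() calls made.
    """
    deadline = time.monotonic() + seconds
    polls = 0
    while oak.running() and time.monotonic() < deadline:
        oak.poll()
        polls += 1
    return polls


# Usage sketch (needs a connected OAK device):
#
# from depthai_sdk import OakCamera
#
# with OakCamera() as oak:
#     color = oak.create_camera('color', resolution='1080p')
#     oak.visualize([color])
#     oak.start(blocking=False)
#     poll_for(oak, seconds=10)  # hypothetical helper defined above
```

Because the helper takes the camera object as a parameter, it can be exercised with a simple stand-in object when no device is available.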

Creating color and depth streams

To create a color stream we can use the OakCamera.create_camera() method. This method takes the name of the sensor as an argument and returns a CameraComponent object.

The full list of supported sensor names: 'color'; 'left'; 'right'; 'cam_{socket},color'; 'cam_{socket},mono', where {socket} is a letter from A to H representing the socket on the OAK device. Custom socket names are usually used for FFC devices.
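
The socket-based names follow a fixed pattern, so they can be built and checked programmatically. The helper below is a hypothetical illustration, not an SDK function. It uses an uppercase socket letter because the description above says "A to H"; whether the SDK also accepts other casings is not stated here.

```python
VALID_SOCKETS = "ABCDEFGH"
VALID_KINDS = ("color", "mono")


def socket_sensor_name(socket: str, kind: str) -> str:
    """Build a name of the form 'cam_{socket},{kind}' after validating both parts."""
    socket = socket.upper()
    if len(socket) != 1 or socket not in VALID_SOCKETS:
        raise ValueError(f"socket must be a single letter from A to H, got {socket!r}")
    if kind not in VALID_KINDS:
        raise ValueError(f"kind must be one of {VALID_KINDS}, got {kind!r}")
    return f"cam_{socket},{kind}"


print(socket_sensor_name("c", "mono"))
```

The resulting string would be passed as the first argument to create_camera() in place of 'color'.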

To visualize the stream, we can use the OakCamera.visualize() method. This method takes a list of outputs and displays them. Each component has its own outputs, which can be found in the Components section.

Here is an example which creates color and depth streams and visualizes both:

from depthai_sdk import OakCamera

with OakCamera() as oak:
    color = oak.create_camera('color', resolution='1080p')
    stereo = oak.create_stereo(resolution='800p')  # works with stereo devices only!
    oak.visualize([color, stereo])
    oak.start(blocking=True)

Creating YOLO neural network for object detection

DepthAI SDK provides a number of pre-trained neural networks that can be used for object detection, pose estimation, semantic segmentation, and other tasks. To create a neural network, we can use the OakCamera.create_nn() method and pass the name of the neural network as an argument.

Like OakCamera.create_camera(), the OakCamera.create_nn() method returns a component object, in this case an NNComponent.

Here is an example which creates a YOLO neural network for object detection and visualizes the results:

from depthai_sdk import OakCamera

with OakCamera() as oak:
    color = oak.create_camera('color', resolution='1080p')
    # List of models that are supported out-of-the-box by the SDK:
    # https://docs.luxonis.com/projects/sdk/en/latest/features/ai_models/#sdk-supported-models
    yolo = oak.create_nn('yolov6n_coco_640x640', input=color)

    oak.visualize([color, yolo])
    oak.start(blocking=True)

Adding custom callbacks

Callbacks are functions that are called when a new frame is available from the camera or neural network. OakCamera provides a mechanism for adding custom callbacks to the pipeline using the OakCamera.callback() method.

Here is an example which creates a YOLO neural network for object detection and prints the number of detected objects:

from depthai_sdk import OakCamera

def print_num_objects(packet):
    print(f'Number of objects detected: {len(packet.detections)}')

with OakCamera() as oak:
    color = oak.create_camera('color', resolution='1080p')
    yolo = oak.create_nn('yolov6n_coco_640x640', input=color)

    oak.callback(yolo, callback=print_num_objects)
    oak.start(blocking=True)
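
A callback does not have to be a plain function. When you want to keep state between frames, a callable object is one option. The sketch below accumulates detection counts and relies only on packet.detections, as in the example above. The class name DetectionCounter is hypothetical, and the stand-in packets exist only so the sketch runs without a device.

```python
from types import SimpleNamespace


class DetectionCounter:
    """Callable that accumulates detection counts across packets."""

    def __init__(self):
        self.frames = 0
        self.total = 0

    def __call__(self, packet):
        # Uses only packet.detections, the attribute shown in the example above.
        self.frames += 1
        self.total += len(packet.detections)

    @property
    def mean_per_frame(self) -> float:
        """Average number of detections per packet seen so far."""
        return self.total / self.frames if self.frames else 0.0


# Stand-in packets so the sketch runs without an OAK device.
counter = DetectionCounter()
for fake_packet in (SimpleNamespace(detections=["a", "b"]), SimpleNamespace(detections=[])):
    counter(fake_packet)
print(counter.frames, counter.total, counter.mean_per_frame)
```

With a device attached, the same object would presumably be registered with oak.callback(yolo, callback=counter), assuming the SDK accepts any callable there.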


Recording

DepthAI SDK provides a simple API for recording outputs. The OakCamera.record() method takes a list of outputs and a path where the recordings are saved. Here is an example which creates a YOLO neural network for object detection and records the results:

from depthai_sdk import OakCamera
from depthai_sdk.record import RecordType

with OakCamera() as oak:
    color = oak.create_camera('color', resolution='1080p')
    yolo = oak.create_nn('yolov6n_coco_640x640', input=color)

    oak.record([color, yolo], path='./records', record_type=RecordType.VIDEO)
    oak.start(blocking=True)

There are several formats supported by the SDK for recording the outputs:

  1. depthai_sdk.record.RecordType.VIDEO - record video files.

  2. depthai_sdk.record.RecordType.MCAP - record MCAP files.

  3. depthai_sdk.record.RecordType.BAG - record ROS bag files.

You can find more information about recording in the Recording section.

Output syncing

Sometimes multiple outputs need to be synchronized, for example when recording the color stream and the neural network output at the same time. For this, use the OakCamera.sync() method. It takes a list of outputs and passes the synchronized packets to the specified callback function. Here is an example which synchronizes the color stream and the YOLO neural network output:

from depthai_sdk import OakCamera

def callback(synced_packets):
    # Receives the synchronized outputs together; printing is a placeholder
    print(synced_packets)

with OakCamera() as oak:
    color = oak.create_camera('color', resolution='1080p')
    yolo = oak.create_nn('yolov6n_coco_640x640', input=color)

    oak.sync([color.out.main, yolo.out.main], callback=callback)
    oak.start(blocking=True)
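
To build intuition for what synchronization means, the following toy model pairs packets from different streams by sequence number and invokes a callback once every stream has delivered a packet for the same number. It is a conceptual sketch only: the class SequenceSync, its push() method, and the assumption that packets are matched by sequence number are illustrative and do not describe how OakCamera.sync() is implemented.

```python
from collections import defaultdict


class SequenceSync:
    """Toy synchronizer that groups payloads from several streams by sequence number."""

    def __init__(self, stream_names, callback):
        self.stream_names = set(stream_names)
        self.callback = callback
        # Maps sequence number -> {stream name: payload} for incomplete groups.
        self.pending = defaultdict(dict)

    def push(self, stream, seq, payload):
        """Store a payload; fire the callback when the group for `seq` is complete."""
        group = self.pending[seq]
        group[stream] = payload
        if set(group) == self.stream_names:
            self.callback(self.pending.pop(seq))


synced = []
sync = SequenceSync(["color", "yolo"], synced.append)
sync.push("color", 0, "frame-0")
sync.push("yolo", 1, "dets-1")   # no match yet for sequence 1
sync.push("yolo", 0, "dets-0")   # completes the group for sequence 0
print(synced)
```

A real synchronizer would also need to discard groups that never complete, for example when a frame is dropped on one stream; this sketch keeps them in pending indefinitely.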

Encoded streams

Luxonis devices support on-device encoding of outputs to the H.264, H.265 and MJPEG formats. To enable encoding, pass the encode argument to the OakCamera.create_camera() or OakCamera.create_stereo() method. Possible values for the encode argument are 'h264', 'h265' and 'mjpeg'.

Each component has its own encoded output:

  • CameraComponent.Out.encoded

  • StereoComponent.Out.encoded

  • NNComponent.Out.encoded

Here is an example which visualizes the encoded color, YOLO neural network and disparity streams:

from depthai_sdk import OakCamera

with OakCamera() as oak:
    color = oak.create_camera('color', resolution='1080p', fps=20, encode='h264')
    stereo = oak.create_stereo('400p', encode='h264')
    yolo = oak.create_nn('yolov6nr3_coco_640x352', input=color)

    oak.visualize([color.out.encoded, stereo.out.encoded, yolo.out.encoded])
    oak.start(blocking=True)

Got questions?

Head over to the Discussion Forum for technical support or any other questions you might have.