NNComponent

NNComponent abstracts sourcing and decoding AI models, creating the DepthAI API nodes for neural inferencing and object tracking, and setting up MultiStage pipelines. It also supports Roboflow integration.

DepthAI API nodes

For neural inference, NNComponent will use one of the DepthAI API nodes, depending on the model: NeuralNetwork, MobileNetDetectionNetwork / MobileNetSpatialDetectionNetwork, or YoloDetectionNetwork / YoloSpatialDetectionNetwork.

If the tracker argument is set and the model is YOLO- or MobileNet-SSD-based, this component will also create an ObjectTracker node and connect the two nodes together.

Usage

from depthai_sdk import OakCamera, ResizeMode

with OakCamera(recording='cars-tracking-above-01') as oak:
    color = oak.create_camera('color')
    nn = oak.create_nn('vehicle-detection-0202', color, tracker=True)
    nn.config_nn(resize_mode='stretch')

    oak.visualize([nn.out.tracker, nn.out.passthrough], fps=True)
    oak.start(blocking=True)

Component outputs

  • main - Default output. Streams NN results and high-res frames that were downscaled and used for inferencing. Produces DetectionPacket or TwoStagePacket (if it’s a second-stage NNComponent).

  • passthrough - Streams NN results and passthrough frames (frames used for inferencing). Produces DetectionPacket or TwoStagePacket (if it’s a second-stage NNComponent).

  • spatials - Streams depth and bounding box mappings (SpatialDetectionNetwork.boundingBoxMapping). Produces SpatialBbMappingPacket.

  • twostage_crops - Streams second-stage cropped frames to the host. Produces FramePacket.

  • tracker - Streams ObjectTracker’s tracklets and high-res frames that were downscaled and used for inferencing. Produces TrackerPacket.

  • nn_data - Streams NN raw output. Produces NNDataPacket.
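
Each of these outputs can be passed to oak.visualize() or to oak.callback() for host-side processing. A minimal sketch (the 'mobilenet-ssd' model name and the callback body are illustrative, not part of this component's API):

from depthai_sdk import OakCamera

def on_detections(packet):
    # DetectionPacket carries the decoded dai.ImgDetections alongside the frame
    for det in packet.img_detections.detections:
        print(det.label, det.confidence)

with OakCamera() as oak:
    color = oak.create_camera('color')
    nn = oak.create_nn('mobilenet-ssd', color)

    # Draw results on the downscaled high-res stream
    oak.visualize(nn.out.main, fps=True)
    # Receive the same results on the host, together with the passthrough frames
    oak.callback(nn.out.passthrough, callback=on_detections)
    oak.start(blocking=True)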

Decoding outputs

NNComponent allows users to define their own decoding functions. Decoded results should map to one of the standardized output classes listed in the Reference below (for example Detections, SemanticSegmentation, ImgLandmarks, or InstanceSegmentation).

Note

This feature is still in development and is not guaranteed to work correctly in all cases.

Example usage:

import numpy as np
from depthai import NNData

from depthai_sdk import OakCamera
from depthai_sdk.classes import Detections
# Types used for the callback annotations below
from depthai_sdk.classes.packets import DetectionPacket
from depthai_sdk.visualize.visualizer import Visualizer

def decode(nn_data: NNData):
    layer = nn_data.getFirstLayerFp16()
    results = np.array(layer).reshape((1, 1, -1, 7))
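    # Each row: [image_id, label, confidence, xmin, ymin, xmax, ymax] (OpenVINO SSD-style layout)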
    dets = Detections(nn_data)

    for result in results[0][0]:
        if result[2] > 0.5:
            dets.add(result[1], result[2], result[3:])

    return dets


def callback(packet: DetectionPacket, visualizer: Visualizer):
    detections: Detections = packet.img_detections
    ...


with OakCamera() as oak:
    color = oak.create_camera('color')

    nn = oak.create_nn(..., color, decode_fn=decode)

    oak.visualize(nn, callback=callback)
    oak.start(blocking=True)

Reference

General (standardized) NN outputs, to be used for higher-level abstractions (e.g. automatic visualization of results). “SDK supported NN models” will have to produce a standard NN output: either dai.ImgDetections or one of the classes below. In the latter case, the model’s JSON config will include handler.py logic for decoding to the standard NN output. These will be integrated into depthai-core, with bonus points for on-device decoding of some popular models.

class depthai_sdk.classes.nn_results.Detection(img_detection: Union[NoneType, depthai.ImgDetection, depthai.SpatialImgDetection], label_str: str, confidence: float, color: Tuple[int, int, int], bbox: depthai_sdk.visualize.bbox.BoundingBox, angle: Union[int, NoneType], ts: Union[datetime.timedelta, NoneType])
img_detection: Union[None, depthai.ImgDetection, depthai.SpatialImgDetection]
label_str: str
confidence: float
color: Tuple[int, int, int]
bbox: depthai_sdk.visualize.bbox.BoundingBox
angle: Optional[int]
ts: Optional[datetime.timedelta]
property top_left
property bottom_right
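
In a visualizer callback these fields can be read straight off the packet; a short sketch, assuming DetectionPacket.detections holds the Detection objects described above:

def callback(packet: DetectionPacket, visualizer: Visualizer):
    for det in packet.detections:
        # label_str / confidence / bounding-box corners come from the Detection dataclass
        print(f'{det.label_str} {det.confidence:.2f}', det.top_left, det.bottom_right)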
class depthai_sdk.classes.nn_results.TrackingDetection(img_detection: Union[NoneType, depthai.ImgDetection, depthai.SpatialImgDetection], label_str: str, confidence: float, color: Tuple[int, int, int], bbox: depthai_sdk.visualize.bbox.BoundingBox, angle: Union[int, NoneType], ts: Union[datetime.timedelta, NoneType], tracklet: depthai.Tracklet, filtered_2d: depthai_sdk.visualize.bbox.BoundingBox, filtered_3d: depthai.Point3f, speed: Union[float, NoneType])
tracklet: depthai.Tracklet
filtered_2d: depthai_sdk.visualize.bbox.BoundingBox
filtered_3d: depthai.Point3f
speed: Optional[float]
property speed_kmph
property speed_mph
class depthai_sdk.classes.nn_results.TwoStageDetection(img_detection: Union[NoneType, depthai.ImgDetection, depthai.SpatialImgDetection], label_str: str, confidence: float, color: Tuple[int, int, int], bbox: depthai_sdk.visualize.bbox.BoundingBox, angle: Union[int, NoneType], ts: Union[datetime.timedelta, NoneType], nn_data: depthai.NNData)
nn_data: depthai.NNData
class depthai_sdk.classes.nn_results.GenericNNOutput(nn_data)

Generic NN output, to be used for higher-level abstractions (e.g. automatic visualization of results).

getTimestamp() → datetime.timedelta
getSequenceNum() → int
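
Because the result classes below build on GenericNNOutput, the object returned by a custom decode function keeps the timing metadata of the original NNData. A brief sketch, assuming Detections (as in the example above) derives from GenericNNOutput:

dets = Detections(nn_data)
# Timing/sequence info survives decoding and can be used for host-side syncing
print(dets.getTimestamp(), dets.getSequenceNum())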

class depthai_sdk.classes.nn_results.ExtendedImgDetection(angle: int)
class depthai_sdk.classes.nn_results.Detections(nn_data, is_rotated=False)

Detection results containing bounding boxes, labels and confidences. Optionally can contain rotation angles.

class depthai_sdk.classes.nn_results.SemanticSegmentation(nn_data, mask)

Semantic segmentation results, with a mask for each class.

Examples: DeeplabV3, Lanenet, road-segmentation-adas-0001.

mask: List[numpy.ndarray]
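
A decode function can return SemanticSegmentation directly. A minimal sketch, assuming a hypothetical model whose single FP16 output layer holds 1 x num_classes x H x W class scores (shape and class count are placeholders) and that SemanticSegmentation is importable from depthai_sdk.classes like Detections:

import numpy as np
from depthai import NNData
from depthai_sdk.classes import SemanticSegmentation

def decode_segmentation(nn_data: NNData):
    # Placeholder output layout: 1 x 21 classes x 256 x 256 scores
    scores = np.array(nn_data.getFirstLayerFp16()).reshape((1, 21, 256, 256))
    class_map = np.argmax(scores[0], axis=0)
    # One binary mask per class, matching the "mask for each class" contract above
    masks = [(class_map == i).astype(np.uint8) for i in range(scores.shape[1])]
    return SemanticSegmentation(nn_data, masks)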
class depthai_sdk.classes.nn_results.ImgLandmarks(nn_data, landmarks=None, landmarks_indices=None, pairs=None, colors=None)

Landmarks results, with a list of landmarks and pairs of landmarks to draw lines between.

Examples: human-pose-estimation-0001, openpose2, facial-landmarks-68, landmarks-regression-retail-0009.
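
Similarly, a decode function can build ImgLandmarks. A sketch for landmarks-regression-retail-0009, whose output is 10 floats (normalized x, y for 5 facial points); the exact structure ImgLandmarks expects for landmarks and pairs is an assumption here:

import numpy as np
from depthai import NNData
from depthai_sdk.classes import ImgLandmarks

def decode_landmarks(nn_data: NNData):
    # landmarks-regression-retail-0009 outputs 10 floats: normalized (x, y) for 5 points
    raw = np.array(nn_data.getFirstLayerFp16()).reshape((5, 2))
    landmarks = [(float(x), float(y)) for x, y in raw]
    # pairs/colors are optional and only affect how connecting lines are drawn
    return ImgLandmarks(nn_data, landmarks=landmarks)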

class depthai_sdk.classes.nn_results.InstanceSegmentation(nn_data, masks, labels)

Instance segmentation results, with a mask for each instance.

masks: List[numpy.ndarray]
labels: List[int]