Converting model to MyriadX blob¶
To allow DepthAI to use your custom trained models, you need to convert them into a MyriadX blob file format - so that they are optimized for inference on the MyriadX VPU processor.
There are two conversion steps that have to be taken in order to obtain a blob file:
Use Model Optimizer to produce OpenVINO IR representation (where IR stands for Intermediate Representation)
Use Compile Tool to compile IR representation model into VPU blob
Model Optimizer¶
OpenVINO’s Model Optimizer converts the model from the original framework format into OpenVINO’s Intermediate Representation (IR) standard format (.bin and .xml).
This format of the model can be deployed across multiple Intel devices: CPU, GPU, iGPU, VPU (which we are interested in), and FPGA.
Example usage of Model Optimizer with online Blobconverter:
--data_type=FP16 --mean_values=[0,0,0] --scale_values=[255,255,255]
Example for local conversion:
mo --input_model path/to/model.onnx --data_type=FP16 --mean_values=[0,0,0] --scale_values=[255,255,255]
All arguments below are also documented on OpenVINO’s docs here.
FP16 Data Type¶
Since we are converting for VPU (which supports FP16), we need to use the --data_type=FP16 parameter.
More information here.
Mean and Scale parameters¶
OpenVINO’s documentation here.
The --mean_values and --scale_values parameters will normalize the input image to the model: new_value = (byte - mean) / scale.
By default, frames from ColorCamera/MonoCamera are in U8 data type ([0,255]).
Models are usually trained with normalized frames in the [-1,1] or [0,1] range, so we need to normalize frames before running the inference.
One (not ideal) option is to create a custom model that normalizes frames before inferencing (example here), but it’s better (more optimized) to bake the normalization into the model itself, which is what these parameters do.
Common options:
[0,1] values, mean=0 and scale=255 (([0,255] - 0) / 255 = [0,1])
[-1,1] values, mean=127.5 and scale=127.5 (([0,255] - 127.5) / 127.5 = [-1,1])
[-0.5,0.5] values, mean=127.5 and scale=255 (([0,255] - 127.5) / 255 = [-0.5,0.5])
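For example, for a model trained on frames normalized to [-1,1], the local conversion command from above would become (the model path is illustrative):
mo --input_model path/to/model.onnx --data_type=FP16 --mean_values=[127.5,127.5,127.5] --scale_values=[127.5,127.5,127.5]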
Model layout parameter¶
OpenVINO’s documentation here.
Model layout can be specified with the --layout parameter. We use the Planar / CHW layout convention. A DepthAI error message similar to the one below will be shown if the image layout does not match the model layout:
[NeuralNetwork(0)] [warning] Input image (416x416) does not match NN (3x416)
Note that by default, the ColorCamera node will output preview frames in Interleaved / HWC layout (as it’s native to OpenCV), which can be changed to Planar layout via the API:
import depthai as dai
pipeline = dai.Pipeline()
colorCam = pipeline.createColorCamera()
colorCam.setInterleaved(False) # False = Planar layout
Color order¶
OpenVINO’s documentation here.
NN models can be trained on images that have either RGB or BGR color order. You can change from one to the other using the --reverse_input_channels parameter. We use BGR color order. For an example, see Changing color order.
Note that by default, the ColorCamera node will output preview frames in BGR color order (as it’s native to OpenCV), which can be changed to RGB color order via the API:
import depthai as dai
pipeline = dai.Pipeline()
colorCam = pipeline.createColorCamera()
colorCam.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB) # RGB color order, BGR by default
Compile Tool¶
After converting the model to OpenVINO’s IR format (.bin/.xml), we need to use OpenVINO’s Compile Tool to compile the model in IR format into a .blob file, which can be deployed to the device (tutorial here).
Input layer precision: RVC2 only supports FP16 precision, so -ip U8 will add a U8->FP16 conversion layer on all input layers of the model, which is what we usually want. In some cases (e.g. when we aren’t dealing with frames), we want to use FP16 precision directly, so we can use -ip FP16 instead (Cosine distance model example).
Shaves: RVC2 has a total of 16 SHAVE cores (see Hardware accelerators documentation). Compiling for more SHAVEs can make the model perform faster, but performance doesn’t scale linearly with the number of SHAVE cores. The firmware will warn you about the likely optimal number of SHAVE cores, which is available_cores/2, since by default each model runs on 2 threads.
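As an example, a local Compile Tool invocation for the usual case might look like this (a sketch; the model path and SHAVE/CMX counts are illustrative, and the exact flags are documented in OpenVINO’s Compile Tool docs):
compile_tool -m path/to/model.xml -d MYRIAD -ip U8 -VPU_NUMBER_OF_SHAVES 6 -VPU_NUMBER_OF_CMX_SLICES 6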
Converting and compiling models¶
There are a few options to perform these steps:
Using our online blobconverter app
Using our blobconverter library
Converting and compiling the model locally
1. Using online blobconverter¶
You can visit our online Blobconverter app which allows you to convert and compile the NN model from TensorFlow, Caffe, ONNX, OpenVINO IR, and OpenVINO Model Zoo.
2. Using blobconverter package¶
For automated usage of our blobconverter tool, we have released a blobconverter PyPi package that allows converting & compiling models both from the command line and directly from a Python script. Example usage below.
Install and usage instructions can be found here.
import blobconverter

blob_path = blobconverter.from_onnx(
    model="/path/to/model.onnx",
    data_type="FP16",
    shaves=5,
)
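The same conversion can also be run from the command line; a sketch, assuming the CLI mirrors the Python arguments above (run python3 -m blobconverter --help for the authoritative flags):
python3 -m blobconverter --onnx-model /path/to/model.onnx --shaves 5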
3. Local compilation¶
If you want to perform model conversion and compilation locally, you can follow the Model Optimizer and Compile Tool steps described above.
Troubleshooting¶
When converting your model to the OpenVINO format or compiling it to a .blob, you might come across an issue. This usually means that a connection between two layers is not supported, or that the layer itself is not supported.
For visualization of NN models, we suggest using the Netron app.
Supported layers¶
When converting your model to OpenVINO’s IR format (.bin and .xml), you have to check whether OpenVINO supports the layers that were used. Here are the supported layers and their limitations for Caffe, MXNet, TensorFlow, TensorFlow 2 Keras, Kaldi, and ONNX.
Unsupported layer type “layer_type”¶
When using compile_tool to compile from IR (.xml/.bin) into .blob, you might get an error like this:
Failed to compile layer "Resize_230": unsupported layer type "Interpolate"
This means that the layer type is not supported by the VPU (Intel’s Myriad X). You can find the OpenVINO layers supported by the VPU here, under the Supported Layers header, in the third column (VPU). Refer to Intel’s official troubleshooting docs for more information.
Incorrect data types¶
If the compiler returns something along the lines of “check error: input #0 has type S32, but one of [FP16] is expected”, it means that you are using incorrect data types. In the case above, an INT32 layer is connected directly to an FP16 layer. There should be a conversion between these layers, which we can achieve by inserting OpenVINO’s Convert layer between them. You can do that by editing your model’s .xml and adding the Convert layer there. You can find additional information on this discord thread.