‘Many Devices, One API’ and the OpenVINO Toolkit

Vision as an input is everywhere, and many accelerators are available to help process it. Intel now offers an optimized toolkit that spans this hardware with a single API; it includes a library of functions, pre-optimized kernels, and optimized calls for OpenCV and OpenVX.

Intel calls it the Open Visual Inference & Neural Network Optimization (OpenVINO™) toolkit. It’s free and open source.

The most recent release even supports a Raspberry Pi as a host for an Intel® Neural Compute Stick 2. The Intel Neural Compute Stick 2 is powered by the Intel® Movidius™ Myriad™ X VPU, which delivers 4 trillion operations per second at very low power. The stick plugs into a USB port for both communication and power.

The new toolkit offers key functionality for developing applications that emulate human vision. It makes them fast by offering optimized support across computer vision accelerators (CPU, GPU, Intel Movidius Neural Compute Stick, and FPGAs), and it makes heterogeneous computing easier to harness by offering a common API.

Meet the OpenVINO toolkit

The OpenVINO toolkit is a single toolkit for applications that need human-like vision capabilities. It supports deep learning, computer vision, and hardware acceleration across heterogeneous devices.

When Intel renamed its Computer Vision SDK as the OpenVINO toolkit last year, it added a lot of new support and optimized routines. Included in the toolkit are three new APIs: the Deep Learning Deployment toolkit, a common deep learning inference toolkit, and optimized functions for OpenCV and OpenVX (with support for the ONNX, TensorFlow, MXNet, and Caffe frameworks).

The toolkit is aimed at data scientists and software developers working on computer vision, neural network inference, and deep learning deployments who want to accelerate their solutions across multiple platforms. This should “help developers bring vision intelligence into their applications from edge to cloud,” according to Intel.

Magic in the “Model Optimizer”

The Deep Learning Deployment toolkit includes a feature called the “Model Optimizer.” This is the key to how Intel supports “one API” with high performance.

The Model Optimizer imports trained models from various frameworks (Caffe, TensorFlow, MXNet, ONNX, Kaldi) and converts them to a unified intermediate representation file. It optimizes topologies through node merging, horizontal fusion, elimination of batch normalization, and quantization. It also supports graph freezing and graph summarization, along with dynamic input freezing. Substantial performance boosts occur because the Model Optimizer uses the data types that best match the target hardware (e.g., converting FP32 to FP16).
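To make "elimination of batch normalization" concrete, here is a minimal sketch of the general idea: a batch normalization step that follows a linear layer can be folded into that layer's weights and bias, so the normalization disappears at inference time. This is not the Model Optimizer's actual code. The function names (fold_batch_norm, linear, batch_norm) and the plain-list data layout are hypothetical, chosen only for illustration, and the sketch assumes the batch normalization statistics are fixed after training.

```python
import math


def linear(weights, bias, x):
    """Compute z = W x + b for a small dense layer (lists of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batch_norm(z, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization, applied per output channel."""
    return [g * (zi - m) / math.sqrt(v + eps) + bt
            for zi, g, bt, m, v in zip(z, gamma, beta, mean, var)]


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch normalization into the preceding linear layer.

    Hypothetical helper: returns (W', b') such that W' x + b' equals
    batch_norm(W x + b), so the normalization node can be dropped.
    """
    folded_w, folded_b = [], []
    for row, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # per-channel scale factor
        folded_w.append([scale * w for w in row])
        folded_b.append(scale * (b - m) + bt)
    return folded_w, folded_b


# Illustrative values only; they do not come from any real model.
W = [[1.0, 2.0], [0.5, -1.0]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [4.0, 0.25]
x = [3.0, -2.0]

two_step = batch_norm(linear(W, b, x), gamma, beta, mean, var)
W_folded, b_folded = fold_batch_norm(W, b, gamma, beta, mean, var)
one_step = linear(W_folded, b_folded, x)
print(two_step, one_step)
```

The two printed results agree up to floating-point rounding, but the folded version needs one layer evaluation instead of two, which is the kind of saving that node merging aims for.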

There is an impressive assortment of over two dozen samples, including standard and pipelined image classification, image segmentation, object detection, object detection for Single Shot Multibox Detector (SSD), neural style transfer, security barrier, interactive face detection, people counting, and multi-channel face detection.