YOLO/PASCAL-VOC detection tutorial

This tutorial demonstrates that Akida can perform object detection using a state-of-the-art model architecture. This is illustrated on a subset of the PASCAL-VOC 2007 dataset restricted to the “car” and “person” classes. The YOLOv2 architecture from Redmon et al. (2016) has been chosen to tackle this object detection problem.

1. Introduction

1.1 Object detection

Object detection is a computer vision task that combines two elementary tasks:

  • object classification, which assigns a class label to an image, as shown in the AkidaNet/ImageNet inference example

  • object localization, which draws a bounding box around one or several objects in an image

You can learn more about the subject by reading this introduction to object detection blog article.

1.2 YOLO key concepts

You Only Look Once (YOLO) is a deep neural network architecture dedicated to object detection.

Unlike classic object detection networks, YOLO predicts bounding boxes (localization task) and class probabilities (classification task) with a single neural network in a single evaluation. Object detection is thus reduced to a regression problem: from an image to spatially separated bounding boxes with associated class probabilities.

The core concept of YOLO is to divide the input image into regions, forming a grid, and to predict bounding boxes and probabilities for each region. The bounding boxes are weighted by the predicted probabilities.

YOLO also uses the concept of “anchor boxes” (also called “prior boxes”). The network does not predict bounding boxes directly but rather offsets from anchor boxes, which are templates (width/height ratios) computed by clustering the dimensions of the ground truth boxes in the training dataset. The anchors thus represent the average shapes and sizes of the objects to detect. More details on the anchor box concept are given in this blog article.
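To make the offset idea concrete, here is a minimal sketch of the standard YOLOv2 decoding equations in plain Python (the function name and formulation are illustrative, not the akida_models implementation): for each anchor in each grid cell, the network outputs raw values (tx, ty, tw, th) that are turned into a box by bounding the center offset to the cell and scaling the anchor dimensions.

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, anchor_w, anchor_h):
    """Turn raw YOLOv2 outputs for one anchor in grid cell (cx, cy)
    into a box (center x, center y, width, height) in grid units."""
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = cx + sigmoid(tx)          # center offset is bounded to its cell
    by = cy + sigmoid(ty)
    bw = anchor_w * math.exp(tw)   # anchor shape scaled by exp of raw output
    bh = anchor_h * math.exp(th)
    return bx, by, bw, bh

# Zero raw outputs yield the anchor shape, centered in cell (3, 3)
print(decode_box(0.0, 0.0, 0.0, 0.0, 3, 3, 2.0, 2.0))  # (3.5, 3.5, 2.0, 2.0)
```

With all-zero outputs the prediction falls back to the anchor itself, which is why well-chosen anchors make the regression easier to learn.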

Additional information about YOLO can be found on the Darknet website. The source code for the preprocessing and postprocessing functions included in the akida_models package (see the processing section in the model zoo) is largely inspired by the experiencor GitHub repository.

2. Preprocessing tools

As this example focuses on car and person detection only, a subset of VOC has been prepared with test images from VOC2007 that contain at least one occurrence of the two classes. Just like the VOC dataset, the subset contains an image folder, an annotation folder and a text file listing the file names of interest.

The YOLO toolkit offers several methods to prepare data for processing: see the load_image, preprocess_image and parse_voc_annotations functions.

import os

from tensorflow.keras.utils import get_file
from akida_models.detection.processing import parse_voc_annotations

# Download validation set from Brainchip data server
data_path = get_file(
    "voc_test_car_person.tar.gz",
    "http://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tar.gz",
    cache_subdir='datasets/voc',
    extract=True)

data_dir = os.path.dirname(data_path)
gt_folder = os.path.join(data_dir, 'voc_test_car_person', 'Annotations')
image_folder = os.path.join(data_dir, 'voc_test_car_person', 'JPEGImages')
file_path = os.path.join(
    data_dir, 'voc_test_car_person', 'test_car_person.txt')
labels = ['car', 'person']

val_data = parse_voc_annotations(gt_folder, image_folder, file_path, labels)
print("Loaded VOC2007 test data for car and person classes: "
      f"{len(val_data)} images.")

Out:

Downloading data from http://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tar.gz

221552640/221551911 [==============================] - 23s 0us/step
Loaded VOC2007 test data for car and person classes: 2500 images.

Anchors can also be computed easily using the YOLO toolkit.

Note

The following code is given as an example. In a real use case, anchors are computed on the training dataset.

from akida_models.detection.generate_anchors import generate_anchors

num_anchors = 5
grid_size = (7, 7)
anchors_example = generate_anchors(val_data, num_anchors, grid_size)

Out:

Average IOU for 5 anchors: 0.61
Anchors:  [[0.6377, 1.13475], [1.25899, 2.81096], [2.28008, 2.94847], [3.64754, 5.01809], [5.20856, 5.74005]]
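The clustering idea behind anchor generation can be sketched with a small IoU-based k-means on ground truth (width, height) pairs expressed in grid units. This is an illustrative sketch only, not the akida_models implementation; `iou_wh` and `kmeans_anchors` are hypothetical helpers:

```python
import numpy as np

def iou_wh(box, clusters):
    """IoU between one (w, h) box and k (w, h) cluster templates,
    assuming all boxes share the same top-left corner."""
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k, n_iter=50, seed=0):
    """Cluster (w, h) box dimensions with an IoU-based k-means."""
    rng = np.random.default_rng(seed)
    clusters = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each box to the template with the highest IoU
        nearest = np.array([np.argmax(iou_wh(b, clusters)) for b in boxes])
        # Move each template to the mean dimensions of its members
        for i in range(k):
            members = boxes[nearest == i]
            if len(members):
                clusters[i] = members.mean(axis=0)
    # Sort anchors by width, as in the example output above
    return clusters[np.argsort(clusters[:, 0])]

# Synthetic ground-truth box dimensions, in grid units
rng = np.random.default_rng(42)
boxes = rng.uniform(0.5, 6.0, size=(200, 2))
anchors_sketch = kmeans_anchors(boxes, k=5)
```

Using IoU rather than Euclidean distance makes the clustering scale-aware, so large and small boxes contribute equally to the templates.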

3. Model architecture

The model zoo contains a YOLO model built upon the AkidaNet architecture, with 3 separable convolutional layers at the top for bounding box and class estimation, followed by a final separable convolutional layer which is the detection layer. Note that for efficiency, the alpha parameter of AkidaNet (the network width, i.e. the number of filters in each layer) is set to 0.5.

from akida_models import yolo_base

# Create a yolo model for 2 classes with 5 anchors and grid size of 7
classes = 2

model = yolo_base(input_shape=(224, 224, 3),
                  classes=classes,
                  nb_box=num_anchors,
                  alpha=0.5)
model.summary()

Out:

Model: "yolo_base"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input_28 (InputLayer)       [(None, 224, 224, 3)]     0

 rescaling_1 (Rescaling)     (None, 224, 224, 3)       0

 conv_0 (Conv2D)             (None, 112, 112, 16)      432

 conv_0_BN (BatchNormalizati  (None, 112, 112, 16)     64
 on)

 conv_0_relu (ReLU)          (None, 112, 112, 16)      0

 conv_1 (Conv2D)             (None, 112, 112, 32)      4608

 conv_1_BN (BatchNormalizati  (None, 112, 112, 32)     128
 on)

 conv_1_relu (ReLU)          (None, 112, 112, 32)      0

 conv_2 (Conv2D)             (None, 56, 56, 64)        18432

 conv_2_BN (BatchNormalizati  (None, 56, 56, 64)       256
 on)

 conv_2_relu (ReLU)          (None, 56, 56, 64)        0

 conv_3 (Conv2D)             (None, 56, 56, 64)        36864

 conv_3_BN (BatchNormalizati  (None, 56, 56, 64)       256
 on)

 conv_3_relu (ReLU)          (None, 56, 56, 64)        0

 separable_4 (SeparableConv2  (None, 28, 28, 128)      8768
 D)

 separable_4_BN (BatchNormal  (None, 28, 28, 128)      512
 ization)

 separable_4_relu (ReLU)     (None, 28, 28, 128)       0

 separable_5 (SeparableConv2  (None, 28, 28, 128)      17536
 D)

 separable_5_BN (BatchNormal  (None, 28, 28, 128)      512
 ization)

 separable_5_relu (ReLU)     (None, 28, 28, 128)       0

 separable_6 (SeparableConv2  (None, 14, 14, 256)      33920
 D)

 separable_6_BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_6_relu (ReLU)     (None, 14, 14, 256)       0

 separable_7 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_7_BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_7_relu (ReLU)     (None, 14, 14, 256)       0

 separable_8 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_8_BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_8_relu (ReLU)     (None, 14, 14, 256)       0

 separable_9 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_9_BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_9_relu (ReLU)     (None, 14, 14, 256)       0

 separable_10 (SeparableConv  (None, 14, 14, 256)      67840
 2D)

 separable_10_BN (BatchNorma  (None, 14, 14, 256)      1024
 lization)

 separable_10_relu (ReLU)    (None, 14, 14, 256)       0

 separable_11 (SeparableConv  (None, 14, 14, 256)      67840
 2D)

 separable_11_BN (BatchNorma  (None, 14, 14, 256)      1024
 lization)

 separable_11_relu (ReLU)    (None, 14, 14, 256)       0

 separable_12 (SeparableConv  (None, 7, 7, 512)        133376
 2D)

 separable_12_BN (BatchNorma  (None, 7, 7, 512)        2048
 lization)

 separable_12_relu (ReLU)    (None, 7, 7, 512)         0

 separable_13 (SeparableConv  (None, 7, 7, 512)        266752
 2D)

 separable_13_BN (BatchNorma  (None, 7, 7, 512)        2048
 lization)

 separable_13_relu (ReLU)    (None, 7, 7, 512)         0

 1conv (SeparableConv2D)     (None, 7, 7, 1024)        528896

 1conv_BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 1conv_relu (ReLU)           (None, 7, 7, 1024)        0

 2conv (SeparableConv2D)     (None, 7, 7, 1024)        1057792

 2conv_BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 2conv_relu (ReLU)           (None, 7, 7, 1024)        0

 3conv (SeparableConv2D)     (None, 7, 7, 1024)        1057792

 3conv_BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 3conv_relu (ReLU)           (None, 7, 7, 1024)        0

 detection_layer (SeparableC  (None, 7, 7, 35)         45091
 onv2D)

=================================================================
Total params: 3,573,715
Trainable params: 3,561,587
Non-trainable params: 12,128
_________________________________________________________________

The model output can be reshaped to a more natural shape of:

(grid_height, grid_width, anchors_box, 4 + 1 + num_classes)

where the “4 + 1” term represents the coordinates of the estimated bounding boxes (top left x, top left y, width and height) and a confidence score. In other words, the output channels are actually grouped by anchor boxes, and in each group one channel provides either a coordinate, a global confidence score or a class confidence score. This process is done automatically in the decode_output function.

from tensorflow.keras import Model
from tensorflow.keras.layers import Reshape

# Define a reshape output to be added to the YOLO model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model.output)

# Build the complete model
full_model = Model(model.input, output)
full_model.output

Out:

<KerasTensor: shape=(None, 7, 7, 5, 7) dtype=float32 (created by layer 'YOLO_output')>
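To make the channel grouping concrete, here is a small numpy sketch (dummy values, not actual model outputs) showing how the 35 channels of each grid cell split into 5 anchor groups of 4 + 1 + 2 values:

```python
import numpy as np

# Grid and anchor configuration matching the tutorial
grid_h, grid_w, num_anchors, num_classes = 7, 7, 5, 2
num_channels = num_anchors * (4 + 1 + num_classes)  # 5 * 7 = 35

# A dummy raw output with the detection layer's (7, 7, 35) shape
raw = np.arange(grid_h * grid_w * num_channels,
                dtype=np.float32).reshape(grid_h, grid_w, num_channels)

# Group the 35 channels by anchor box: 5 groups of
# (x, y, w, h, confidence, 2 class scores)
grouped = raw.reshape(grid_h, grid_w, num_anchors, 4 + 1 + num_classes)

# For one grid cell and one anchor, split the 7 values
cell = grouped[3, 4, 2]
coords, confidence, class_scores = cell[:4], cell[4], cell[5:]
```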

4. Training

As the YOLO model relies on the Brainchip AkidaNet/ImageNet network, it is possible to perform transfer learning from ImageNet pretrained weights when training a YOLO model. See the PlantVillage transfer learning example for a detailed explanation of transfer learning principles.

When using transfer learning for YOLO training, we advise proceeding in several steps that include model calibration:

  • instantiate the yolo_base model and load AkidaNet/ImageNet pretrained float weights,

akida_models create -s yolo_akidanet_voc.h5 yolo_base --classes 2 \
         --base_weights akidanet_imagenet_224_alpha_50.h5
  • freeze the AkidaNet layers and perform training,

yolo_train train -d voc_preprocessed.pkl -m yolo_akidanet_voc.h5 \
    -ap voc_anchors.pkl -e 25 -fb 1conv -s yolo_akidanet_voc.h5
  • quantize the network, create data for calibration and calibrate,

cnn2snn quantize -m yolo_akidanet_voc.h5 -iq 8 -wq 4 -aq 4
yolo_train extract -d voc_preprocessed.pkl -ap voc_anchors.pkl -b 1024 -o voc_samples.npz \
    -m yolo_akidanet_voc_iq8_wq4_aq4.h5
cnn2snn calibrate adaround -sa voc_samples.npz -b 128 -e 500 -lr 1e-3 \
    -m yolo_akidanet_voc_iq8_wq4_aq4.h5
  • tune the model to recover accuracy.

yolo_train tune -d voc_preprocessed.pkl \
    -m yolo_akidanet_voc_iq8_wq4_aq4_adaround_calibrated.h5 -ap voc_anchors.pkl \
    -e 10 -s yolo_akidanet_voc_iq8_wq4_aq4.h5

Note

  • voc_anchors.pkl is obtained by saving the output of the generate_anchors call to a pickle file,

  • voc_preprocessed.pkl is obtained by saving the training data, the validation data (obtained using parse_voc_annotations) and the labels list (i.e. ["car", "person"]) into a pickle file.
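As a minimal sketch of the first bullet (the exact payload layout expected by the yolo_train CLI is not shown in this tutorial, so only the anchors file is covered), the anchors can be dumped with the standard pickle module:

```python
import pickle

# Anchors as returned by generate_anchors (values from the example above)
anchors_example = [[0.6377, 1.13475], [1.25899, 2.81096], [2.28008, 2.94847],
                   [3.64754, 5.01809], [5.20856, 5.74005]]

# Dump them to the pickle file consumed by the yolo_train -ap option
with open("voc_anchors.pkl", "wb") as f:
    pickle.dump(anchors_example, f)

# Reload to check the round trip
with open("voc_anchors.pkl", "rb") as f:
    restored = pickle.load(f)
```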

Although transfer learning should be the preferred way to train a YOLO model, it has been observed that for some datasets, training all layers from scratch gives better results. That is the case for our YOLO WiderFace model for face detection. In such a case, the training pipeline to follow is described in the typical training scenario.

5. Performance

The model zoo also contains a helper method that creates a YOLO model for VOC, loads pretrained weights for the car and person detection task, and returns the corresponding anchors. The anchors are used to interpret the model outputs.

The metric used to evaluate YOLO is the mean average precision (mAP), which is the percentage of correct predictions for a given intersection over union (IoU) ratio. Scores in this example are given for the standard IoU of 0.5, meaning that a detection is considered valid if the intersection over union ratio with its ground truth equivalent is above 0.5.
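The IoU criterion can be made concrete with a few lines of Python (`box_iou` is a hypothetical helper, not part of akida_models):

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

# A detection matching its ground truth with IoU >= 0.5 counts as correct
low = box_iou((0, 0, 2, 2), (1, 1, 3, 3))   # 1/7, about 0.14 -> rejected
high = box_iou((0, 0, 2, 2), (0, 0, 2, 3))  # 2/3, about 0.67 -> accepted
```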

Note

A call to evaluate_map will preprocess the images, make the call to Model.predict and use decode_output before computing precision for all classes.

Reported performance for all training steps is as follows:

             Float     8/4/4 Calibrated   8/4/4 Tuned
Global mAP   38.38 %   32.88 %            38.83 %

from timeit import default_timer as timer
from akida_models import yolo_voc_pretrained
from akida_models.detection.map_evaluation import MapEvaluation

# Load the pretrained model along with anchors
model_keras, anchors = yolo_voc_pretrained()

# Define the final reshape and build the model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model_keras.output)
model_keras = Model(model_keras.input, output)

# Create the mAP evaluator object
num_images = 100

map_evaluator = MapEvaluation(model_keras, val_data[:num_images], labels,
                              anchors)

# Compute the scores for all validation images
start = timer()
mAP, average_precisions = map_evaluator.evaluate_map()
end = timer()

for label, average_precision in average_precisions.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP))
print(f'Keras inference on {num_images} images took {end-start:.2f} s.\n')

Out:

Downloading data from http://data.brainchip.com/dataset-mirror/voc/voc_anchors.pkl

Downloading data from http://data.brainchip.com/models/yolo/yolo_akidanet_voc_iq8_wq4_aq4.h5

car 0.3777
person 0.3665
mAP: 0.3721
Keras inference on 100 images took 4.35 s.

6. Conversion to Akida

6.1 Convert to Akida model

Check model compatibility before Akida conversion:

from cnn2snn import check_model_compatibility

compat = check_model_compatibility(model_keras, False)

Out:

The Keras quantized model is not compatible for a conversion to an Akida model:
 The Reshape layer YOLO_output can only be used to transform a tensor of shape (N,) to a tensor of shape (1, 1, N), and vice-versa. Receives input_shape (7, 7, 35) and output_shape (7, 7, 5, 7).

The last YOLO_output layer, which was added to split the channels into values for each box, must be removed before Akida conversion.

# Rebuild a model without the last layer
compatible_model = Model(model_keras.input, model_keras.layers[-2].output)

When converting to an Akida model, we just need to pass the Keras model to cnn2snn.convert. In the YOLO preprocess_image function, images are zero centered and normalized between [-1, 1]; this input scaling, used during training, is already embedded in the model's Rescaling layer.

from cnn2snn import convert

model_akida = convert(compatible_model)
model_akida.summary()

Out:

                 Model Summary
________________________________________________
Input shape    Output shape  Sequences  Layers
================================================
[224, 224, 3]  [7, 7, 35]    1          18
________________________________________________

              SW/conv_0-detection_layer (Software)
_________________________________________________________________
Layer (type)                 Output shape    Kernel shape
=================================================================
conv_0 (InputConv.)          [112, 112, 16]  (3, 3, 3, 16)
_________________________________________________________________
conv_1 (Conv.)               [112, 112, 32]  (3, 3, 16, 32)
_________________________________________________________________
conv_2 (Conv.)               [56, 56, 64]    (3, 3, 32, 64)
_________________________________________________________________
conv_3 (Conv.)               [56, 56, 64]    (3, 3, 64, 64)
_________________________________________________________________
separable_4 (Sep.Conv.)      [28, 28, 128]   (3, 3, 64, 1)
_________________________________________________________________
                                             (1, 1, 64, 128)
_________________________________________________________________
separable_5 (Sep.Conv.)      [28, 28, 128]   (3, 3, 128, 1)
_________________________________________________________________
                                             (1, 1, 128, 128)
_________________________________________________________________
separable_6 (Sep.Conv.)      [14, 14, 256]   (3, 3, 128, 1)
_________________________________________________________________
                                             (1, 1, 128, 256)
_________________________________________________________________
separable_7 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_8 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_9 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_10 (Sep.Conv.)     [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_11 (Sep.Conv.)     [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_12 (Sep.Conv.)     [7, 7, 512]     (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 512)
_________________________________________________________________
separable_13 (Sep.Conv.)     [7, 7, 512]     (3, 3, 512, 1)
_________________________________________________________________
                                             (1, 1, 512, 512)
_________________________________________________________________
1conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 512, 1)
_________________________________________________________________
                                             (1, 1, 512, 1024)
_________________________________________________________________
2conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 1024)
_________________________________________________________________
3conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 1024)
_________________________________________________________________
detection_layer (Sep.Conv.)  [7, 7, 35]      (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 35)
_________________________________________________________________

6.2 Check performance

Akida model accuracy is tested on the first n images of the validation set.

The table below summarizes the expected results:

#Images   Keras mAP   Akida mAP
100       38.80 %     34.26 %
1000      40.11 %     39.35 %
2500      38.83 %     38.85 %

# Create the mAP evaluator object
map_evaluator_ak = MapEvaluation(model_akida,
                                 val_data[:num_images],
                                 labels,
                                 anchors,
                                 is_keras_model=False)

# Compute the scores for all validation images
start = timer()
mAP_ak, average_precisions_ak = map_evaluator_ak.evaluate_map()
end = timer()

for label, average_precision in average_precisions_ak.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP_ak))
print(f'Akida inference on {num_images} images took {end-start:.2f} s.\n')

Out:

car 0.3789
person 0.3336
mAP: 0.3562
Akida inference on 100 images took 14.53 s.

6.3 Show predictions for a random image

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches

from akida_models.detection.processing import load_image, preprocess_image, decode_output

# Take a random test image
i = np.random.randint(len(val_data))

input_shape = model_akida.layers[0].input_dims

# Load the image
raw_image = load_image(val_data[i]['image_path'])

# Keep the original image size for later bounding boxes rescaling
raw_height, raw_width, _ = raw_image.shape

# Pre-process the image
image = preprocess_image(raw_image, input_shape)
input_image = image[np.newaxis, :].astype(np.uint8)

# Call evaluate on the image
pots = model_akida.predict(input_image)[0]

# Reshape the potentials to prepare for decoding
h, w, c = pots.shape
pots = pots.reshape((h, w, len(anchors), 4 + 1 + len(labels)))

# Decode potentials into bounding boxes
raw_boxes = decode_output(pots, anchors, len(labels))

# Rescale boxes to the original image size
pred_boxes = np.array([[
    box.x1 * raw_width, box.y1 * raw_height, box.x2 * raw_width,
    box.y2 * raw_height,
    box.get_label(),
    box.get_score()
] for box in raw_boxes])

fig = plt.figure(num='VOC2007 car and person detection by Akida runtime')
ax = fig.subplots(1)
img_plot = ax.imshow(np.zeros(raw_image.shape, dtype=np.uint8))
img_plot.set_data(raw_image)

for box in pred_boxes:
    rect = patches.Rectangle((box[0], box[1]),
                             box[2] - box[0],
                             box[3] - box[1],
                             linewidth=1,
                             edgecolor='r',
                             facecolor='none')
    ax.add_patch(rect)
    class_score = ax.text(box[0],
                          box[1] - 5,
                          f"{labels[int(box[4])]} - {box[5]:.2f}",
                          color='red')

plt.axis('off')
plt.show()
[Figure: VOC car and person detection by Akida runtime]

Total running time of the script: ( 0 minutes 57.452 seconds)
