YOLO/PASCAL-VOC detection tutorial
This tutorial demonstrates that Akida can perform object detection using a state-of-the-art model architecture. This is illustrated using a subset of the PASCAL-VOC 2007 dataset with “car” and “person” classes only. The YOLOv2 architecture from Redmon et al. (2016) has been chosen to tackle this object detection problem.
1. Introduction
1.1 Object detection
Object detection is a computer vision task that combines two elementary tasks:
object classification, which consists of assigning a class label to an image, as shown in the AkidaNet/ImageNet inference example
object localization, which consists of drawing a bounding box around one or several objects in an image
One can learn more about the subject by reading this introduction to object detection blog article.
1.2 YOLO key concepts
You Only Look Once (YOLO) is a deep neural network architecture dedicated to object detection.
As opposed to classic networks that handle object detection, YOLO predicts bounding boxes (localization task) and class probabilities (classification task) from a single neural network in a single evaluation. Object detection is thus reduced to a single regression problem over spatially separated boxes and their associated class probabilities.
The base concept of YOLO is to divide an input image into regions, forming a grid, and to predict bounding boxes and probabilities for each region. The bounding boxes are weighted by the predicted probabilities.
YOLO also uses the concept of “anchor boxes” or “prior boxes”. The network does not predict bounding boxes directly, but offsets from anchor boxes, which are templates (width/height ratios) computed by clustering the dimensions of the ground truth boxes in the training dataset. The anchors thus represent the average shapes and sizes of the objects to detect. More details on the anchor boxes concept are given in this blog article.
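Concretely, for the grid cell at offset (cx, cy) and an anchor of size (pw, ph) expressed in grid units, the raw network outputs (tx, ty, tw, th) are decoded into a box along the lines of the standard YOLOv2 formulation. The sketch below only illustrates the idea (the grid normalization is an assumption here); the actual implementation is the decode_output function used later in this tutorial.

import numpy as np

def sigmoid(x):
    return 1. / (1. + np.exp(-x))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph, grid_size=7):
    # The sigmoid keeps the predicted center inside the responsible grid cell
    x_center = (cx + sigmoid(tx)) / grid_size
    y_center = (cy + sigmoid(ty)) / grid_size
    # The anchor template (pw, ph) is scaled exponentially
    width = pw * np.exp(tw) / grid_size
    height = ph * np.exp(th) / grid_size
    return x_center, y_center, width, height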
Additional information about YOLO can be found on the Darknet website. The source code for the preprocessing and postprocessing functions included in the akida_models package (see the processing section in the model zoo) is largely inspired by the experiencor GitHub repository.
2. Preprocessing tools
As this example focuses on car and person detection only, a subset of VOC has been prepared with test images from VOC2007 that contain at least one occurrence of the two classes. Just like the VOC dataset, the subset contains an image folder, an annotation folder and a text file listing the file names of interest.
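After extraction, the subset is laid out as follows (the names below are those used in the code of this section):

voc_test_car_person/
    Annotations/            # XML annotation files
    JPEGImages/             # JPEG test images
    test_car_person.txt     # list of the file names of interest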
The YOLO toolkit offers several methods to prepare data for processing, see load_image, preprocess_image or parse_voc_annotations.
import os
from tensorflow.keras.utils import get_file
from akida_models.detection.processing import parse_voc_annotations
# Download validation set from Brainchip data server
data_path = get_file(
    "voc_test_car_person.tar.gz",
    "http://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tar.gz",
    cache_subdir='datasets/voc',
    extract=True)
data_dir = os.path.dirname(data_path)
gt_folder = os.path.join(data_dir, 'voc_test_car_person', 'Annotations')
image_folder = os.path.join(data_dir, 'voc_test_car_person', 'JPEGImages')
file_path = os.path.join(
    data_dir, 'voc_test_car_person', 'test_car_person.txt')
labels = ['car', 'person']
val_data = parse_voc_annotations(gt_folder, image_folder, file_path, labels)
print("Loaded VOC2007 test data for car and person classes: "
      f"{len(val_data)} images.")
Out:
Downloading data from http://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tar.gz
221560832/221551911 [==============================] - 23s 0us/step
Loaded VOC2007 test data for car and person classes: 2500 images.
Anchors can also be computed easily using the YOLO toolkit.
Note
The following code is given as an example. In a real use case, anchors are computed on the training dataset.
from akida_models.detection.generate_anchors import generate_anchors
num_anchors = 5
grid_size = (7, 7)
anchors_example = generate_anchors(val_data, num_anchors, grid_size)
Out:
Average IOU for 5 anchors: 0.61
Anchors: [[0.6377, 1.13475], [1.25899, 2.81096], [2.28008, 2.94847], [3.64754, 5.01809], [5.20856, 5.74005]]
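Anchor generation follows the YOLOv2 recipe of clustering ground truth box dimensions with an IoU-based distance. The sketch below illustrates that idea on (width, height) pairs expressed in grid units; it is an illustration of the technique, not the actual generate_anchors implementation.

import numpy as np

def iou_wh(box, clusters):
    # IoU between one (w, h) box and the centroids, all anchored at the origin
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k, n_iter=100):
    # boxes: array of shape (N, 2) holding (width, height) in grid units
    clusters = boxes[np.random.choice(len(boxes), k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each box to the most overlapping centroid (distance = 1 - IoU)
        assignments = np.array([np.argmax(iou_wh(b, clusters)) for b in boxes])
        for i in range(k):
            if np.any(assignments == i):
                clusters[i] = boxes[assignments == i].mean(axis=0)
    return clusters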
3. Model architecture
The model zoo contains a YOLO model that is built upon the AkidaNet architecture, with 3 separable convolutional layers at the top for bounding box and class estimation, followed by a final separable convolutional layer which is the detection layer. Note that for efficiency, the alpha parameter of AkidaNet (network width, i.e. the number of filters in each layer) is set to 0.5.
from akida_models import yolo_base
# Create a yolo model for 2 classes with 5 anchors and grid size of 7
classes = 2
model = yolo_base(input_shape=(224, 224, 3),
                  classes=classes,
                  nb_box=num_anchors,
                  alpha=0.5)
model.summary()
Out:
Model: "yolo_base"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_28 (InputLayer) [(None, 224, 224, 3)] 0
rescaling_1 (Rescaling) (None, 224, 224, 3) 0
conv_0 (Conv2D) (None, 112, 112, 16) 432
conv_0_BN (BatchNormalizati (None, 112, 112, 16) 64
on)
conv_0_relu (ReLU) (None, 112, 112, 16) 0
conv_1 (Conv2D) (None, 112, 112, 32) 4608
conv_1_BN (BatchNormalizati (None, 112, 112, 32) 128
on)
conv_1_relu (ReLU) (None, 112, 112, 32) 0
conv_2 (Conv2D) (None, 56, 56, 64) 18432
conv_2_BN (BatchNormalizati (None, 56, 56, 64) 256
on)
conv_2_relu (ReLU) (None, 56, 56, 64) 0
conv_3 (Conv2D) (None, 56, 56, 64) 36864
conv_3_BN (BatchNormalizati (None, 56, 56, 64) 256
on)
conv_3_relu (ReLU) (None, 56, 56, 64) 0
separable_4 (SeparableConv2 (None, 28, 28, 128) 8768
D)
separable_4_BN (BatchNormal (None, 28, 28, 128) 512
ization)
separable_4_relu (ReLU) (None, 28, 28, 128) 0
separable_5 (SeparableConv2 (None, 28, 28, 128) 17536
D)
separable_5_BN (BatchNormal (None, 28, 28, 128) 512
ization)
separable_5_relu (ReLU) (None, 28, 28, 128) 0
separable_6 (SeparableConv2 (None, 14, 14, 256) 33920
D)
separable_6_BN (BatchNormal (None, 14, 14, 256) 1024
ization)
separable_6_relu (ReLU) (None, 14, 14, 256) 0
separable_7 (SeparableConv2 (None, 14, 14, 256) 67840
D)
separable_7_BN (BatchNormal (None, 14, 14, 256) 1024
ization)
separable_7_relu (ReLU) (None, 14, 14, 256) 0
separable_8 (SeparableConv2 (None, 14, 14, 256) 67840
D)
separable_8_BN (BatchNormal (None, 14, 14, 256) 1024
ization)
separable_8_relu (ReLU) (None, 14, 14, 256) 0
separable_9 (SeparableConv2 (None, 14, 14, 256) 67840
D)
separable_9_BN (BatchNormal (None, 14, 14, 256) 1024
ization)
separable_9_relu (ReLU) (None, 14, 14, 256) 0
separable_10 (SeparableConv (None, 14, 14, 256) 67840
2D)
separable_10_BN (BatchNorma (None, 14, 14, 256) 1024
lization)
separable_10_relu (ReLU) (None, 14, 14, 256) 0
separable_11 (SeparableConv (None, 14, 14, 256) 67840
2D)
separable_11_BN (BatchNorma (None, 14, 14, 256) 1024
lization)
separable_11_relu (ReLU) (None, 14, 14, 256) 0
separable_12 (SeparableConv (None, 7, 7, 512) 133376
2D)
separable_12_BN (BatchNorma (None, 7, 7, 512) 2048
lization)
separable_12_relu (ReLU) (None, 7, 7, 512) 0
separable_13 (SeparableConv (None, 7, 7, 512) 266752
2D)
separable_13_BN (BatchNorma (None, 7, 7, 512) 2048
lization)
separable_13_relu (ReLU) (None, 7, 7, 512) 0
1conv (SeparableConv2D) (None, 7, 7, 1024) 528896
1conv_BN (BatchNormalizatio (None, 7, 7, 1024) 4096
n)
1conv_relu (ReLU) (None, 7, 7, 1024) 0
2conv (SeparableConv2D) (None, 7, 7, 1024) 1057792
2conv_BN (BatchNormalizatio (None, 7, 7, 1024) 4096
n)
2conv_relu (ReLU) (None, 7, 7, 1024) 0
3conv (SeparableConv2D) (None, 7, 7, 1024) 1057792
3conv_BN (BatchNormalizatio (None, 7, 7, 1024) 4096
n)
3conv_relu (ReLU) (None, 7, 7, 1024) 0
detection_layer (SeparableC (None, 7, 7, 35) 45091
onv2D)
=================================================================
Total params: 3,573,715
Trainable params: 3,561,587
Non-trainable params: 12,128
_________________________________________________________________
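As a quick sanity check, the 35 output channels of detection_layer are exactly what the YOLO encoding requires:

# 5 anchor boxes x (4 box coordinates + 1 confidence score + 2 classes) = 35
assert num_anchors * (4 + 1 + classes) == 35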
The model output can be reshaped to a more natural shape of:
(grid_height, grid_width, anchors_box, 4 + 1 + num_classes)
where the “4 + 1” term represents the coordinates of the estimated bounding boxes (top left x, top left y, width and height) plus a confidence score. In other words, the output channels are grouped by anchor box, and within each group one channel provides either a coordinate, a global confidence score or a class confidence score. This process is done automatically in the decode_output function.
from tensorflow.keras import Model
from tensorflow.keras.layers import Reshape
# Define a reshape output to be added to the YOLO model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model.output)
# Build the complete model
full_model = Model(model.input, output)
full_model.output
Out:
<KerasTensor: shape=(None, 7, 7, 5, 7) dtype=float32 (created by layer 'YOLO_output')>
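For reference, the indexing convention in this reshaped output is then the following (illustrative comment, matching the description above):

# For grid cell (i, j) and anchor box b:
#   output[i, j, b, 0:4] -> box coordinates (x, y, w, h)
#   output[i, j, b, 4]   -> objectness confidence score
#   output[i, j, b, 5:]  -> per-class confidence scores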
4. Training
As the YOLO model relies on the Brainchip AkidaNet/ImageNet network, it is possible to perform transfer learning from ImageNet pretrained weights when training a YOLO model. See the PlantVillage transfer learning example for a detailed explanation of transfer learning principles.
When using transfer learning for YOLO training, we advise proceeding in several steps that include model calibration:
instantiate the yolo_base model and load AkidaNet/ImageNet pretrained float weights,
akida_models create -s yolo_akidanet_voc.h5 yolo_base --classes 2 \
    --base_weights akidanet_imagenet_224_alpha_50.h5
freeze the AkidaNet layers and perform training,
yolo_train train -d voc_preprocessed.pkl -m yolo_akidanet_voc.h5 \
    -ap voc_anchors.pkl -e 25 -fb 1conv -s yolo_akidanet_voc.h5
quantize the network, create data for calibration and calibrate,
cnn2snn quantize -m yolo_akidanet_voc.h5 -iq 8 -wq 4 -aq 4
yolo_train extract -d voc_preprocessed.pkl -ap voc_anchors.pkl -b 1024 -o voc_samples.npz \
    -m yolo_akidanet_voc_iq8_wq4_aq4.h5
cnn2snn calibrate adaround -sa voc_samples.npz -b 128 -e 500 -lr 1e-3 \
    -m yolo_akidanet_voc_iq8_wq4_aq4.h5
tune the model to recover accuracy.
yolo_train tune -d voc_preprocessed.pkl \
    -m yolo_akidanet_voc_iq8_wq4_aq4_adaround_calibrated.h5 -ap voc_anchors.pkl \
    -e 10 -s yolo_akidanet_voc_iq8_wq4_aq4.h5
Note
voc_anchors.pkl is obtained by saving the output of the generate_anchors call to a pickle file.
voc_preprocessed.pkl is obtained by saving the training data, the validation data (obtained using parse_voc_annotations) and the labels list (i.e. [“car”, “person”]) into a pickle file.
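For illustration, both files could be produced along the following lines, where train_data stands for the training split parsed with parse_voc_annotations. This is a hedged sketch: the exact object layout expected by yolo_train is not detailed here and is only assumed to match the descriptions above.

import pickle

# Assumed: train_data parsed from the VOC training split with parse_voc_annotations;
# anchors are computed on the training data (see the note in section 2)
anchors = generate_anchors(train_data, num_anchors, grid_size)
with open('voc_anchors.pkl', 'wb') as f:
    pickle.dump(anchors, f)

# Assumed layout: training data, validation data and labels in a single tuple
with open('voc_preprocessed.pkl', 'wb') as f:
    pickle.dump((train_data, val_data, labels), f)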
Although transfer learning should be the preferred way to train a YOLO model, it has been observed that for some datasets training all layers from scratch gives better results. That is the case for our YOLO WiderFace model for face detection. In such a case, the training pipeline to follow is described in the typical training scenario.
5. Performance
The model zoo also contains a helper method that creates a YOLO model for VOC, loads pretrained weights for the car and person detection task and returns the corresponding anchors. The anchors are used to interpret the model outputs.
The metric used to evaluate YOLO is the mean average precision (mAP), which is the percentage of correct predictions for a given intersection over union (IoU) ratio. Scores in this example are given for the standard IoU threshold of 0.5, meaning that a detection is considered valid if its intersection over union ratio with the matching ground truth box is above 0.5.
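The IoU of two boxes given as (x1, y1, x2, y2) corners is the area of their intersection divided by the area of their union, as in this minimal sketch:

def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)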
Note
A call to evaluate_map will preprocess the images, call Model.predict and use decode_output before computing the precision for all classes.
Reported performance for all training steps is as follows:

|            | Float   | 8/4/4 Calibrated | 8/4/4 Tuned |
|------------|---------|------------------|-------------|
| Global mAP | 38.38 % | 32.88 %          | 38.83 %     |
from timeit import default_timer as timer
from akida_models import yolo_voc_pretrained
from akida_models.detection.map_evaluation import MapEvaluation
# Load the pretrained model along with anchors
model_keras, anchors = yolo_voc_pretrained()
# Define the final reshape and build the model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model_keras.output)
model_keras = Model(model_keras.input, output)
# Create the mAP evaluator object
num_images = 100
map_evaluator = MapEvaluation(model_keras, val_data[:num_images], labels,
                              anchors)
# Compute the scores for all validation images
start = timer()
mAP, average_precisions = map_evaluator.evaluate_map()
end = timer()
for label, average_precision in average_precisions.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP))
print(f'Keras inference on {num_images} images took {end-start:.2f} s.\n')
Out:
Downloading data from http://data.brainchip.com/dataset-mirror/voc/voc_anchors.pkl
Downloading data from http://data.brainchip.com/models/yolo/yolo_akidanet_voc_iq8_wq4_aq4.h5
14344192/14328592 [==============================] - 2s 0us/step
car 0.3777
person 0.3665
mAP: 0.3721
Keras inference on 100 images took 4.35 s.
6. Conversion to Akida
6.1 Convert to Akida model
Check the model compatibility before Akida conversion.
from cnn2snn import check_model_compatibility
compat = check_model_compatibility(model_keras, False)
Out:
The Keras quantized model is not compatible for a conversion to an Akida model:
The Reshape layer YOLO_output can only be used to transform a tensor of shape (N,) to a tensor of shape (1, 1, N), and vice-versa. Receives input_shape (7, 7, 35) and output_shape (7, 7, 5, 7).
The last YOLO_output layer that was added for splitting channels into values for each box must be removed before Akida conversion.
# Rebuild a model without the last layer
compatible_model = Model(model_keras.input, model_keras.layers[-2].output)
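As an optional sanity check, the compatibility check can be re-run on the truncated model; it should no longer report the Reshape incompatibility:

# The model without the YOLO_output reshape should now be compatible
compat = check_model_compatibility(compatible_model, False)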
When converting to an Akida model, we just need to pass the Keras model and the input scaling that was used during training to cnn2snn.convert. In the YOLO preprocess_image function, images are zero centered and normalized between [-1, 1], hence the scaling values.
from cnn2snn import convert
model_akida = convert(compatible_model)
model_akida.summary()
Out:
Model Summary
________________________________________________
Input shape Output shape Sequences Layers
================================================
[224, 224, 3] [7, 7, 35] 1 18
________________________________________________
SW/conv_0-detection_layer (Software)
_________________________________________________________________
Layer (type) Output shape Kernel shape
=================================================================
conv_0 (InputConv.) [112, 112, 16] (3, 3, 3, 16)
_________________________________________________________________
conv_1 (Conv.) [112, 112, 32] (3, 3, 16, 32)
_________________________________________________________________
conv_2 (Conv.) [56, 56, 64] (3, 3, 32, 64)
_________________________________________________________________
conv_3 (Conv.) [56, 56, 64] (3, 3, 64, 64)
_________________________________________________________________
separable_4 (Sep.Conv.) [28, 28, 128] (3, 3, 64, 1)
_________________________________________________________________
(1, 1, 64, 128)
_________________________________________________________________
separable_5 (Sep.Conv.) [28, 28, 128] (3, 3, 128, 1)
_________________________________________________________________
(1, 1, 128, 128)
_________________________________________________________________
separable_6 (Sep.Conv.) [14, 14, 256] (3, 3, 128, 1)
_________________________________________________________________
(1, 1, 128, 256)
_________________________________________________________________
separable_7 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
_________________________________________________________________
(1, 1, 256, 256)
_________________________________________________________________
separable_8 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
_________________________________________________________________
(1, 1, 256, 256)
_________________________________________________________________
separable_9 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
_________________________________________________________________
(1, 1, 256, 256)
_________________________________________________________________
separable_10 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
_________________________________________________________________
(1, 1, 256, 256)
_________________________________________________________________
separable_11 (Sep.Conv.) [14, 14, 256] (3, 3, 256, 1)
_________________________________________________________________
(1, 1, 256, 256)
_________________________________________________________________
separable_12 (Sep.Conv.) [7, 7, 512] (3, 3, 256, 1)
_________________________________________________________________
(1, 1, 256, 512)
_________________________________________________________________
separable_13 (Sep.Conv.) [7, 7, 512] (3, 3, 512, 1)
_________________________________________________________________
(1, 1, 512, 512)
_________________________________________________________________
1conv (Sep.Conv.) [7, 7, 1024] (3, 3, 512, 1)
_________________________________________________________________
(1, 1, 512, 1024)
_________________________________________________________________
2conv (Sep.Conv.) [7, 7, 1024] (3, 3, 1024, 1)
_________________________________________________________________
(1, 1, 1024, 1024)
_________________________________________________________________
3conv (Sep.Conv.) [7, 7, 1024] (3, 3, 1024, 1)
_________________________________________________________________
(1, 1, 1024, 1024)
_________________________________________________________________
detection_layer (Sep.Conv.) [7, 7, 35] (3, 3, 1024, 1)
_________________________________________________________________
(1, 1, 1024, 35)
_________________________________________________________________
6.2 Check performance
Akida model accuracy is tested on the first n images of the validation set.
The table below summarizes the expected results:
| #Images | Keras mAP | Akida mAP |
|---------|-----------|-----------|
| 100     | 38.80 %   | 34.26 %   |
| 1000    | 40.11 %   | 39.35 %   |
| 2500    | 38.83 %   | 38.85 %   |
# Create the mAP evaluator object
map_evaluator_ak = MapEvaluation(model_akida,
                                 val_data[:num_images],
                                 labels,
                                 anchors,
                                 is_keras_model=False)
# Compute the scores for all validation images
start = timer()
mAP_ak, average_precisions_ak = map_evaluator_ak.evaluate_map()
end = timer()
for label, average_precision in average_precisions_ak.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP_ak))
print(f'Akida inference on {num_images} images took {end-start:.2f} s.\n')
Out:
car 0.3789
person 0.3336
mAP: 0.3562
Akida inference on 100 images took 14.53 s.
6.3 Show predictions for a random image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from akida_models.detection.processing import load_image, preprocess_image, decode_output
# Take a random test image
i = np.random.randint(len(val_data))
input_shape = model_akida.layers[0].input_dims
# Load the image
raw_image = load_image(val_data[i]['image_path'])
# Keep the original image size for later bounding boxes rescaling
raw_height, raw_width, _ = raw_image.shape
# Pre-process the image
image = preprocess_image(raw_image, input_shape)
input_image = image[np.newaxis, :].astype(np.uint8)
# Call predict on the image
pots = model_akida.predict(input_image)[0]
# Reshape the potentials to prepare for decoding
h, w, c = pots.shape
pots = pots.reshape((h, w, len(anchors), 4 + 1 + len(labels)))
# Decode potentials into bounding boxes
raw_boxes = decode_output(pots, anchors, len(labels))
# Rescale boxes to the original image size
pred_boxes = np.array([[
    box.x1 * raw_width, box.y1 * raw_height, box.x2 * raw_width,
    box.y2 * raw_height,
    box.get_label(),
    box.get_score()
] for box in raw_boxes])
fig = plt.figure(num='VOC2007 car and person detection by Akida runtime')
ax = fig.subplots(1)
img_plot = ax.imshow(np.zeros(raw_image.shape, dtype=np.uint8))
img_plot.set_data(raw_image)
for box in pred_boxes:
    rect = patches.Rectangle((box[0], box[1]),
                             box[2] - box[0],
                             box[3] - box[1],
                             linewidth=1,
                             edgecolor='r',
                             facecolor='none')
    ax.add_patch(rect)
    class_score = ax.text(box[0],
                          box[1] - 5,
                          f"{labels[int(box[4])]} - {box[5]:.2f}",
                          color='red')
plt.axis('off')
plt.show()

Total running time of the script: ( 0 minutes 57.452 seconds)