YOLO/PASCAL-VOC detection tutorial

This tutorial demonstrates that Akida can perform object detection using a state-of-the-art model architecture. This is illustrated using a subset of the PASCAL-VOC 2007 dataset restricted to the “car” and “person” classes. The YOLOv2 architecture from Redmon et al. (2016) has been chosen to tackle this object detection problem.

1. Introduction

1.1 Object detection

Object detection is a computer vision task that combines two elemental tasks:

  • object classification, which consists in assigning a class label to an image, as shown in the AkidaNet/ImageNet inference example

  • object localization, which consists in drawing a bounding box around one or several objects in an image

One can learn more about the subject by reading this introduction to object detection blog article.

1.2 YOLO key concepts

You Only Look Once (YOLO) is a deep neural network architecture dedicated to object detection.

As opposed to classic networks that handle object detection, YOLO predicts bounding boxes (localization task) and class probabilities (classification task) from a single neural network in a single evaluation. The object detection task is reduced to a regression problem to spatially separated boxes and associated class probabilities.
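To make the "single evaluation" idea concrete, the sketch below computes the shape of the output tensor that a YOLOv2-style head would produce in one forward pass. The grid size (7×7) and the number of anchors (5) are illustrative assumptions, not values taken from this tutorial; only the two classes come from the text above. The function name is hypothetical.

```python
def yolo_output_shape(grid_size, num_anchors, num_classes):
    """Shape of a YOLOv2-style output: one prediction vector per cell and anchor."""
    cols, rows = grid_size
    # Per anchor: 4 box coordinates + 1 objectness score + one score per class
    values_per_anchor = 4 + 1 + num_classes
    return rows, cols, num_anchors, values_per_anchor


# Assumed 7x7 grid and 5 anchors; 2 classes ("car" and "person")
shape = yolo_output_shape(grid_size=(7, 7), num_anchors=5, num_classes=2)
channels = shape[2] * shape[3]  # flattened channel count of the last layer
print(shape, channels)
```

Under these assumptions, every box and class score for the whole image is read from this one tensor, which is what distinguishes the approach from multi-stage detectors.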

The base concept of YOLO is to divide an input image into regions, forming a grid, and to predict bounding boxes and probabilities for each region. The bounding boxes are weighted by the predicted probabilities.
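As a rough illustration of the grid idea, the following sketch finds which grid cell contains the centre of a ground truth box; in the original YOLO formulation, that cell is the one responsible for predicting the object. The image size, grid size and box coordinates are made-up values, and the helper name is hypothetical rather than part of akida_models.

```python
def grid_cell_for_box(box, image_size, grid_size):
    """Return the (row, col) of the grid cell that contains the box centre.

    box: (x_min, y_min, x_max, y_max) in pixels
    image_size: (width, height) in pixels
    grid_size: (columns, rows)
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    cols, rows = grid_size
    # Box centre in pixels
    center_x = (x_min + x_max) / 2
    center_y = (y_min + y_max) / 2
    # Convert to grid units, clamping so that a centre on the far edge stays in range
    col = min(int(center_x * cols / width), cols - 1)
    row = min(int(center_y * rows / height), rows - 1)
    return row, col


# Hypothetical 224x224 image divided into a 7x7 grid
cell = grid_cell_for_box((48, 64, 112, 160), image_size=(224, 224), grid_size=(7, 7))
print(cell)
```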

YOLO also uses the concept of “anchor boxes” or “prior boxes”. The network does not predict the bounding boxes directly but offsets from anchor boxes, which are templates (width/height ratio) computed by clustering the dimensions of the ground truth boxes from the training dataset. The anchors thus represent the typical shapes and sizes of the objects to detect. More details on the anchor boxes concept are given in this blog article.
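The sketch below shows how predicted offsets could be turned back into a box, following the parameterization described in the YOLOv2 paper: the centre is a sigmoid offset inside the grid cell, and the width and height scale the anchor exponentially. It assumes anchors are expressed in grid-cell units; the anchor values and function names are illustrative, and the postprocessing shipped in akida_models may differ in its details.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(offsets, cell, anchor, grid_size):
    """Decode raw offsets (tx, ty, tw, th) into a box normalised to [0, 1].

    cell: (row, col) of the grid cell that made the prediction
    anchor: (width, height) of the anchor, assumed to be in grid-cell units
    grid_size: (columns, rows)
    Returns (center_x, center_y, width, height) relative to the image size.
    """
    tx, ty, tw, th = offsets
    row, col = cell
    anchor_w, anchor_h = anchor
    cols, rows = grid_size
    # The sigmoid keeps the predicted centre inside the responsible cell
    center_x = (col + sigmoid(tx)) / cols
    center_y = (row + sigmoid(ty)) / rows
    # The exponential rescales the anchor shape; zero offsets return the anchor itself
    width = anchor_w * math.exp(tw) / cols
    height = anchor_h * math.exp(th) / rows
    return center_x, center_y, width, height


# Zero offsets: the box sits at the cell centre and has exactly the anchor's shape
box = decode_box((0.0, 0.0, 0.0, 0.0), cell=(3, 2), anchor=(1.5, 3.0), grid_size=(7, 7))
print(box)
```

Predicting offsets rather than absolute coordinates means the network only has to learn small corrections to a plausible prior shape.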

Additional information about YOLO can be found on the Darknet website. The source code for the preprocessing and postprocessing functions included in the akida_models package (see the processing section in the model zoo) is largely inspired by the experiencor GitHub repository.

2. Preprocessing tools

As this example focuses on car and person detection only, a subset of VOC has been prepared with test images from VOC2007 that contain at least one occurrence of either class. Just like the VOC dataset, the subset contains an image folder, an annotation folder and a text file listing the file names of interest.
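For readers unfamiliar with the VOC annotation format, here is a minimal sketch of how one annotation file could be read and filtered down to the classes of interest, using only the standard library. The XML content is a made-up sample in the VOC layout, and parse_annotation is a hypothetical helper; it is not the implementation of parse_voc_annotations from akida_models.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the PASCAL-VOC XML layout (illustrative values only)
SAMPLE_ANNOTATION = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotation(xml_text, labels):
    """Extract the file name and the boxes whose class is listed in `labels`."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in labels:
            continue  # ignore classes outside the subset
        bndbox = obj.find("bndbox")
        boxes.append({
            "label": name,
            "xmin": int(bndbox.findtext("xmin")),
            "ymin": int(bndbox.findtext("ymin")),
            "xmax": int(bndbox.findtext("xmax")),
            "ymax": int(bndbox.findtext("ymax")),
        })
    return {"filename": root.findtext("filename"), "boxes": boxes}


annotation = parse_annotation(SAMPLE_ANNOTATION, labels=["car", "person"])
print(annotation)
```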

The YOLO toolkit offers several methods to prepare data for processing; see load_image, preprocess_image and parse_voc_annotations.

import os

from tensorflow.keras.utils import get_file
from akida_models.detection.processing import parse_voc_annotations

# Download validation set from Brainchip data server
data_path = get_file(
    "voc_test_car_person.tar.gz",
    "http://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tar.gz",
    cache_subdir='datasets/voc',
    extract=True)

data_dir = os.path.dirname(data_path)
gt_folder = os.path.join(data_dir, 'voc_test_car_person', 'Annotations')
image_folder = os.path.join(data_dir, 'voc_test_car_person', 'JPEGImages')
file_path = os.path.join(
    data_dir, 'voc_test_car_person', 'test_car_person.txt')
labels = ['car', 'person']

val_data = parse_voc_annotations(gt_folder, image_folder, file_path, labels)
print("Loaded VOC2007 test data for car and person classes: "
      f"{len(val_data)} images.")
Downloading data from http://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tar.gz

221462528/221551911 [============================>.] - ETA: 0s
221511680/221551911 [============================>.] - ETA: 0s
221551911/221551911 [==============================] - 25s 0us/step
Loaded VOC2007 test data for car and person classes: 2500 images.

Anchors can also be computed easily using the YOLO toolkit.

Note

The following code is given as an example. In a real use case scenario, anchors are computed on the training dataset.

from akida_models.detection.generate_anchors import generate_anchors

num_anchors = 5
grid_size = (7, 7)
anchors_example = generate_anchors(val_data, num_anchors, grid_size)
Average IOU for 5 anchors: 0.62
Anchors:  [[0.54732, 1.13831], [1.40199, 1.97749], [1.49667, 4.1528], [2.99261, 4.73019], [5.53286, 5.46107]]
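
The clustering idea behind anchor generation can be sketched in a few lines. The example below is a minimal pure-Python illustration of k-means over box (width, height) pairs using 1 - IoU as the distance, as popularized by YOLOv2. It is not the akida_models implementation: the names `wh_iou` and `kmeans_anchors` and the toy box sizes are hypothetical.

```python
import random


def wh_iou(box, centroid):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, num_anchors, iterations=100, seed=0):
    """Cluster (width, height) pairs using 1 - IoU as the distance."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, num_anchors)
    for _ in range(iterations):
        # Assign each box to the centroid with the highest IoU
        clusters = [[] for _ in centroids]
        for box in boxes:
            best = max(range(len(centroids)),
                       key=lambda k: wh_iou(box, centroids[k]))
            clusters[best].append(box)
        # Move each centroid to the mean shape of its cluster
        new_centroids = []
        for centroid, members in zip(centroids, clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroid)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)


# Toy box sizes expressed in grid-cell units (hypothetical values)
toy_boxes = [(0.5, 1.0), (0.6, 1.2), (0.55, 1.1),
             (3.0, 4.5), (3.2, 4.8), (2.9, 4.6)]
toy_anchors = kmeans_anchors(toy_boxes, num_anchors=2)

# Average IoU between each box and its closest anchor
avg_iou = sum(max(wh_iou(b, a) for a in toy_anchors)
              for b in toy_boxes) / len(toy_boxes)
print(toy_anchors, round(avg_iou, 2))
```

The average IoU printed at the end plays the same role as the "Average IOU" reported above: the closer it is to 1, the better the anchors match the box shapes in the dataset.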

3. Model architecture

The model zoo contains a YOLO model built upon the AkidaNet architecture, with 3 separable convolutional layers at the top for bounding box and class estimation, followed by a final separable convolutional layer which is the detection layer. Note that for efficiency, the alpha parameter in AkidaNet (network width or number of filters in each layer) is set to 0.5.

from akida_models import yolo_base

# Create a yolo model for 2 classes with 5 anchors and grid size of 7
classes = 2

model = yolo_base(input_shape=(224, 224, 3),
                  classes=classes,
                  nb_box=num_anchors,
                  alpha=0.5)
model.summary()
Model: "yolo_base"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input (InputLayer)          [(None, 224, 224, 3)]     0

 rescaling (Rescaling)       (None, 224, 224, 3)       0

 conv_0 (Conv2D)             (None, 112, 112, 16)      432

 conv_0/BN (BatchNormalizati  (None, 112, 112, 16)     64
 on)

 conv_0/relu (ReLU)          (None, 112, 112, 16)      0

 conv_1 (Conv2D)             (None, 112, 112, 32)      4608

 conv_1/BN (BatchNormalizati  (None, 112, 112, 32)     128
 on)

 conv_1/relu (ReLU)          (None, 112, 112, 32)      0

 conv_2 (Conv2D)             (None, 56, 56, 64)        18432

 conv_2/BN (BatchNormalizati  (None, 56, 56, 64)       256
 on)

 conv_2/relu (ReLU)          (None, 56, 56, 64)        0

 conv_3 (Conv2D)             (None, 56, 56, 64)        36864

 conv_3/BN (BatchNormalizati  (None, 56, 56, 64)       256
 on)

 conv_3/relu (ReLU)          (None, 56, 56, 64)        0

 separable_4 (SeparableConv2  (None, 28, 28, 128)      8768
 D)

 separable_4/BN (BatchNormal  (None, 28, 28, 128)      512
 ization)

 separable_4/relu (ReLU)     (None, 28, 28, 128)       0

 separable_5 (SeparableConv2  (None, 28, 28, 128)      17536
 D)

 separable_5/BN (BatchNormal  (None, 28, 28, 128)      512
 ization)

 separable_5/relu (ReLU)     (None, 28, 28, 128)       0

 separable_6 (SeparableConv2  (None, 14, 14, 256)      33920
 D)

 separable_6/BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_6/relu (ReLU)     (None, 14, 14, 256)       0

 separable_7 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_7/BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_7/relu (ReLU)     (None, 14, 14, 256)       0

 separable_8 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_8/BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_8/relu (ReLU)     (None, 14, 14, 256)       0

 separable_9 (SeparableConv2  (None, 14, 14, 256)      67840
 D)

 separable_9/BN (BatchNormal  (None, 14, 14, 256)      1024
 ization)

 separable_9/relu (ReLU)     (None, 14, 14, 256)       0

 separable_10 (SeparableConv  (None, 14, 14, 256)      67840
 2D)

 separable_10/BN (BatchNorma  (None, 14, 14, 256)      1024
 lization)

 separable_10/relu (ReLU)    (None, 14, 14, 256)       0

 separable_11 (SeparableConv  (None, 14, 14, 256)      67840
 2D)

 separable_11/BN (BatchNorma  (None, 14, 14, 256)      1024
 lization)

 separable_11/relu (ReLU)    (None, 14, 14, 256)       0

 separable_12 (SeparableConv  (None, 7, 7, 512)        133376
 2D)

 separable_12/BN (BatchNorma  (None, 7, 7, 512)        2048
 lization)

 separable_12/relu (ReLU)    (None, 7, 7, 512)         0

 separable_13 (SeparableConv  (None, 7, 7, 512)        266752
 2D)

 separable_13/BN (BatchNorma  (None, 7, 7, 512)        2048
 lization)

 separable_13/relu (ReLU)    (None, 7, 7, 512)         0

 1conv (SeparableConv2D)     (None, 7, 7, 1024)        528896

 1conv/BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 1conv/relu (ReLU)           (None, 7, 7, 1024)        0

 2conv (SeparableConv2D)     (None, 7, 7, 1024)        1057792

 2conv/BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 2conv/relu (ReLU)           (None, 7, 7, 1024)        0

 3conv (SeparableConv2D)     (None, 7, 7, 1024)        1057792

 3conv/BN (BatchNormalizatio  (None, 7, 7, 1024)       4096
 n)

 3conv/relu (ReLU)           (None, 7, 7, 1024)        0

 detection_layer (SeparableC  (None, 7, 7, 35)         45091
 onv2D)

=================================================================
Total params: 3,573,715
Trainable params: 3,561,587
Non-trainable params: 12,128
_________________________________________________________________

The model output can be reshaped to a more natural shape of:

(grid_height, grid_width, anchors_box, 4 + 1 + num_classes)

where the “4 + 1” term represents the coordinates of the estimated bounding boxes (top left x, top left y, width and height) and a confidence score. In other words, the output channels are actually grouped by anchor boxes, and in each group one channel provides either a coordinate, a global confidence score or a class confidence score. This process is done automatically in the decode_output function.

from tensorflow.keras import Model
from tensorflow.keras.layers import Reshape

# Define a reshape output to be added to the YOLO model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model.output)

# Build the complete model
full_model = Model(model.input, output)
full_model.output
<KerasTensor: shape=(None, 7, 7, 5, 7) dtype=float32 (created by layer 'YOLO_output')>
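
To make the channel grouping concrete, the short sketch below maps a flat channel index of the detection layer to an anchor index and a field. It relies only on the row-major behaviour of a reshape from 35 channels to (5, 7). The order of the fields inside a group (coordinates, then confidence, then class scores) and the `field_names` labels are assumptions made for illustration.

```python
# Values taken from the tutorial: 5 anchors, 2 classes
num_anchors = 5
num_classes = 2
fields_per_anchor = 4 + 1 + num_classes  # coordinates, confidence, class scores

# Assumed field order inside each anchor group (hypothetical labels)
field_names = ["x", "y", "w", "h", "confidence", "class_0", "class_1"]


def describe_channel(channel):
    """Map a flat output channel index to (anchor index, field name)."""
    anchor, field = divmod(channel, fields_per_anchor)
    return anchor, field_names[field]


total_channels = num_anchors * fields_per_anchor
print(total_channels)         # 35, the depth of detection_layer
print(describe_channel(0))    # (0, 'x')
print(describe_channel(11))   # (1, 'confidence')
print(describe_channel(34))   # (4, 'class_1')
```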

4. Training

As the YOLO model relies on the BrainChip AkidaNet/ImageNet network, it is possible to perform transfer learning from ImageNet pretrained weights when training a YOLO model. See the PlantVillage transfer learning example for a detailed explanation of transfer learning principles.

When using transfer learning for YOLO training, we advise proceeding in several steps that include model calibration:

  • instantiate the yolo_base model and load AkidaNet/ImageNet pretrained float weights,

akida_models create -s yolo_akidanet_voc.h5 yolo_base --classes 2 \
         --base_weights akidanet_imagenet_224_alpha_50.h5

  • freeze the AkidaNet layers and perform training,

yolo_train train -d voc_preprocessed.pkl -m yolo_akidanet_voc.h5 \
    -ap voc_anchors.pkl -e 25 -fb 1conv -s yolo_akidanet_voc.h5

  • quantize the network, create data for calibration and calibrate,

cnn2snn quantize -m yolo_akidanet_voc.h5 -iq 8 -wq 4 -aq 4
yolo_train extract -d voc_preprocessed.pkl -ap voc_anchors.pkl -b 1024 -o voc_samples.npz \
    -m yolo_akidanet_voc_iq8_wq4_aq4.h5
cnn2snn calibrate adaround -sa voc_samples.npz -b 128 -e 500 -lr 1e-3 \
    -m yolo_akidanet_voc_iq8_wq4_aq4.h5

  • tune the model to recover accuracy.

yolo_train tune -d voc_preprocessed.pkl \
    -m yolo_akidanet_voc_iq8_wq4_aq4_adaround_calibrated.h5 -ap voc_anchors.pkl \
    -e 10 -s yolo_akidanet_voc_iq8_wq4_aq4.h5

Note

  • voc_anchors.pkl is obtained by saving the output of the generate_anchors call to a pickle file,

  • voc_preprocessed.pkl is obtained by saving the training data, the validation data (obtained using parse_voc_annotations) and the labels list (i.e. [“car”, “person”]) into a pickle file.
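
As a rough sketch of what writing these two files could look like, the example below uses the standard pickle module with placeholder values. The layout of voc_preprocessed.pkl (a single tuple of training data, validation data and labels) and the "boxes" key are assumptions made for illustration; check what yolo_train actually expects before relying on them.

```python
import os
import pickle
import tempfile

# Placeholder values standing in for the real objects (hypothetical):
# in practice the anchors come from generate_anchors and the data from
# parse_voc_annotations.
anchors = [[0.5, 1.1], [1.4, 2.0]]
train_data = [{"image_path": "train_0.jpg", "boxes": []}]
valid_data = [{"image_path": "val_0.jpg", "boxes": []}]
labels = ["car", "person"]

out_dir = tempfile.mkdtemp()
anchors_path = os.path.join(out_dir, "voc_anchors.pkl")
data_path = os.path.join(out_dir, "voc_preprocessed.pkl")

with open(anchors_path, "wb") as f:
    pickle.dump(anchors, f)

# Assumed layout: a single pickled (train, validation, labels) tuple
with open(data_path, "wb") as f:
    pickle.dump((train_data, valid_data, labels), f)

# Read the files back to check the round trip
with open(anchors_path, "rb") as f:
    loaded_anchors = pickle.load(f)
with open(data_path, "rb") as f:
    loaded_train, loaded_valid, loaded_labels = pickle.load(f)
```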

Although transfer learning should be the preferred way to train a YOLO model, it has been observed that for some datasets training all layers from scratch gives better results. This is the case for our YOLO WiderFace model, which detects faces. In such a case, follow the training pipeline described in the typical training scenario.

5. Performance

The model zoo also contains a helper method that creates a YOLO model for VOC and loads pretrained weights for the car and person detection task, along with the corresponding anchors. The anchors are used to interpret the model outputs.

The metric used to evaluate YOLO is the mean average precision (mAP), which is the average precision (AP) of each class averaged over all classes and is given for an intersection over union (IoU) threshold. Scores in this example are given for the standard IoU threshold of 0.5, meaning that a detection is considered valid if its intersection over union ratio with the matching ground truth box is above 0.5.
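
The IoU criterion can be illustrated with a small self-contained function. This is a minimal sketch for boxes given as (x1, y1, x2, y2) corner coordinates; the function name `iou` and the example boxes are hypothetical and unrelated to the akida_models code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 2.0, 2.0)
shifted = (1.0, 0.0, 3.0, 2.0)     # overlaps half of the ground truth
tight = (0.0, 0.0, 2.0, 1.5)       # covers three quarters of it

print(iou(ground_truth, shifted))  # 1/3: below 0.5, not a valid detection
print(iou(ground_truth, tight))    # 0.75: above 0.5, a valid detection
```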

Note

A call to evaluate_map will preprocess the images, make the call to Model.predict and use decode_output before computing precision for all classes.
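
To show how per-class precision turns into a single mAP figure, here is a minimal sketch of average precision computed as the area under an interpolated precision-recall curve. It assumes each detection has already been marked as a true or false positive at the chosen IoU threshold. The interpolation scheme is an assumption: evaluate_map may use a different convention, and the function name and detections below are hypothetical.

```python
def average_precision(detections, num_ground_truths):
    """All-point interpolated average precision for one class.

    detections: list of (score, is_true_positive) pairs.
    num_ground_truths: number of ground truth boxes for that class.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    true_pos = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_pos += int(is_tp)
        precisions.append(true_pos / rank)
        recalls.append(true_pos / num_ground_truths)
    # Make precision non-increasing as recall grows
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the precision-recall curve
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


# Hypothetical detections: (confidence score, matched a ground truth box)
car = [(0.9, True), (0.8, False), (0.7, True)]
person = [(0.95, True), (0.6, True)]
ap_per_class = {
    "car": average_precision(car, num_ground_truths=2),
    "person": average_precision(person, num_ground_truths=2),
}
mean_ap = sum(ap_per_class.values()) / len(ap_per_class)
print(ap_per_class, mean_ap)
```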

Reported performance for all training steps is as follows:

             Float     8/4/4 Calibrated   8/4/4 Tuned
Global mAP   38.38 %   32.88 %            38.83 %

from timeit import default_timer as timer
from akida_models import yolo_voc_pretrained
from akida_models.detection.map_evaluation import MapEvaluation

# Load the pretrained model along with anchors
model_keras, anchors = yolo_voc_pretrained()

# Define the final reshape and build the model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model_keras.output)
model_keras = Model(model_keras.input, output)

# Create the mAP evaluator object
num_images = 100

map_evaluator = MapEvaluation(model_keras, val_data[:num_images], labels,
                              anchors)

# Compute the scores for the selected validation images
start = timer()
mAP, average_precisions = map_evaluator.evaluate_map()
end = timer()

for label, average_precision in average_precisions.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP))
print(f'Keras inference on {num_images} images took {end-start:.2f} s.\n')
Downloading data from http://data.brainchip.com/dataset-mirror/voc/voc_anchors.pkl.

126/126 [==============================] - 0s 2us/step
Downloading data from http://data.brainchip.com/models/yolo/yolo_akidanet_voc_iq8_wq4_aq4.h5.

14327256/14327256 [==============================] - 1s 0us/step

car 0.3777
person 0.3665
mAP: 0.3721
Keras inference on 100 images took 4.99 s.

6. Conversion to Akida

6.1 Convert to Akida model

Check the model compatibility before Akida conversion.

from cnn2snn import check_model_compatibility

compat = check_model_compatibility(model_keras, False)
The Keras quantized model is not compatible for a conversion to an Akida model:
 The Reshape layer YOLO_output can only be used to transform a tensor of shape (N,) to a tensor of shape (1, 1, N), and vice-versa. Receives input_shape (7, 7, 35) and output_shape (7, 7, 5, 7).

The final YOLO_output layer, which was added to split the channels into values for each box, must be removed before Akida conversion.

# Rebuild a model without the last layer
compatible_model = Model(model_keras.input, model_keras.layers[-2].output)

When converting to an Akida model, we just need to pass the Keras model and the input scaling that was used during training to cnn2snn.convert. In the YOLO preprocess_image function, images are zero-centered and normalized to [-1, 1], hence the scaling values.

from cnn2snn import convert

model_akida = convert(compatible_model)
model_akida.summary()
                 Model Summary
________________________________________________
Input shape    Output shape  Sequences  Layers
================================================
[224, 224, 3]  [7, 7, 35]    1          18
________________________________________________

_________________________________________________________________
Layer (type)                 Output shape    Kernel shape

============== SW/conv_0-detection_layer (Software) =============

conv_0 (InputConv.)          [112, 112, 16]  (3, 3, 3, 16)
_________________________________________________________________
conv_1 (Conv.)               [112, 112, 32]  (3, 3, 16, 32)
_________________________________________________________________
conv_2 (Conv.)               [56, 56, 64]    (3, 3, 32, 64)
_________________________________________________________________
conv_3 (Conv.)               [56, 56, 64]    (3, 3, 64, 64)
_________________________________________________________________
separable_4 (Sep.Conv.)      [28, 28, 128]   (3, 3, 64, 1)
_________________________________________________________________
                                             (1, 1, 64, 128)
_________________________________________________________________
separable_5 (Sep.Conv.)      [28, 28, 128]   (3, 3, 128, 1)
_________________________________________________________________
                                             (1, 1, 128, 128)
_________________________________________________________________
separable_6 (Sep.Conv.)      [14, 14, 256]   (3, 3, 128, 1)
_________________________________________________________________
                                             (1, 1, 128, 256)
_________________________________________________________________
separable_7 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_8 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_9 (Sep.Conv.)      [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_10 (Sep.Conv.)     [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_11 (Sep.Conv.)     [14, 14, 256]   (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 256)
_________________________________________________________________
separable_12 (Sep.Conv.)     [7, 7, 512]     (3, 3, 256, 1)
_________________________________________________________________
                                             (1, 1, 256, 512)
_________________________________________________________________
separable_13 (Sep.Conv.)     [7, 7, 512]     (3, 3, 512, 1)
_________________________________________________________________
                                             (1, 1, 512, 512)
_________________________________________________________________
1conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 512, 1)
_________________________________________________________________
                                             (1, 1, 512, 1024)
_________________________________________________________________
2conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 1024)
_________________________________________________________________
3conv (Sep.Conv.)            [7, 7, 1024]    (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 1024)
_________________________________________________________________
detection_layer (Sep.Conv.)  [7, 7, 35]      (3, 3, 1024, 1)
_________________________________________________________________
                                             (1, 1, 1024, 35)
_________________________________________________________________

6.2 Check performance

Akida model accuracy is tested on the first n images of the validation set.

The table below summarizes the expected results:

#Images   Keras mAP   Akida mAP
100       38.80 %     34.26 %
1000      40.11 %     39.35 %
2500      38.83 %     38.85 %

# Create the mAP evaluator object
map_evaluator_ak = MapEvaluation(model_akida,
                                 val_data[:num_images],
                                 labels,
                                 anchors,
                                 is_keras_model=False)

# Compute the scores for the selected validation images
start = timer()
mAP_ak, average_precisions_ak = map_evaluator_ak.evaluate_map()
end = timer()

for label, average_precision in average_precisions_ak.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP_ak))
print(f'Akida inference on {num_images} images took {end-start:.2f} s.\n')
car 0.3789
person 0.3336
mAP: 0.3562
Akida inference on 100 images took 14.85 s.

6.3 Show predictions for a random image

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches

from akida_models.detection.processing import load_image, preprocess_image, decode_output

# Take a random test image
i = np.random.randint(len(val_data))

input_shape = model_akida.layers[0].input_dims

# Load the image
raw_image = load_image(val_data[i]['image_path'])

# Keep the original image size for later bounding boxes rescaling
raw_height, raw_width, _ = raw_image.shape

# Pre-process the image
image = preprocess_image(raw_image, input_shape)
input_image = image[np.newaxis, :].astype(np.uint8)

# Run inference on the image
pots = model_akida.predict(input_image)[0]

# Reshape the potentials to prepare for decoding
h, w, c = pots.shape
pots = pots.reshape((h, w, len(anchors), 4 + 1 + len(labels)))

# Decode potentials into bounding boxes
raw_boxes = decode_output(pots, anchors, len(labels))

# Rescale boxes to the original image size
pred_boxes = np.array([[
    box.x1 * raw_width, box.y1 * raw_height, box.x2 * raw_width,
    box.y2 * raw_height,
    box.get_label(),
    box.get_score()
] for box in raw_boxes])

fig = plt.figure(num='VOC2007 car and person detection by Akida runtime')
ax = fig.subplots(1)
img_plot = ax.imshow(np.zeros(raw_image.shape, dtype=np.uint8))
img_plot.set_data(raw_image)

for box in pred_boxes:
    rect = patches.Rectangle((box[0], box[1]),
                             box[2] - box[0],
                             box[3] - box[1],
                             linewidth=1,
                             edgecolor='r',
                             facecolor='none')
    ax.add_patch(rect)
    class_score = ax.text(box[0],
                          box[1] - 5,
                          f"{labels[int(box[4])]} - {box[5]:.2f}",
                          color='red')

plt.axis('off')
plt.show()
[Figure: VOC YOLO detection plot]
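
For readers curious about what turning potentials into boxes involves, the sketch below decodes a single prediction following the parameterization of the YOLOv2 paper, where the network predicts offsets relative to a grid cell and an anchor. It is an illustration only: the coordinate convention and the post-processing (score thresholding, non-maximum suppression) of decode_output are not shown here and may differ. The function names and raw values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def decode_cell(raw, row, col, anchor, grid_size):
    """Decode one raw prediction (tx, ty, tw, th, to, class scores...).

    Returns a box as (x1, y1, x2, y2) relative to the image size,
    the predicted class index and its score.
    """
    grid_h, grid_w = grid_size
    tx, ty, tw, th, to = raw[:5]
    # Box centre: offset inside the cell, normalized by the grid size
    cx = (col + sigmoid(tx)) / grid_w
    cy = (row + sigmoid(ty)) / grid_h
    # Box size: the anchor (in grid-cell units) scaled by exp(prediction)
    w = anchor[0] * math.exp(tw) / grid_w
    h = anchor[1] * math.exp(th) / grid_h
    # Class score: objectness times the best class probability
    class_probs = softmax(raw[5:])
    label = max(range(len(class_probs)), key=lambda k: class_probs[k])
    score = sigmoid(to) * class_probs[label]
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2), label, score


# Hypothetical raw values for the centre cell of a 7x7 grid
box, label, score = decode_cell(raw=[0.0, 0.0, 0.0, 0.0, 2.0, 3.0, 0.0],
                                row=3, col=3, anchor=(1.4, 2.0),
                                grid_size=(7, 7))
print(box, label, round(score, 3))
```

Multiplying the relative coordinates by the original image width and height, as done in the code above, gives the box in pixels.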
