YOLO/PASCAL-VOC detection tutorial
This tutorial demonstrates that Akida can perform object detection. This is illustrated using a subset of the PASCAL-VOC 2007 dataset with “car” and “person” classes only. The YOLOv2 architecture from Redmon et al. (2016) has been chosen to tackle this object detection problem.
1. Introduction
1.1 Object detection
Object detection is a computer vision task that combines two elementary tasks:
object classification, which consists of assigning a class label to an image, as shown in the AkidaNet/ImageNet inference example
object localization, which consists of drawing a bounding box around one or several objects in an image
One can learn more about the subject by reading this introduction to object detection blog article.
1.2 YOLO key concepts
You Only Look Once (YOLO) is a deep neural network architecture dedicated to object detection.
As opposed to classic networks that handle object detection, YOLO predicts bounding boxes (localization task) and class probabilities (classification task) with a single neural network in a single evaluation. The object detection task is thus reduced to a regression problem: predicting spatially separated boxes and their associated class probabilities.
YOLO's base concept is to divide an input image into regions, forming a grid, and to predict bounding boxes and probabilities for each region. The bounding boxes are weighted by the prediction probabilities.
YOLO also uses the concept of “anchor boxes” or “prior boxes”. The network does not predict the bounding boxes directly but rather offsets from anchor boxes, which are templates (width/height ratios) computed by clustering the dimensions of the ground truth boxes from the training dataset. The anchors thus represent the average shapes and sizes of the objects to detect. More details on the anchor boxes concept are given in this blog article.
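To give a concrete idea of how such templates are obtained, here is a minimal sketch of the clustering step, assuming an array of ground truth box sizes already expressed in grid-cell units (the kmeans_anchors helper is written for this illustration only; the akida_models toolkit provides a ready-made generate_anchors function, used in section 2 below):

import numpy as np

def kmeans_anchors(wh, k, n_iter=100):
    # wh: (N, 2) array of ground truth (width, height) in grid-cell units
    centroids = wh[np.random.choice(len(wh), k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each box to its nearest centroid (plain Euclidean distance
        # here; the YOLOv2 paper clusters with a 1 - IoU distance instead)
        dists = np.linalg.norm(wh[:, None] - centroids[None], axis=-1)
        assign = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned boxes
        for i in range(k):
            if np.any(assign == i):
                centroids[i] = wh[assign == i].mean(axis=0)
    return centroids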
Additional information about YOLO can be found on the Darknet website, and the source code for the preprocessing and postprocessing functions included in the akida_models package (see the processing section in the model zoo) is largely inspired by the experiencor GitHub repository.
2. Preprocessing tools
As this example focuses on car and person detection only, a subset of VOC has been prepared with test images from VOC2007 that contain at least one of the two classes. The dataset is stored as a TFRecord file containing images, labels, and bounding boxes.
The load_tf_dataset function defined below is a helper that loads and parses the TFRecord file.
The YOLO toolkit offers several methods to prepare data for processing, see load_image and preprocess_image.
import tensorflow as tf

from akida_models import fetch_file

# Download TFrecords test set from Brainchip data server
data_path = fetch_file(
    fname="voc_test_car_person.tfrecord",
    origin="https://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tfrecord",
    cache_subdir='datasets/voc',
    extract=True)


# Helper function to load and parse the TFRecord file.
def load_tf_dataset(tf_record_file_path):
    tfrecord_files = [tf_record_file_path]

    # Feature description for parsing the TFRecord
    feature_description = {
        'image': tf.io.FixedLenFeature([], tf.string),
        'objects/bbox': tf.io.VarLenFeature(tf.float32),
        'objects/label': tf.io.VarLenFeature(tf.int64),
    }

    def _count_tfrecord_examples(dataset):
        return len(list(dataset.as_numpy_iterator()))

    def _parse_tfrecord_fn(example_proto):
        example = tf.io.parse_single_example(example_proto, feature_description)

        # Decode the image from bytes
        example['image'] = tf.io.decode_jpeg(example['image'], channels=3)

        # Convert the VarLenFeature to a dense tensor
        example['objects/label'] = tf.sparse.to_dense(example['objects/label'],
                                                      default_value=0)
        example['objects/bbox'] = tf.sparse.to_dense(example['objects/bbox'])

        # Boxes were flattened during serialization, so reshape them to
        # one row of 4 coordinates per object
        example['objects/bbox'] = tf.reshape(example['objects/bbox'],
                                             (tf.shape(example['objects/label'])[0], 4))

        # Create a new dictionary structure
        objects = {
            'label': example['objects/label'],
            'bbox': example['objects/bbox'],
        }

        # Remove unnecessary keys
        example.pop('objects/label')
        example.pop('objects/bbox')

        # Add 'objects' key to the main dictionary
        example['objects'] = objects

        return example

    # Create a TFRecordDataset
    dataset = tf.data.TFRecordDataset(tfrecord_files)
    len_dataset = _count_tfrecord_examples(dataset)
    parsed_dataset = dataset.map(_parse_tfrecord_fn)

    return parsed_dataset, len_dataset


labels = ['car', 'person']

val_dataset, len_val_dataset = load_tf_dataset(data_path)
print("Loaded VOC2007 test data for car and person classes: "
      f"{len_val_dataset} images.")
Downloading data from https://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tfrecord.
220193953/220193953 [==============================] - 20s 0us/step
Loaded VOC2007 test data for car and person classes: 2500 images.
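One can peek at a single parsed example to check the structure produced by load_tf_dataset (the fields below follow the keys created by the parsing function above; the printed shapes are indicative):

sample = next(iter(val_dataset))
print(sample['image'].shape)       # decoded image tensor, e.g. (H, W, 3)
print(sample['objects']['label'])  # class indices into `labels`
print(sample['objects']['bbox'])   # one row of 4 coordinates per object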
Anchors can also be computed easily using the YOLO toolkit.
Note
The following code is given as an example. In a real use case scenario, anchors are computed on the training dataset.
from akida_models.detection.generate_anchors import generate_anchors
num_anchors = 5
grid_size = (7, 7)
anchors_example = generate_anchors(val_dataset, num_anchors, grid_size)
Average IOU for 5 anchors: 0.62
Anchors: [[0.55213, 1.13669], [1.39658, 2.04627], [1.53616, 4.24362], [3.04205, 4.73053], [5.55387, 5.45918]]
3. Model architecture
The model zoo contains a YOLO model that is built upon the AkidaNet architecture with 3 separable convolutional layers at the top for bounding box and class estimation, followed by a final separable convolutional layer which is the detection layer. Note that for efficiency, the alpha parameter of AkidaNet (network width, i.e. the number of filters in each layer) is set to 0.5.
from akida_models import yolo_base

# Create a yolo model for 2 classes with 5 anchors and grid size of 7
classes = 2

model = yolo_base(input_shape=(224, 224, 3),
                  classes=classes,
                  nb_box=num_anchors,
                  alpha=0.5)
model.summary()
Model: "yolo_base"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) [(None, 224, 224, 3)] 0
rescaling (Rescaling) (None, 224, 224, 3) 0
conv_0 (Conv2D) (None, 112, 112, 16) 432
conv_0/BN (BatchNormalizati (None, 112, 112, 16) 64
on)
conv_0/relu (ReLU) (None, 112, 112, 16) 0
conv_1 (Conv2D) (None, 112, 112, 32) 4608
conv_1/BN (BatchNormalizati (None, 112, 112, 32) 128
on)
conv_1/relu (ReLU) (None, 112, 112, 32) 0
conv_2 (Conv2D) (None, 56, 56, 64) 18432
conv_2/BN (BatchNormalizati (None, 56, 56, 64) 256
on)
conv_2/relu (ReLU) (None, 56, 56, 64) 0
conv_3 (Conv2D) (None, 56, 56, 64) 36864
conv_3/BN (BatchNormalizati (None, 56, 56, 64) 256
on)
conv_3/relu (ReLU) (None, 56, 56, 64) 0
dw_separable_4 (DepthwiseCo (None, 28, 28, 64) 576
nv2D)
pw_separable_4 (Conv2D) (None, 28, 28, 128) 8192
pw_separable_4/BN (BatchNor (None, 28, 28, 128) 512
malization)
pw_separable_4/relu (ReLU) (None, 28, 28, 128) 0
dw_separable_5 (DepthwiseCo (None, 28, 28, 128) 1152
nv2D)
pw_separable_5 (Conv2D) (None, 28, 28, 128) 16384
pw_separable_5/BN (BatchNor (None, 28, 28, 128) 512
malization)
pw_separable_5/relu (ReLU) (None, 28, 28, 128) 0
dw_separable_6 (DepthwiseCo (None, 14, 14, 128) 1152
nv2D)
pw_separable_6 (Conv2D) (None, 14, 14, 256) 32768
pw_separable_6/BN (BatchNor (None, 14, 14, 256) 1024
malization)
pw_separable_6/relu (ReLU) (None, 14, 14, 256) 0
dw_separable_7 (DepthwiseCo (None, 14, 14, 256) 2304
nv2D)
pw_separable_7 (Conv2D) (None, 14, 14, 256) 65536
pw_separable_7/BN (BatchNor (None, 14, 14, 256) 1024
malization)
pw_separable_7/relu (ReLU) (None, 14, 14, 256) 0
dw_separable_8 (DepthwiseCo (None, 14, 14, 256) 2304
nv2D)
pw_separable_8 (Conv2D) (None, 14, 14, 256) 65536
pw_separable_8/BN (BatchNor (None, 14, 14, 256) 1024
malization)
pw_separable_8/relu (ReLU) (None, 14, 14, 256) 0
dw_separable_9 (DepthwiseCo (None, 14, 14, 256) 2304
nv2D)
pw_separable_9 (Conv2D) (None, 14, 14, 256) 65536
pw_separable_9/BN (BatchNor (None, 14, 14, 256) 1024
malization)
pw_separable_9/relu (ReLU) (None, 14, 14, 256) 0
dw_separable_10 (DepthwiseC (None, 14, 14, 256) 2304
onv2D)
pw_separable_10 (Conv2D) (None, 14, 14, 256) 65536
pw_separable_10/BN (BatchNo (None, 14, 14, 256) 1024
rmalization)
pw_separable_10/relu (ReLU) (None, 14, 14, 256) 0
dw_separable_11 (DepthwiseC (None, 14, 14, 256) 2304
onv2D)
pw_separable_11 (Conv2D) (None, 14, 14, 256) 65536
pw_separable_11/BN (BatchNo (None, 14, 14, 256) 1024
rmalization)
pw_separable_11/relu (ReLU) (None, 14, 14, 256) 0
dw_separable_12 (DepthwiseC (None, 7, 7, 256) 2304
onv2D)
pw_separable_12 (Conv2D) (None, 7, 7, 512) 131072
pw_separable_12/BN (BatchNo (None, 7, 7, 512) 2048
rmalization)
pw_separable_12/relu (ReLU) (None, 7, 7, 512) 0
dw_separable_13 (DepthwiseC (None, 7, 7, 512) 4608
onv2D)
pw_separable_13 (Conv2D) (None, 7, 7, 512) 262144
pw_separable_13/BN (BatchNo (None, 7, 7, 512) 2048
rmalization)
pw_separable_13/relu (ReLU) (None, 7, 7, 512) 0
dw_1conv (DepthwiseConv2D) (None, 7, 7, 512) 4608
pw_1conv (Conv2D) (None, 7, 7, 1024) 524288
pw_1conv/BN (BatchNormaliza (None, 7, 7, 1024) 4096
tion)
pw_1conv/relu (ReLU) (None, 7, 7, 1024) 0
dw_2conv (DepthwiseConv2D) (None, 7, 7, 1024) 9216
pw_2conv (Conv2D) (None, 7, 7, 1024) 1048576
pw_2conv/BN (BatchNormaliza (None, 7, 7, 1024) 4096
tion)
pw_2conv/relu (ReLU) (None, 7, 7, 1024) 0
dw_3conv (DepthwiseConv2D) (None, 7, 7, 1024) 9216
pw_3conv (Conv2D) (None, 7, 7, 1024) 1048576
pw_3conv/BN (BatchNormaliza (None, 7, 7, 1024) 4096
tion)
pw_3conv/relu (ReLU) (None, 7, 7, 1024) 0
dw_detection_layer (Depthwi (None, 7, 7, 1024) 9216
seConv2D)
pw_detection_layer (Conv2D) (None, 7, 7, 35) 35875
=================================================================
Total params: 3,573,715
Trainable params: 3,561,587
Non-trainable params: 12,128
_________________________________________________________________
The model output can be reshaped to a more natural shape of:
(grid_height, grid_width, anchors_box, 4 + 1 + num_classes)
where the “4 + 1” term represents the coordinates of the estimated bounding boxes (top left x, top left y, width and height) plus a confidence score. In other words, the output channels are grouped by anchor box, and within each group one channel provides either a coordinate, a global confidence score or a class confidence score. With 5 anchors and 2 classes this gives 5 × (4 + 1 + 2) = 35 channels, matching the detection layer output above. This process is done automatically in the decode_output function.
from tensorflow.keras import Model
from tensorflow.keras.layers import Reshape

# Define a reshape output to be added to the YOLO model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model.output)

# Build the complete model
full_model = Model(model.input, output)
full_model.output
<KerasTensor: shape=(None, 7, 7, 5, 7) dtype=float32 (created by layer 'YOLO_output')>
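As a rough illustration of what decoding involves, the sketch below shows how the raw values of one anchor in one grid cell could be turned into a box and class probabilities, following the YOLOv2 formulation (the decode_cell helper is hypothetical and omits the confidence thresholding and non-maximum suppression that decode_output also performs):

import numpy as np

def decode_cell(pred, anchor, col, row, grid_size=7):
    # pred holds [tx, ty, tw, th, to, class scores...] for one anchor
    sigmoid = lambda v: 1. / (1. + np.exp(-v))
    # Box center: sigmoid offset inside the cell, normalized to the image
    cx = (col + sigmoid(pred[0])) / grid_size
    cy = (row + sigmoid(pred[1])) / grid_size
    # Box size: the anchor template scaled by exponential offsets
    w = anchor[0] * np.exp(pred[2]) / grid_size
    h = anchor[1] * np.exp(pred[3]) / grid_size
    # Objectness score and per-class probabilities (softmax)
    confidence = sigmoid(pred[4])
    class_probs = np.exp(pred[5:]) / np.exp(pred[5:]).sum()
    return cx, cy, w, h, confidence, class_probs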
4. Training
As the YOLO model relies on the Brainchip AkidaNet/ImageNet network, it is possible to perform transfer learning from ImageNet pretrained weights when training a YOLO model. See the PlantVillage transfer learning example for a detailed explanation of transfer learning principles.
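In such a setup, the pretrained AkidaNet backbone is typically frozen, at least initially, and only the detection head is trained. A minimal sketch of the freezing step, assuming the full_model built in section 3, whose head layers are the ones named after 1conv, 2conv, 3conv and detection_layer:

# Freeze the AkidaNet backbone; keep only the detection head trainable
head_keywords = ('1conv', '2conv', '3conv', 'detection', 'YOLO_output')
for layer in full_model.layers:
    layer.trainable = any(kw in layer.name for kw in head_keywords)
print(f"{sum(layer.trainable for layer in full_model.layers)} trainable layers")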
5. Performance
The model zoo also contains a helper method that creates a YOLO model for VOC, loads pretrained weights for the car and person detection task and returns the corresponding anchors. The anchors are used to interpret the model outputs.
The metric used to evaluate YOLO is the mean average precision (mAP), which is the percentage of correct predictions for a given intersection over union (IoU) ratio. Scores in this example are given for the standard IoU of 0.5, meaning that a detection is considered valid if the intersection over union ratio with its ground truth equivalent is above 0.5.
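For reference, the IoU of two boxes is the area of their intersection divided by the area of their union. A minimal illustration with boxes given as (x1, y1, x2, y2) (the iou helper is written for this example):

def iou(box_a, box_b):
    # Intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two equal boxes overlapping by half: IoU = 1/3, rejected at the 0.5 threshold
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))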
Note
A call to evaluate_map will preprocess the images, make the call to Model.predict and use decode_output before computing precision for all classes.
from timeit import default_timer as timer
from akida_models import yolo_voc_pretrained
from akida_models.detection.map_evaluation import MapEvaluation
# Load the pretrained model along with anchors
model_keras, anchors = yolo_voc_pretrained()
model_keras.summary()
Downloading data from https://data.brainchip.com/dataset-mirror/voc/voc_anchors.pkl.
126/126 [==============================] - 0s 2us/step
Downloading data from https://data.brainchip.com/models/AkidaV2/yolo/yolo_akidanet_voc_i8_w4_a4.h5.
14557704/14557704 [==============================] - 1s 0us/step
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) [(None, 224, 224, 3)] 0
rescaling (QuantizedRescali (None, 224, 224, 3) 0
ng)
conv_0 (QuantizedConv2D) (None, 112, 112, 16) 448
conv_0/relu (QuantizedReLU) (None, 112, 112, 16) 32
conv_1 (QuantizedConv2D) (None, 112, 112, 32) 4640
conv_1/relu (QuantizedReLU) (None, 112, 112, 32) 64
conv_2 (QuantizedConv2D) (None, 56, 56, 64) 18496
conv_2/relu (QuantizedReLU) (None, 56, 56, 64) 128
conv_3 (QuantizedConv2D) (None, 56, 56, 64) 36928
conv_3/relu (QuantizedReLU) (None, 56, 56, 64) 128
dw_separable_4 (QuantizedDe (None, 28, 28, 64) 704
pthwiseConv2D)
pw_separable_4 (QuantizedCo (None, 28, 28, 128) 8320
nv2D)
pw_separable_4/relu (Quanti (None, 28, 28, 128) 256
zedReLU)
dw_separable_5 (QuantizedDe (None, 28, 28, 128) 1408
pthwiseConv2D)
pw_separable_5 (QuantizedCo (None, 28, 28, 128) 16512
nv2D)
pw_separable_5/relu (Quanti (None, 28, 28, 128) 256
zedReLU)
dw_separable_6 (QuantizedDe (None, 14, 14, 128) 1408
pthwiseConv2D)
pw_separable_6 (QuantizedCo (None, 14, 14, 256) 33024
nv2D)
pw_separable_6/relu (Quanti (None, 14, 14, 256) 512
zedReLU)
dw_separable_7 (QuantizedDe (None, 14, 14, 256) 2816
pthwiseConv2D)
pw_separable_7 (QuantizedCo (None, 14, 14, 256) 65792
nv2D)
pw_separable_7/relu (Quanti (None, 14, 14, 256) 512
zedReLU)
dw_separable_8 (QuantizedDe (None, 14, 14, 256) 2816
pthwiseConv2D)
pw_separable_8 (QuantizedCo (None, 14, 14, 256) 65792
nv2D)
pw_separable_8/relu (Quanti (None, 14, 14, 256) 512
zedReLU)
dw_separable_9 (QuantizedDe (None, 14, 14, 256) 2816
pthwiseConv2D)
pw_separable_9 (QuantizedCo (None, 14, 14, 256) 65792
nv2D)
pw_separable_9/relu (Quanti (None, 14, 14, 256) 512
zedReLU)
dw_separable_10 (QuantizedD (None, 14, 14, 256) 2816
epthwiseConv2D)
pw_separable_10 (QuantizedC (None, 14, 14, 256) 65792
onv2D)
pw_separable_10/relu (Quant (None, 14, 14, 256) 512
izedReLU)
dw_separable_11 (QuantizedD (None, 14, 14, 256) 2816
epthwiseConv2D)
pw_separable_11 (QuantizedC (None, 14, 14, 256) 65792
onv2D)
pw_separable_11/relu (Quant (None, 14, 14, 256) 512
izedReLU)
dw_separable_12 (QuantizedD (None, 7, 7, 256) 2816
epthwiseConv2D)
pw_separable_12 (QuantizedC (None, 7, 7, 512) 131584
onv2D)
pw_separable_12/relu (Quant (None, 7, 7, 512) 1024
izedReLU)
dw_separable_13 (QuantizedD (None, 7, 7, 512) 5632
epthwiseConv2D)
pw_separable_13 (QuantizedC (None, 7, 7, 512) 262656
onv2D)
pw_separable_13/relu (Quant (None, 7, 7, 512) 1024
izedReLU)
dw_1conv (QuantizedDepthwis (None, 7, 7, 512) 5632
eConv2D)
pw_1conv (QuantizedConv2D) (None, 7, 7, 1024) 525312
pw_1conv/relu (QuantizedReL (None, 7, 7, 1024) 2048
U)
dw_2conv (QuantizedDepthwis (None, 7, 7, 1024) 11264
eConv2D)
pw_2conv (QuantizedConv2D) (None, 7, 7, 1024) 1049600
pw_2conv/relu (QuantizedReL (None, 7, 7, 1024) 2048
U)
dw_3conv (QuantizedDepthwis (None, 7, 7, 1024) 11264
eConv2D)
pw_3conv (QuantizedConv2D) (None, 7, 7, 1024) 1049600
pw_3conv/relu (QuantizedReL (None, 7, 7, 1024) 2048
U)
dw_detection_layer (Quantiz (None, 7, 7, 1024) 11264
edDepthwiseConv2D)
pw_detection_layer (Quantiz (None, 7, 7, 35) 35875
edConv2D)
dequantizer (Dequantizer) (None, 7, 7, 35) 0
=================================================================
Total params: 3,579,555
Trainable params: 3,555,523
Non-trainable params: 24,032
_________________________________________________________________
# Define the final reshape and build the model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model_keras.output)
model_keras = Model(model_keras.input, output)

# Create the mAP evaluator object
num_images = 100

map_evaluator = MapEvaluation(model_keras, val_dataset.take(num_images),
                              num_images, labels, anchors)

# Compute the scores for all validation images
start = timer()
mAP, average_precisions = map_evaluator.evaluate_map()
end = timer()

for label, average_precision in average_precisions.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP))
print(f'Keras inference on {num_images} images took {end-start:.2f} s.\n')
car 0.5109
person 0.4080
mAP: 0.4595
Keras inference on 100 images took 15.95 s.
6. Conversion to Akida
6.1 Convert to Akida model
The last YOLO_output layer that was added for splitting channels into values for each box must be removed before Akida conversion.
# Rebuild a model without the last layer
compatible_model = Model(model_keras.input, model_keras.layers[-2].output)
When converting to an Akida model, we just need to pass the Keras model to cnn2snn.convert.
from cnn2snn import convert
model_akida = convert(compatible_model)
model_akida.summary()
Model Summary
________________________________________________
Input shape Output shape Sequences Layers
================================================
[224, 224, 3] [7, 7, 35] 1 33
________________________________________________
__________________________________________________________________________
Layer (type) Output shape Kernel shape
==================== SW/conv_0-dequantizer (Software) ====================
conv_0 (InputConv2D) [112, 112, 16] (3, 3, 3, 16)
__________________________________________________________________________
conv_1 (Conv2D) [112, 112, 32] (3, 3, 16, 32)
__________________________________________________________________________
conv_2 (Conv2D) [56, 56, 64] (3, 3, 32, 64)
__________________________________________________________________________
conv_3 (Conv2D) [56, 56, 64] (3, 3, 64, 64)
__________________________________________________________________________
dw_separable_4 (DepthwiseConv2D) [28, 28, 64] (3, 3, 64, 1)
__________________________________________________________________________
pw_separable_4 (Conv2D) [28, 28, 128] (1, 1, 64, 128)
__________________________________________________________________________
dw_separable_5 (DepthwiseConv2D) [28, 28, 128] (3, 3, 128, 1)
__________________________________________________________________________
pw_separable_5 (Conv2D) [28, 28, 128] (1, 1, 128, 128)
__________________________________________________________________________
dw_separable_6 (DepthwiseConv2D) [14, 14, 128] (3, 3, 128, 1)
__________________________________________________________________________
pw_separable_6 (Conv2D) [14, 14, 256] (1, 1, 128, 256)
__________________________________________________________________________
dw_separable_7 (DepthwiseConv2D) [14, 14, 256] (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_7 (Conv2D) [14, 14, 256] (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_8 (DepthwiseConv2D) [14, 14, 256] (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_8 (Conv2D) [14, 14, 256] (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_9 (DepthwiseConv2D) [14, 14, 256] (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_9 (Conv2D) [14, 14, 256] (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_10 (DepthwiseConv2D) [14, 14, 256] (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_10 (Conv2D) [14, 14, 256] (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_11 (DepthwiseConv2D) [14, 14, 256] (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_11 (Conv2D) [14, 14, 256] (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_12 (DepthwiseConv2D) [7, 7, 256] (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_12 (Conv2D) [7, 7, 512] (1, 1, 256, 512)
__________________________________________________________________________
dw_separable_13 (DepthwiseConv2D) [7, 7, 512] (3, 3, 512, 1)
__________________________________________________________________________
pw_separable_13 (Conv2D) [7, 7, 512] (1, 1, 512, 512)
__________________________________________________________________________
dw_1conv (DepthwiseConv2D) [7, 7, 512] (3, 3, 512, 1)
__________________________________________________________________________
pw_1conv (Conv2D) [7, 7, 1024] (1, 1, 512, 1024)
__________________________________________________________________________
dw_2conv (DepthwiseConv2D) [7, 7, 1024] (3, 3, 1024, 1)
__________________________________________________________________________
pw_2conv (Conv2D) [7, 7, 1024] (1, 1, 1024, 1024)
__________________________________________________________________________
dw_3conv (DepthwiseConv2D) [7, 7, 1024] (3, 3, 1024, 1)
__________________________________________________________________________
pw_3conv (Conv2D) [7, 7, 1024] (1, 1, 1024, 1024)
__________________________________________________________________________
dw_detection_layer (DepthwiseConv2D) [7, 7, 1024] (3, 3, 1024, 1)
__________________________________________________________________________
pw_detection_layer (Conv2D) [7, 7, 35] (1, 1, 1024, 35)
__________________________________________________________________________
dequantizer (Dequantizer) [7, 7, 35] N/A
__________________________________________________________________________
6.2 Check performance
Akida model accuracy is tested on the first n images of the validation set.
# Create the mAP evaluator object
map_evaluator_ak = MapEvaluation(model_akida,
                                 val_dataset.take(num_images),
                                 num_images,
                                 labels,
                                 anchors,
                                 is_keras_model=False)

# Compute the scores for all validation images
start = timer()
mAP_ak, average_precisions_ak = map_evaluator_ak.evaluate_map()
end = timer()

for label, average_precision in average_precisions_ak.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP_ak))
print(f'Akida inference on {num_images} images took {end-start:.2f} s.\n')
car 0.5147
person 0.3945
mAP: 0.4546
Akida inference on 100 images took 14.61 s.
6.3 Show predictions for a random image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches

from akida_models.detection.processing import preprocess_image, decode_output

# Shuffle the data to take a random test image
val_dataset = val_dataset.shuffle(buffer_size=num_images)

input_shape = model_akida.layers[0].input_dims

# Load the image
raw_image = next(iter(val_dataset))['image']

# Keep the original image size for later bounding boxes rescaling
raw_height, raw_width, _ = raw_image.shape

# Pre-process the image
image = preprocess_image(raw_image, input_shape)
input_image = image[np.newaxis, :].astype(np.uint8)

# Call evaluate on the image
pots = model_akida.predict(input_image)[0]

# Reshape the potentials to prepare for decoding
h, w, c = pots.shape
pots = pots.reshape((h, w, len(anchors), 4 + 1 + len(labels)))

# Decode potentials into bounding boxes
raw_boxes = decode_output(pots, anchors, len(labels))

# Rescale boxes to the original image size
pred_boxes = np.array([[
    box.x1 * raw_width, box.y1 * raw_height, box.x2 * raw_width,
    box.y2 * raw_height,
    box.get_label(),
    box.get_score()
] for box in raw_boxes])

fig = plt.figure(num='VOC2007 car and person detection by Akida')
ax = fig.subplots(1)
img_plot = ax.imshow(np.zeros(raw_image.shape, dtype=np.uint8))
img_plot.set_data(raw_image)

for box in pred_boxes:
    rect = patches.Rectangle((box[0], box[1]),
                             box[2] - box[0],
                             box[3] - box[1],
                             linewidth=1,
                             edgecolor='r',
                             facecolor='none')
    ax.add_patch(rect)
    class_score = ax.text(box[0],
                          box[1] - 5,
                          f"{labels[int(box[4])]} - {box[5]:.2f}",
                          color='red')

plt.axis('off')
plt.show()
Total running time of the script: (1 minute 15.124 seconds)