YOLO/PASCAL-VOC detection tutorial

This tutorial demonstrates that Akida can perform object detection. This is illustrated using a subset of the PASCAL-VOC 2007 dataset with “car” and “person” classes only. The YOLOv2 architecture from Redmon et al. (2016) has been chosen to tackle this object detection problem.

1. Introduction

1.1 Object detection

Object detection is a computer vision task that combines two elementary tasks:

  • object classification, which consists of assigning a class label to an image, as shown in the AkidaNet/ImageNet inference example

  • object localization, which consists of drawing a bounding box around one or several objects in an image

One can learn more about the subject by reading this introduction to object detection blog article.

1.2 YOLO key concepts

You Only Look Once (YOLO) is a deep neural network architecture dedicated to object detection.

As opposed to classic object detection pipelines, YOLO predicts bounding boxes (the localization task) and class probabilities (the classification task) with a single neural network in a single evaluation. Object detection is thus framed as a single regression problem, straight from image pixels to spatially separated bounding boxes and associated class probabilities.

The base concept of YOLO is to divide an input image into regions, forming a grid, and to predict bounding boxes and probabilities for each region. The bounding boxes are weighted by the predicted probabilities.

YOLO also uses the concept of “anchor boxes” or “prior boxes”. The network does not directly predict the bounding boxes themselves, but rather offsets from anchor boxes, which are templates (width/height ratios) computed by clustering the dimensions of the ground truth boxes from the training dataset. The anchors thus represent the average shapes and sizes of the objects to detect. More details on the anchor box concept are given in this blog article.
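
To make this concrete, the snippet below (an illustration only, not the akida_models implementation) shows how a box size is recovered from an anchor: the anchor provides a width/height template in grid-cell units, and the network output only stretches or shrinks it. The anchor values are taken from the ones computed later in this tutorial; the offsets are hypothetical.

import numpy as np

# Illustration: in YOLOv2-style decoding, a box gets its size from an anchor.
p_w, p_h = 1.39658, 2.04627  # anchor template, in grid-cell units
t_w, t_h = 0.2, -0.1         # hypothetical network outputs for this box
box_w, box_h = p_w * np.exp(t_w), p_h * np.exp(t_h)
print(f"Decoded box size: {box_w:.2f} x {box_h:.2f} grid cells")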

Additional information about YOLO can be found on the Darknet website. The source code for the preprocessing and postprocessing functions included in the akida_models package (see the processing section in the model zoo) is largely inspired by the experiencor GitHub repository.

2. Preprocessing tools

As this example focuses on car and person detection only, a subset of VOC has been prepared with test images from VOC2007 that contain at least one of the two classes. The dataset is stored as a tfrecord file containing images, labels, and bounding boxes.

The load_tf_dataset function is a helper function that facilitates the loading and parsing of the tfrecord file.

The YOLO toolkit offers several methods to prepare data for processing; see load_image and preprocess_image.

import tensorflow as tf

from akida_models import fetch_file

# Download TFrecords test set from Brainchip data server
data_path = fetch_file(
    fname="voc_test_car_person.tfrecord",
    origin="https://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tfrecord",
    cache_subdir='datasets/voc',
    extract=True)


# Helper function to load and parse the TFRecord file.
def load_tf_dataset(tf_record_file_path):
    tfrecord_files = [tf_record_file_path]

    # Feature description for parsing the TFRecord
    feature_description = {
        'image': tf.io.FixedLenFeature([], tf.string),
        'objects/bbox': tf.io.VarLenFeature(tf.float32),
        'objects/label': tf.io.VarLenFeature(tf.int64),
    }

    def _count_tfrecord_examples(dataset):
        return len(list(dataset.as_numpy_iterator()))

    def _parse_tfrecord_fn(example_proto):
        example = tf.io.parse_single_example(example_proto, feature_description)

        # Decode the image from bytes
        example['image'] = tf.io.decode_jpeg(example['image'], channels=3)

        # Convert the VarLenFeature to a dense tensor
        example['objects/label'] = tf.sparse.to_dense(example['objects/label'], default_value=0)

        example['objects/bbox'] = tf.sparse.to_dense(example['objects/bbox'])
        # Boxes were flattened, so they need to be reshaped
        example['objects/bbox'] = tf.reshape(example['objects/bbox'],
                                             (tf.shape(example['objects/label'])[0], 4))
        # Create a new dictionary structure
        objects = {
            'label': example['objects/label'],
            'bbox': example['objects/bbox'],
        }

        # Remove unnecessary keys
        example.pop('objects/label')
        example.pop('objects/bbox')

        # Add 'objects' key to the main dictionary
        example['objects'] = objects

        return example

    # Create a TFRecordDataset
    dataset = tf.data.TFRecordDataset(tfrecord_files)
    len_dataset = _count_tfrecord_examples(dataset)
    parsed_dataset = dataset.map(_parse_tfrecord_fn)

    return parsed_dataset, len_dataset


labels = ['car', 'person']

val_dataset, len_val_dataset = load_tf_dataset(data_path)
print("Loaded VOC2007 test data for car and person classes: "
      f"{len_val_dataset} images.")
Downloading data from https://data.brainchip.com/dataset-mirror/voc/voc_test_car_person.tfrecord.

220193953/220193953 [==============================] - 20s 0us/step
Loaded VOC2007 test data for car and person classes: 2500 images.
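
Before going further, the parsed dataset can be sanity-checked by pulling a single example and printing its content. This is a minimal inspection snippet relying only on the load_tf_dataset helper defined above.

# Take one parsed example and display its structure
sample = next(iter(val_dataset))
print("Image shape:", sample['image'].shape)          # (height, width, 3)
print("Labels:", sample['objects']['label'].numpy())  # indices into `labels`
print("Boxes:\n", sample['objects']['bbox'].numpy())  # one [4] row per object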

Anchors can also be computed easily using the YOLO toolkit.

Note

The following code is given as an example. In a real use case, anchors are computed on the training dataset.

from akida_models.detection.generate_anchors import generate_anchors

num_anchors = 5
grid_size = (7, 7)
anchors_example = generate_anchors(val_dataset, num_anchors, grid_size)
Average IOU for 5 anchors: 0.62
Anchors:  [[0.55213, 1.13669], [1.39658, 2.04627], [1.53616, 4.24362], [3.04205, 4.73053], [5.55387, 5.45918]]
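
Anchor generation is typically a k-means clustering of the ground truth box dimensions, where the distance is 1 - IoU computed as if boxes and anchors shared the same center, so that shape rather than position drives the grouping. The helper below sketches that distance matrix; it is a simplified illustration and not necessarily how generate_anchors is implemented.

import numpy as np

def iou_distance(boxes_wh, anchors_wh):
    # boxes_wh: (N, 2) box widths/heights, anchors_wh: (K, 2) anchor templates
    # Intersection when boxes and anchors are aligned at a common origin
    inter = (np.minimum(boxes_wh[:, None, 0], anchors_wh[None, :, 0]) *
             np.minimum(boxes_wh[:, None, 1], anchors_wh[None, :, 1]))
    union = (boxes_wh[:, 0] * boxes_wh[:, 1])[:, None] + \
            (anchors_wh[:, 0] * anchors_wh[:, 1])[None, :] - inter
    return 1.0 - inter / union  # (N, K) distance matrix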

3. Model architecture

The model zoo contains a YOLO model built upon the AkidaNet architecture, with 3 separable convolutional layers on top for bounding box and class estimation, followed by a final separable convolutional layer which is the detection layer. Note that for efficiency, the alpha parameter in AkidaNet (the network width, i.e. the number of filters in each layer) is set to 0.5.

from akida_models import yolo_base

# Create a yolo model for 2 classes with 5 anchors and grid size of 7
classes = 2

model = yolo_base(input_shape=(224, 224, 3),
                  classes=classes,
                  nb_box=num_anchors,
                  alpha=0.5)
model.summary()
Model: "yolo_base"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input (InputLayer)          [(None, 224, 224, 3)]     0

 rescaling (Rescaling)       (None, 224, 224, 3)       0

 conv_0 (Conv2D)             (None, 112, 112, 16)      432

 conv_0/BN (BatchNormalizati  (None, 112, 112, 16)     64
 on)

 conv_0/relu (ReLU)          (None, 112, 112, 16)      0

 conv_1 (Conv2D)             (None, 112, 112, 32)      4608

 conv_1/BN (BatchNormalizati  (None, 112, 112, 32)     128
 on)

 conv_1/relu (ReLU)          (None, 112, 112, 32)      0

 conv_2 (Conv2D)             (None, 56, 56, 64)        18432

 conv_2/BN (BatchNormalizati  (None, 56, 56, 64)       256
 on)

 conv_2/relu (ReLU)          (None, 56, 56, 64)        0

 conv_3 (Conv2D)             (None, 56, 56, 64)        36864

 conv_3/BN (BatchNormalizati  (None, 56, 56, 64)       256
 on)

 conv_3/relu (ReLU)          (None, 56, 56, 64)        0

 dw_separable_4 (DepthwiseCo  (None, 28, 28, 64)       576
 nv2D)

 pw_separable_4 (Conv2D)     (None, 28, 28, 128)       8192

 pw_separable_4/BN (BatchNor  (None, 28, 28, 128)      512
 malization)

 pw_separable_4/relu (ReLU)  (None, 28, 28, 128)       0

 dw_separable_5 (DepthwiseCo  (None, 28, 28, 128)      1152
 nv2D)

 pw_separable_5 (Conv2D)     (None, 28, 28, 128)       16384

 pw_separable_5/BN (BatchNor  (None, 28, 28, 128)      512
 malization)

 pw_separable_5/relu (ReLU)  (None, 28, 28, 128)       0

 dw_separable_6 (DepthwiseCo  (None, 14, 14, 128)      1152
 nv2D)

 pw_separable_6 (Conv2D)     (None, 14, 14, 256)       32768

 pw_separable_6/BN (BatchNor  (None, 14, 14, 256)      1024
 malization)

 pw_separable_6/relu (ReLU)  (None, 14, 14, 256)       0

 dw_separable_7 (DepthwiseCo  (None, 14, 14, 256)      2304
 nv2D)

 pw_separable_7 (Conv2D)     (None, 14, 14, 256)       65536

 pw_separable_7/BN (BatchNor  (None, 14, 14, 256)      1024
 malization)

 pw_separable_7/relu (ReLU)  (None, 14, 14, 256)       0

 dw_separable_8 (DepthwiseCo  (None, 14, 14, 256)      2304
 nv2D)

 pw_separable_8 (Conv2D)     (None, 14, 14, 256)       65536

 pw_separable_8/BN (BatchNor  (None, 14, 14, 256)      1024
 malization)

 pw_separable_8/relu (ReLU)  (None, 14, 14, 256)       0

 dw_separable_9 (DepthwiseCo  (None, 14, 14, 256)      2304
 nv2D)

 pw_separable_9 (Conv2D)     (None, 14, 14, 256)       65536

 pw_separable_9/BN (BatchNor  (None, 14, 14, 256)      1024
 malization)

 pw_separable_9/relu (ReLU)  (None, 14, 14, 256)       0

 dw_separable_10 (DepthwiseC  (None, 14, 14, 256)      2304
 onv2D)

 pw_separable_10 (Conv2D)    (None, 14, 14, 256)       65536

 pw_separable_10/BN (BatchNo  (None, 14, 14, 256)      1024
 rmalization)

 pw_separable_10/relu (ReLU)  (None, 14, 14, 256)      0

 dw_separable_11 (DepthwiseC  (None, 14, 14, 256)      2304
 onv2D)

 pw_separable_11 (Conv2D)    (None, 14, 14, 256)       65536

 pw_separable_11/BN (BatchNo  (None, 14, 14, 256)      1024
 rmalization)

 pw_separable_11/relu (ReLU)  (None, 14, 14, 256)      0

 dw_separable_12 (DepthwiseC  (None, 7, 7, 256)        2304
 onv2D)

 pw_separable_12 (Conv2D)    (None, 7, 7, 512)         131072

 pw_separable_12/BN (BatchNo  (None, 7, 7, 512)        2048
 rmalization)

 pw_separable_12/relu (ReLU)  (None, 7, 7, 512)        0

 dw_separable_13 (DepthwiseC  (None, 7, 7, 512)        4608
 onv2D)

 pw_separable_13 (Conv2D)    (None, 7, 7, 512)         262144

 pw_separable_13/BN (BatchNo  (None, 7, 7, 512)        2048
 rmalization)

 pw_separable_13/relu (ReLU)  (None, 7, 7, 512)        0

 dw_1conv (DepthwiseConv2D)  (None, 7, 7, 512)         4608

 pw_1conv (Conv2D)           (None, 7, 7, 1024)        524288

 pw_1conv/BN (BatchNormaliza  (None, 7, 7, 1024)       4096
 tion)

 pw_1conv/relu (ReLU)        (None, 7, 7, 1024)        0

 dw_2conv (DepthwiseConv2D)  (None, 7, 7, 1024)        9216

 pw_2conv (Conv2D)           (None, 7, 7, 1024)        1048576

 pw_2conv/BN (BatchNormaliza  (None, 7, 7, 1024)       4096
 tion)

 pw_2conv/relu (ReLU)        (None, 7, 7, 1024)        0

 dw_3conv (DepthwiseConv2D)  (None, 7, 7, 1024)        9216

 pw_3conv (Conv2D)           (None, 7, 7, 1024)        1048576

 pw_3conv/BN (BatchNormaliza  (None, 7, 7, 1024)       4096
 tion)

 pw_3conv/relu (ReLU)        (None, 7, 7, 1024)        0

 dw_detection_layer (Depthwi  (None, 7, 7, 1024)       9216
 seConv2D)

 pw_detection_layer (Conv2D)  (None, 7, 7, 35)         35875

=================================================================
Total params: 3,573,715
Trainable params: 3,561,587
Non-trainable params: 12,128
_________________________________________________________________

The model output can be reshaped to a more natural shape of:

(grid_height, grid_width, num_anchors, 4 + 1 + num_classes)

where the “4 + 1” term represents the coordinates of the estimated bounding boxes (top left x, top left y, width and height) and a confidence score. In other words, the output channels are grouped by anchor box, and within each group one channel provides either a coordinate, a global confidence score or a class confidence score. This regrouping is done automatically in the decode_output function.
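
Conceptually, the decoding applies a sigmoid to the predicted box center and confidence, scales the anchor dimensions by an exponential of the predicted size offsets, and applies a softmax to the class scores. The sketch below illustrates that logic on a (grid_h, grid_w, num_anchors, 4 + 1 + num_classes) tensor; it is a simplified illustration, not the actual decode_output implementation, which also performs additional filtering.

import numpy as np

def decode_grid(output, anchors, obj_threshold=0.5):
    # output: (grid_h, grid_w, num_anchors, 4 + 1 + num_classes) potentials
    # anchors: list of [width, height] templates in grid-cell units
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    grid_h, grid_w, num_anchors, _ = output.shape
    boxes = []
    for row in range(grid_h):
        for col in range(grid_w):
            for b in range(num_anchors):
                t_x, t_y, t_w, t_h, t_conf = output[row, col, b, :5]
                confidence = sigmoid(t_conf)
                if confidence < obj_threshold:
                    continue
                # Box center relative to the whole image, in [0, 1]
                x = (col + sigmoid(t_x)) / grid_w
                y = (row + sigmoid(t_y)) / grid_h
                # Box dimensions derived from the anchor template
                w = anchors[b][0] * np.exp(t_w) / grid_w
                h = anchors[b][1] * np.exp(t_h) / grid_h
                # Numerically stable softmax over the class scores
                scores = output[row, col, b, 5:]
                probs = np.exp(scores - scores.max())
                probs /= probs.sum()
                boxes.append((x, y, w, h, confidence, int(np.argmax(probs))))
    return boxes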

from tensorflow.keras import Model
from tensorflow.keras.layers import Reshape

# Define a reshape output to be added to the YOLO model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model.output)

# Build the complete model
full_model = Model(model.input, output)
full_model.output
<KerasTensor: shape=(None, 7, 7, 5, 7) dtype=float32 (created by layer 'YOLO_output')>

4. Training

As the YOLO model relies on the Brainchip AkidaNet/ImageNet network, it is possible to perform transfer learning from ImageNet pretrained weights when training a YOLO model. See the PlantVillage transfer learning example for a detailed explanation of transfer learning principles.
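
A minimal sketch of the idea is given below: build the YOLO model, load pretrained backbone weights by layer name, and freeze the AkidaNet backbone so that only the detection head is trained at first. The weight file name is hypothetical; refer to the PlantVillage example for the actual procedure.

# Transfer learning sketch (illustrative only; the weight file name is hypothetical)
model = yolo_base(input_shape=(224, 224, 3), classes=2, nb_box=5, alpha=0.5)
model.load_weights("akidanet_imagenet_alpha_0.5.h5", by_name=True,
                   skip_mismatch=True)

# Freeze the AkidaNet backbone so only the YOLO head is trained
for layer in model.layers:
    is_backbone = layer.name.startswith("conv_") or "separable" in layer.name
    layer.trainable = not is_backbone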

5. Performance

The model zoo also contains a helper method that creates a YOLO model for VOC, loads pretrained weights for the car and person detection task and returns the corresponding anchors. The anchors are used to interpret the model outputs.

The metric used to evaluate YOLO is the mean average precision (mAP), which is the percentage of correct predictions at a given intersection over union (IoU) threshold. Scores in this example are given for the standard IoU threshold of 0.5, meaning that a detection is considered valid if its intersection over union with the corresponding ground truth box is above 0.5.
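
For reference, the IoU of two boxes in corner format can be computed as below. This is a standalone helper for illustration, not the one used internally by MapEvaluation.

def iou(box_a, box_b):
    # Boxes given as (x1, y1, x2, y2) corner coordinates
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A detection counts as a true positive when iou(prediction, ground_truth) > 0.5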

Note

A call to evaluate_map will preprocess the images, make the call to Model.predict and use decode_output before computing precision for all classes.

from timeit import default_timer as timer
from akida_models import yolo_voc_pretrained
from akida_models.detection.map_evaluation import MapEvaluation

# Load the pretrained model along with anchors
model_keras, anchors = yolo_voc_pretrained()
model_keras.summary()
Downloading data from https://data.brainchip.com/dataset-mirror/voc/voc_anchors.pkl.

126/126 [==============================] - 0s 2us/step
Downloading data from https://data.brainchip.com/models/AkidaV2/yolo/yolo_akidanet_voc_i8_w4_a4.h5.

14557704/14557704 [==============================] - 1s 0us/step
Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input (InputLayer)          [(None, 224, 224, 3)]     0

 rescaling (QuantizedRescali  (None, 224, 224, 3)      0
 ng)

 conv_0 (QuantizedConv2D)    (None, 112, 112, 16)      448

 conv_0/relu (QuantizedReLU)  (None, 112, 112, 16)     32

 conv_1 (QuantizedConv2D)    (None, 112, 112, 32)      4640

 conv_1/relu (QuantizedReLU)  (None, 112, 112, 32)     64

 conv_2 (QuantizedConv2D)    (None, 56, 56, 64)        18496

 conv_2/relu (QuantizedReLU)  (None, 56, 56, 64)       128

 conv_3 (QuantizedConv2D)    (None, 56, 56, 64)        36928

 conv_3/relu (QuantizedReLU)  (None, 56, 56, 64)       128

 dw_separable_4 (QuantizedDe  (None, 28, 28, 64)       704
 pthwiseConv2D)

 pw_separable_4 (QuantizedCo  (None, 28, 28, 128)      8320
 nv2D)

 pw_separable_4/relu (Quanti  (None, 28, 28, 128)      256
 zedReLU)

 dw_separable_5 (QuantizedDe  (None, 28, 28, 128)      1408
 pthwiseConv2D)

 pw_separable_5 (QuantizedCo  (None, 28, 28, 128)      16512
 nv2D)

 pw_separable_5/relu (Quanti  (None, 28, 28, 128)      256
 zedReLU)

 dw_separable_6 (QuantizedDe  (None, 14, 14, 128)      1408
 pthwiseConv2D)

 pw_separable_6 (QuantizedCo  (None, 14, 14, 256)      33024
 nv2D)

 pw_separable_6/relu (Quanti  (None, 14, 14, 256)      512
 zedReLU)

 dw_separable_7 (QuantizedDe  (None, 14, 14, 256)      2816
 pthwiseConv2D)

 pw_separable_7 (QuantizedCo  (None, 14, 14, 256)      65792
 nv2D)

 pw_separable_7/relu (Quanti  (None, 14, 14, 256)      512
 zedReLU)

 dw_separable_8 (QuantizedDe  (None, 14, 14, 256)      2816
 pthwiseConv2D)

 pw_separable_8 (QuantizedCo  (None, 14, 14, 256)      65792
 nv2D)

 pw_separable_8/relu (Quanti  (None, 14, 14, 256)      512
 zedReLU)

 dw_separable_9 (QuantizedDe  (None, 14, 14, 256)      2816
 pthwiseConv2D)

 pw_separable_9 (QuantizedCo  (None, 14, 14, 256)      65792
 nv2D)

 pw_separable_9/relu (Quanti  (None, 14, 14, 256)      512
 zedReLU)

 dw_separable_10 (QuantizedD  (None, 14, 14, 256)      2816
 epthwiseConv2D)

 pw_separable_10 (QuantizedC  (None, 14, 14, 256)      65792
 onv2D)

 pw_separable_10/relu (Quant  (None, 14, 14, 256)      512
 izedReLU)

 dw_separable_11 (QuantizedD  (None, 14, 14, 256)      2816
 epthwiseConv2D)

 pw_separable_11 (QuantizedC  (None, 14, 14, 256)      65792
 onv2D)

 pw_separable_11/relu (Quant  (None, 14, 14, 256)      512
 izedReLU)

 dw_separable_12 (QuantizedD  (None, 7, 7, 256)        2816
 epthwiseConv2D)

 pw_separable_12 (QuantizedC  (None, 7, 7, 512)        131584
 onv2D)

 pw_separable_12/relu (Quant  (None, 7, 7, 512)        1024
 izedReLU)

 dw_separable_13 (QuantizedD  (None, 7, 7, 512)        5632
 epthwiseConv2D)

 pw_separable_13 (QuantizedC  (None, 7, 7, 512)        262656
 onv2D)

 pw_separable_13/relu (Quant  (None, 7, 7, 512)        1024
 izedReLU)

 dw_1conv (QuantizedDepthwis  (None, 7, 7, 512)        5632
 eConv2D)

 pw_1conv (QuantizedConv2D)  (None, 7, 7, 1024)        525312

 pw_1conv/relu (QuantizedReL  (None, 7, 7, 1024)       2048
 U)

 dw_2conv (QuantizedDepthwis  (None, 7, 7, 1024)       11264
 eConv2D)

 pw_2conv (QuantizedConv2D)  (None, 7, 7, 1024)        1049600

 pw_2conv/relu (QuantizedReL  (None, 7, 7, 1024)       2048
 U)

 dw_3conv (QuantizedDepthwis  (None, 7, 7, 1024)       11264
 eConv2D)

 pw_3conv (QuantizedConv2D)  (None, 7, 7, 1024)        1049600

 pw_3conv/relu (QuantizedReL  (None, 7, 7, 1024)       2048
 U)

 dw_detection_layer (Quantiz  (None, 7, 7, 1024)       11264
 edDepthwiseConv2D)

 pw_detection_layer (Quantiz  (None, 7, 7, 35)         35875
 edConv2D)

 dequantizer (Dequantizer)   (None, 7, 7, 35)          0

=================================================================
Total params: 3,579,555
Trainable params: 3,555,523
Non-trainable params: 24,032
_________________________________________________________________
# Define the final reshape and build the model
output = Reshape((grid_size[1], grid_size[0], num_anchors, 4 + 1 + classes),
                 name="YOLO_output")(model_keras.output)
model_keras = Model(model_keras.input, output)

# Create the mAP evaluator object
num_images = 100

map_evaluator = MapEvaluation(model_keras, val_dataset.take(num_images),
                              num_images, labels, anchors)

# Compute the scores for all validation images
start = timer()
mAP, average_precisions = map_evaluator.evaluate_map()
end = timer()

for label, average_precision in average_precisions.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP))
print(f'Keras inference on {num_images} images took {end-start:.2f} s.\n')
car 0.5109
person 0.4080
mAP: 0.4595
Keras inference on 100 images took 15.95 s.

6. Conversion to Akida

6.1 Convert to Akida model

The last YOLO_output layer that was added for splitting channels into values for each box must be removed before Akida conversion.

# Rebuild a model without the last layer
compatible_model = Model(model_keras.input, model_keras.layers[-2].output)

When converting to an Akida model, we just need to pass the Keras model to cnn2snn.convert.

from cnn2snn import convert

model_akida = convert(compatible_model)
model_akida.summary()
                 Model Summary
________________________________________________
Input shape    Output shape  Sequences  Layers
================================================
[224, 224, 3]  [7, 7, 35]    1          33
________________________________________________

__________________________________________________________________________
Layer (type)                          Output shape    Kernel shape

==================== SW/conv_0-dequantizer (Software) ====================

conv_0 (InputConv2D)                  [112, 112, 16]  (3, 3, 3, 16)
__________________________________________________________________________
conv_1 (Conv2D)                       [112, 112, 32]  (3, 3, 16, 32)
__________________________________________________________________________
conv_2 (Conv2D)                       [56, 56, 64]    (3, 3, 32, 64)
__________________________________________________________________________
conv_3 (Conv2D)                       [56, 56, 64]    (3, 3, 64, 64)
__________________________________________________________________________
dw_separable_4 (DepthwiseConv2D)      [28, 28, 64]    (3, 3, 64, 1)
__________________________________________________________________________
pw_separable_4 (Conv2D)               [28, 28, 128]   (1, 1, 64, 128)
__________________________________________________________________________
dw_separable_5 (DepthwiseConv2D)      [28, 28, 128]   (3, 3, 128, 1)
__________________________________________________________________________
pw_separable_5 (Conv2D)               [28, 28, 128]   (1, 1, 128, 128)
__________________________________________________________________________
dw_separable_6 (DepthwiseConv2D)      [14, 14, 128]   (3, 3, 128, 1)
__________________________________________________________________________
pw_separable_6 (Conv2D)               [14, 14, 256]   (1, 1, 128, 256)
__________________________________________________________________________
dw_separable_7 (DepthwiseConv2D)      [14, 14, 256]   (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_7 (Conv2D)               [14, 14, 256]   (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_8 (DepthwiseConv2D)      [14, 14, 256]   (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_8 (Conv2D)               [14, 14, 256]   (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_9 (DepthwiseConv2D)      [14, 14, 256]   (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_9 (Conv2D)               [14, 14, 256]   (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_10 (DepthwiseConv2D)     [14, 14, 256]   (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_10 (Conv2D)              [14, 14, 256]   (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_11 (DepthwiseConv2D)     [14, 14, 256]   (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_11 (Conv2D)              [14, 14, 256]   (1, 1, 256, 256)
__________________________________________________________________________
dw_separable_12 (DepthwiseConv2D)     [7, 7, 256]     (3, 3, 256, 1)
__________________________________________________________________________
pw_separable_12 (Conv2D)              [7, 7, 512]     (1, 1, 256, 512)
__________________________________________________________________________
dw_separable_13 (DepthwiseConv2D)     [7, 7, 512]     (3, 3, 512, 1)
__________________________________________________________________________
pw_separable_13 (Conv2D)              [7, 7, 512]     (1, 1, 512, 512)
__________________________________________________________________________
dw_1conv (DepthwiseConv2D)            [7, 7, 512]     (3, 3, 512, 1)
__________________________________________________________________________
pw_1conv (Conv2D)                     [7, 7, 1024]    (1, 1, 512, 1024)
__________________________________________________________________________
dw_2conv (DepthwiseConv2D)            [7, 7, 1024]    (3, 3, 1024, 1)
__________________________________________________________________________
pw_2conv (Conv2D)                     [7, 7, 1024]    (1, 1, 1024, 1024)
__________________________________________________________________________
dw_3conv (DepthwiseConv2D)            [7, 7, 1024]    (3, 3, 1024, 1)
__________________________________________________________________________
pw_3conv (Conv2D)                     [7, 7, 1024]    (1, 1, 1024, 1024)
__________________________________________________________________________
dw_detection_layer (DepthwiseConv2D)  [7, 7, 1024]    (3, 3, 1024, 1)
__________________________________________________________________________
pw_detection_layer (Conv2D)           [7, 7, 35]      (1, 1, 1024, 35)
__________________________________________________________________________
dequantizer (Dequantizer)             [7, 7, 35]      N/A
__________________________________________________________________________

6.2 Check performance

Akida model accuracy is tested on the first n images of the validation set.

# Create the mAP evaluator object
map_evaluator_ak = MapEvaluation(model_akida,
                                 val_dataset.take(num_images),
                                 num_images,
                                 labels,
                                 anchors,
                                 is_keras_model=False)

# Compute the scores for all validation images
start = timer()
mAP_ak, average_precisions_ak = map_evaluator_ak.evaluate_map()
end = timer()

for label, average_precision in average_precisions_ak.items():
    print(labels[label], '{:.4f}'.format(average_precision))
print('mAP: {:.4f}'.format(mAP_ak))
print(f'Akida inference on {num_images} images took {end-start:.2f} s.\n')
car 0.5147
person 0.3945
mAP: 0.4546
Akida inference on 100 images took 14.61 s.

6.3 Show predictions for a random image

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches

from akida_models.detection.processing import preprocess_image, decode_output

# Shuffle the data to take a random test image
val_dataset = val_dataset.shuffle(buffer_size=num_images)

input_shape = model_akida.layers[0].input_dims

# Load the image
raw_image = next(iter(val_dataset))['image']

# Keep the original image size for later bounding boxes rescaling
raw_height, raw_width, _ = raw_image.shape

# Pre-process the image
image = preprocess_image(raw_image, input_shape)
input_image = image[np.newaxis, :].astype(np.uint8)

# Call evaluate on the image
pots = model_akida.predict(input_image)[0]

# Reshape the potentials to prepare for decoding
h, w, c = pots.shape
pots = pots.reshape((h, w, len(anchors), 4 + 1 + len(labels)))

# Decode potentials into bounding boxes
raw_boxes = decode_output(pots, anchors, len(labels))

# Rescale boxes to the original image size
pred_boxes = np.array([[
    box.x1 * raw_width, box.y1 * raw_height, box.x2 * raw_width,
    box.y2 * raw_height,
    box.get_label(),
    box.get_score()
] for box in raw_boxes])

fig = plt.figure(num='VOC2007 car and person detection by Akida')
ax = fig.subplots(1)
img_plot = ax.imshow(np.zeros(raw_image.shape, dtype=np.uint8))
img_plot.set_data(raw_image)

for box in pred_boxes:
    rect = patches.Rectangle((box[0], box[1]),
                             box[2] - box[0],
                             box[3] - box[1],
                             linewidth=1,
                             edgecolor='r',
                             facecolor='none')
    ax.add_patch(rect)
    class_score = ax.text(box[0],
                          box[1] - 5,
                          f"{labels[int(box[4])]} - {box[5]:.2f}",
                          color='red')

plt.axis('off')
plt.show()
[Figure: car and person detections predicted by the Akida model, drawn as red bounding boxes with class labels and confidence scores]

Total running time of the script: 1 minute 15.124 seconds
