Akida user guide
Like many other machine learning frameworks, the core data structures of Akida are layers and models, and users familiar with Keras, Tensorflow or Pytorch should be on familiar grounds.
The main difference between Akida and other machine learning framework is that instead of modeling traditional Artificial Neural Networks, Akida models aim at representing Spiking Neural Networks, i.e. interconnected graphs of neurons that fire when their potential reaches a predefined threshold.
On another note, unlike other frameworks, Akida layers only use integer inputs, outputs and weights.
Concretely, Akida layers can be represented by the combination of standard machine learning layers into computation blocks:
a Convolutional or Dense layer to evaluate the Spiking Neuron Potential,
the addition of an inverted bias to represent the firing threshold,
a ReLu activation to represent the neuron spike.
Three principal layer types are available:
FullyConnected – sometimes described as ‘dense’
Convolutional – or ‘weight-sharing’
SeparableConvolutional, - a less computationally intensive convolutional layer.
The weights of Akida layers are N-bit integer: please refer to the hardware constraints for details of the supported bitwidth for each layer.
Akida inputs and outputs are 4-dimensional tensors whose first dimension is the index of a specific sample.
The inputs of Akida layers are N-bit integer: please refer to the hardware constraints for details of the supported bitwidth for each layer.
A versatile machine learning framework
The Akida machine learning framework supports two main types of models:
native SNN models and,
deep-learning SNN models.
Native Spiking Neural Networks
Native SNN models are typically composed of a few Dense layers. They require most of the time a specific feature extractor to feed the Dense layers. This feature extractor can either be completely external to the model, or be a deep-learning SNN submodel as defined below.
Deep-learning Spiking Neural Networks
Deep-learning SNN models are genuine CNN models converted to Akida SNN models.
As a consequence, deep-learning professionals do not need to learn any new framework to start using Akida: they can simply craft their models in TensorFlow/Keras and convert them to Akida SNN models using the CNN2SNN seamless conversion tool.
Unlike genuine CNN, deep-learning SNN cannot be trained online using back-propagation: for deep models where online learning is required, it is therefore recommended to import the weights of early layers from a pre-trained CNN, and to apply Akida Edge learning only on the last layer.
The Sequential model
Specifying the model
Akida models are defined using the sequential API.
This comprises creating a
Model object and adding layers to it using the
Layers are built with a name and a list of named parameters that are described in the sections below.
Example of sequential definition of a model:
from akida import Model, InputData, FullyConnected model = Model() model.add(InputData(name="input", input_shape=(32, 32, 1))) model.add(FullyConnected(name="fully", units=32, threshold=40))
method prints a description of the model architecture.
Accessing layer parameters and weights
The layers of a
Model can be accessed either by their index or by their
first_layer = model.layers fc = model.get_layer("fully")
Each layer type has a different set of attributes, available through the
fc = model.get_layer("fully") n = fc.parameters.units fc.parameters.weights_bits = 2
Some layer types also have variables containing weights and thresholds:
fc = model.get_layer("fully") weights = fc.variables["weights"] weights[0, 0, 0, 0] = 1 fc.variables["weights"] = weights
method allows to infer the outputs of a specific set of inputs.
Like inference methods in other machine learning frameworks, it simply returns the integer potentials or activations of the last layer.
import numpy as np ... # Prepare one sample input_shape = (1,) + tuple(model.input_shape) inputs = np.ones(input_shape, dtype=np.uint8) # Inference outputs = model.forward(inputs)
method is very similar to the forward method, but is specifically designed to
replicate the float outputs of a converted CNN: instead of the integer potentials,
it returns float values representing the integer potentials shifted and rescaled using
per-axis constants evaluated during the CNN conversion.
After an inference, the
Model .statistics member provides relevant inference statistics.
import numpy as np ... # Prepare one sample input_shape = (1,) + tuple(model.input_shape) inputs = np.ones(input_shape, dtype=np.uint8) # Inference outputs = model.evaluate(inputs) assert outputs.dtype == np.float32
Saving and loading
Model object can be saved to disk for future use with the
method that needs a path for the model.
The model will be saved as a file with an .fbz extension that describes its architecture and weights.
A saved model can be reloaded using the
Model object constructor with the
full path of saved file as a string argument. This will automatically load the
weights associated to the model.
model.save("demo_CharacterDVS.fbz") loaded_model = Model("demo_CharacterDVS.fbz")
Input layer types
The first layer of a model must be one of two possible input layer types:
Data-Processing layer types
After the input layer all subsequent layers will be data-processing layers.
Each layer contains several neurons that are connected to the layer inputs according to different topologies defined by the layer type. A weight is assigned to each connection, and that weight is combined with the input to modify the neuron potential.
When the neuron potentials have been evaluated, the layer feeds them to an activation function that may or may not emit a spike.
A data-processing layer can be one of three types:
FullyConnected – each neuron is connected to members of the full set of possible inputs – hence ‘fully connected’, even though a much smaller number of connections are likely to be non-zero.
Convolutional – each neuron’s connection weights express a localized filter – typically a region that is a small fraction of the input’s height and width. This filter is tested across all x and y positions.
The FullyConnected layers can be trained using the Akida Edge learning algorithm if they are the last layer of a model.
The Akida activation function uses a quantization scheme to evaluate the neuron
response when its potential goes beyond its firing threshold.
The intensity of the response is measured by dividing the difference between the
potential and the threshold in several quantization intervals that correspond to
a set of quantized spike values. The default quantization scheme is
whenever the neuron potential is above the threshold, a spike with a value of
one is emitted.
More generally, if we denote:
T the threshold,
s the length of a quantization interval,
p the neuron potential,
Q the quantized activation values.
T + n * s < p <= T + (n + 1)*s => response = Q[n]
All data-processing layers share the following activation parameters:
threshold: integer value which defines the threshold for neurons to fire or generate an event. When using binary weights and activations, the activation level of neurons cannot exceed the
act_bits: < one of
[1, 2, 4]> Defines the number of bits used to quantize the neuron response (defaults to one bit for binary). Quantized activations are integers in the range
[1, 2^(weights_bits) -1].
act_step: a float value, defining the length of the potential quantization intervals for act_bits = 4. For 2 bits, this is 1/4 of the length of the potentials intervals and it is not relevant for 1 bit.
pool_type = Average]
pool_size: tuple of integer values, sets the width and height of the patch used to perform the pooling. If not specified it performs a global pooling.
[optional] pool_type: PoolType Sets the effective pooling type (defaults to NoPooling):
NoPooling– no pooling.
Max– computing the maximum of each region.
Average– computing the average values of each region.
pool_stride: tuple of integer values, sets the horizontal and vertical strides applied when sliding the pooling patches. If not specified, a stride of
Model Hardware Mapping
By default, Akida models are implicitly mapped on a software backend: in other words, their inference is computed on the host CPU.
In order to perform the inference of a model on hardware, the corresponding
Model object must first be mapped on a specific
Device object represents an Akida device, which is entirely
Discovering Hardware Devices
The list of hardware devices detected on a specific host is available using the devices() method.
from akida import devices device = devices() print(device.version)
Most of the time,
Device objects are real hardware devices, but virtual
devices can also be created to allow the mapping of a
Model on a host that is
not connected to a hardware device.
Virtual devices are simply created by specifying their hardware revision and mesh topology:
from akida import Device, NSoC_v2 # Assuming mesh has been defined above device = Device(NSoC_v2, mesh)
Mapping a model on a specific device is as simple as calling the
When mapping a model on a device, the information related to the layers and related variables are processed in such way that the selected device can perform an inference. If the Model contains layers that are not hardware compatible or is too big to fit on the device, it will be split in multiple sequences.
The number of sequences, program size for each and how they are mapped are included in
Model .summary() output
after it has been mapped on a device.
Advanced Mapping Details and Hardware Devices Usage
Model .map() might create more
than one “sequence”. In this case, when inference methods are used, each sequence will be chain
loaded on the device to process the given input. Sequences can be obtained using the
property, that will return a list of sequence objects. The program used to load
one sequence can be obtained programmatically.
model.map(device) print(len(model.sequences)) # Assume there is at least one sequence. sequence = model.sequences # Check program size print(len(sequence.program))
The information found in the
Model .summary() can be used to
modify a model to make it fit into less sequences, and program size can be
used to estimate the flash and memory usage on an embedded system that would
use the device.
Once the model has been mapped, the sequences mapped in the Hardware run on the device, and the sequences mapped in the Software run on the CPU.
One can also force the model to be mapped as one sequence in the hardware device
only by setting the parameter
hw_only to True (by default the value is False).
See the .map() method API for more details.
Note: an exception will be raised if the Model cannot be mapped entirely on the device.
Once the model has been mapped, the inference happens only on the device, and not on the host CPU except for passing inputs and fetching outputs.
It is possible to retrieve fps and power performances when inference happens on a device.
Enabling power measurement is simply done by:
device.soc.power_measurement_enabled = True
After sending data for inference and performances measurements can be retrieved from the model statistics.
An example of power and fps performances is given in the AkidaNet/ImageNet tutorial.
Using Akida Edge learning
The Akida Edge learning is a unique feature of the Akida IP.
In this mode, an Akida Layer will typically be compiled with specific learning parameters and then undergo a period of feed-forward unsupervised or semi-supervised training by letting it process inputs generated by previous layers from a relevant dataset.
Once a layer has been compiled, new learning episodes can be resumed at any time, even after the model has been saved and reloaded.
Only the last layer of a model can be trained with Akida Edge Learning and must fulfill the following constraints:
must be of type FullyConnected,
must have binary weight,
must receive binary inputs.
Compiling a layer
For a layer to learn using Akida Edge Learning, it must first be compiled using
Model .compile method.
The following learning parameters can be specified when compiling a layer:
num_weights: integer value which defines the number of connections for each neuron and is constant across neurons. When determining a value for
num_weightsnote that the total number of available connections for a Convolutional layer is not set by the dimensions of the input to the layer, but by the dimensions of the kernel. Total connections =
num_featuresis typically the
unitsof the preceding layer.
num_weightsshould be much smaller than this value – not more than half, and often much less.
num_classes: integer value, representing the number of classes in the dataset. Defining this value sets the learning to a ‘labeled’ mode, when the layer is initialized. The neurons are divided into groups of equal size, one for each input data class. When an input packet is sent with a label included, only the neurons corresponding to that input class are allowed to learn.
initial_plasticity: floating point value, range 0–1 inclusive (defaults to 1). It defines the initial plasticity of each neuron’s connections or how easily the weights will change when learning occurs; similar in some ways to a learning rate. Typically, this can be set to 1, especially if the model is initialized with random weights. Plasticity can only decrease over time, never increase; if set to 0 learning will never occur in the model.
min_plasticity: floating point value, range 0–1 inclusive (defaults to 0.1). It defines the minimum level to which plasticity will decay.
plasticity_decay: floating point value, range 0–1 inclusive (defaults to 0.25). It defines the decay of plasticity with each learning step, relative to the
learning_competition: floating point value, range 0–1 inclusive (defaults to 0). It controls competition between neurons. This is a rather subtle parameter since there is always substantial competition in learning between neurons. This parameter controls the competition from neurons that have already learned – when set to zero, a neuron that has already learned a given feature will not prevent other neurons from learning similar features. As
learning_competitionincreases such neurons will exert more competition. This parameter can, however, have serious unintended consequences for learning stability; we recommend that it should be kept low, and probably never exceed 0.5.
The only mandatory parameter is the number of active (non-zero) connections that
each of the layer neurons has with the previous layer, expressed as the number
weights for each neuron.
Optimizing this value is key to achieving high accuracy in the Akida NSoC. Broadly speaking, the number of weights should be related to the number of events expected to compose the items’ or item’s sub-features of interest.
Tips to set Akida learning parameters are detailed in the dedicated example.