CNN2SNN Toolkit API
Akida version
- class cnn2snn.AkidaVersion(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]
- v1 = Akida 1.0 SOC and IP
- v2 = Akida 2.0 IP
- cnn2snn.get_akida_version()[source]
Get the target akida version for model conversion.
- Returns:
the target Akida version, by default AkidaVersion.v2
- Return type:
AkidaVersion
- cnn2snn.set_akida_version(version)[source]
Select the target akida version for model conversion.
- Parameters:
version (AkidaVersion) – the target Akida version.
Conversion
- cnn2snn.convert(model, file_path=None, input_scaling=None)[source]
Converts a Keras or ONNX quantized model to an Akida one.
This method is compatible with models quantized with cnn2snn.quantize() and quantizeml.quantize(). To see the difference between the two conversion processes, check the methods _convert_cnn2snn and _convert_quantizeml below.
- Parameters:
model (tf.keras.Model or onnx.ModelProto) – a model to convert.
file_path (str, optional) – destination for the akida model. (Default value = None)
input_scaling (2 elements tuple, optional) – value of the input scaling. (Default value = None)
- Returns:
an Akida model.
- Return type:
- cnn2snn.check_model_compatibility(model, device=None, input_dtype='uint8')[source]
Checks that a float Keras or ONNX model is Akida compatible.
The process stops with an exception on the first incompatibility encountered. The problematic step (quantization, conversion or mapping) is indicated in the exception message. If errors occur, issues must be fixed iteratively in order to obtain an Akida-compatible model. Note that the version context is used to determine compatibility.
- Parameters:
model (tf.keras.Model or onnx.ModelProto) – the model to check.
device (akida.HwDevice, optional) – the device to map on. If a device is provided, there will be a check that the model can fully run on such device. Defaults to None.
input_dtype (np.dtype or str, optional) – expected model input format. If given as a string, should follow numpy string type requirements. Defaults to 'uint8'.
- Raises:
ValueError – if model type is incompatible with Akida version context.
ValueError – if device is incompatible with Akida version context.
Exception – if an incompatibility is encountered on quantization/conversion/mapping steps.
Legacy quantization API
While it is possible to quantize Akida 1.0 models using cnn2snn legacy quantization blocks, such usage is deprecated. Use the QuantizeML tool to quantize a model whenever possible.
Utils
A detailed description of the input_scaling parameter is given in the user guide.
- cnn2snn.compatibility_checks.check_model_compatibility(model)[source]
Checks if a Keras model is compatible for cnn2snn conversion.
This function doesn’t convert the Keras model to an Akida model but only checks if the model design is compatible.
Note that this function doesn’t check if the model is compatible with Akida hardware. To check compatibility with a specific hardware device, convert the model and call model.map with this device as argument.
1. How to build a compatible Keras quantized model?
The following lines give details and constraints on how to build a Keras model compatible for the conversion to an Akida model.
2. General information about layers
An Akida layer must be seen as a block of Keras layers starting with a processing layer (Conv2D, SeparableConv2D, Dense). All blocks of Keras layers except the last block must have exactly one activation layer (ReLU or ActivationDiscreteRelu). Other optional layers can be present in a block such as a pooling layer or a batch normalization layer. Here are all the supported Keras layers for an Akida-compatible model:
Processing layers:
  tf.keras Conv2D/SeparableConv2D/Dense
  cnn2snn QuantizedConv2D/QuantizedSeparableConv2D/QuantizedDense
Activation layers:
  tf.keras ReLU
  cnn2snn ActivationDiscreteRelu
  any increasing activation function (only for the last block of layers) such as softmax or sigmoid, set as last layer. This layer must derive from tf.keras.layers.Activation, and it will be removed during conversion to an Akida model.
Pooling layers:
  MaxPool2D
  GlobalAvgPool2D
BatchNormalization
Dropout
Flatten
Input
Reshape
Example of a block of Keras layers:
        ----------
        | Conv2D |
        ----------
            ||
            \/
  ----------------------
  | BatchNormalization |
  ----------------------
            ||
            \/
      -------------
      | MaxPool2D |
      -------------
            ||
            \/
--------------------------
| ActivationDiscreteRelu |
--------------------------
3. Constraints about inputs
An Akida model can accept two types of inputs: sparse events or 8-bit images. Whatever the input type, the Keras inputs must respect the following relation:
input_akida = scale * input_keras + shift
where the Akida inputs must be positive integers, the input scale must be a float value and the input shift must be an integer. In other words, scale * input_keras must be integers.
Depending on the input type:
if the inputs are events (sparse), the first layer of the Keras model can be any processing layer. The input shift must be zero.
if the inputs are images, the first layer must be a Conv2D layer.
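As a minimal sketch of the relation above (not part of the cnn2snn API), the hypothetical helper `to_akida_input` below applies `input_akida = scale * input_keras + shift` and checks the stated constraints. The scale and shift values are assumptions chosen for a Keras model fed with 8-bit pixels divided by 255, and zero is treated here as a valid Akida input.

```python
def to_akida_input(x_keras, scale, shift):
    """Apply input_akida = scale * input_keras + shift and check the constraints."""
    if not isinstance(shift, int):
        raise ValueError("shift must be an integer")
    scaled = scale * x_keras
    # scale * input_keras must be an integer (up to float rounding noise)
    if abs(scaled - round(scaled)) > 1e-6:
        raise ValueError(f"scale * input_keras = {scaled} is not an integer")
    x_akida = round(scaled) + shift
    if x_akida < 0:
        raise ValueError(f"Akida input {x_akida} is negative")
    return x_akida


# Assumed preprocessing: 8-bit pixels divided by 255 before entering Keras
pixels = [0, 17, 128, 255]
keras_inputs = [p / 255.0 for p in pixels]
akida_inputs = [to_akida_input(x, scale=255.0, shift=0) for x in keras_inputs]
```

With this choice of scale and shift, the Akida inputs are the original 8-bit pixel values, while a Keras input such as 0.5 with a scale of 1.0 would be rejected.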
4. Constraints about layers’ parameters
To be Akida-compatible, the Keras layers must observe the following rules:
all layers with the ‘data_format’ parameter must be ‘channels_last’
all processing quantized layers and ActivationDiscreteRelu must have a valid quantization bitwidth
a Dense layer must have an input shape of (N,) or (1, 1, N)
a BatchNormalization layer must have ‘axis’ set to -1 (default)
a BatchNormalization layer cannot have negative gammas
Reshape layers can only be used to transform a tensor of shape (N,) to a tensor of shape (1, 1, N), and vice-versa
only one pooling layer can be used in each block
a MaxPool2D layer must have the same ‘padding’ as the corresponding processing quantized layer
5. Constraints about the order of layers
To be Akida-compatible, the order of Keras layers must observe the following rules:
a block of Keras layers must start with a processing quantized layer
where present, a BatchNormalization/GlobalAvgPool2D layer must be placed before the activation
a Flatten layer can only be used before a Dense layer
an Activation layer other than ReLU can only be used in the last layer
- Parameters:
model (tf.keras.Model) – the model to check.
- cnn2snn.load_quantized_model(filepath, custom_objects=None, compile_model=True)[source]
Loads a quantized model saved in TF or HDF5 format.
If the model was compiled and trained before saving, its training state will be loaded as well. This function is a wrapper of tf.keras.models.load_model.
- Parameters:
filepath (string) – path to the saved model.
custom_objects (dict) – optional dictionary mapping names (strings) to custom classes or functions to be considered during deserialization.
compile_model (bool) – whether to compile the model after loading.
- Returns:
a Keras model instance.
- Return type:
keras.Model
Calibration
- cnn2snn.calibration.QuantizationSampler(model, samples, batch_size=None)[source]
A tool to inspect the layer outputs of a quantized model
The sampler is initialized with a quantized model and a set of samples used for the evaluation of the layer outputs. An optional batch size can be specified to avoid out-of-memory errors when using a GPU.
To evaluate the outputs of a specific layer, it must first be selected using the select_layer member.
Once done, three methods are available to inspect the layer outputs:
quantized_outputs returns the actual outputs of the layer,
float_outputs returns the outputs of the layer if its weights were not quantized,
quantization_error applies the keras.metrics.Metric passed as arguments to the difference between the float and quantized outputs.
Example
>>> # Evaluate the quantization MSE of a quantized layer
>>> import tensorflow as tf
>>> from tensorflow import keras
>>> import cnn2snn
>>> from cnn2snn.calibration import QuantizationSampler
>>> model = tf.keras.Sequential([
...     tf.keras.layers.Dense(5, input_shape=(3,)),
...     tf.keras.layers.ReLU()])
>>> model_quantized = cnn2snn.quantize(model,
...                                    weight_quantization=4,
...                                    activ_quantization=4)
>>> # Instantiate a QuantizationSampler with a few dataset samples
>>> # (samples is assumed to be an np.ndarray of shape (n, 3) defined beforehand)
>>> sampler = QuantizationSampler(model_quantized, samples)
>>> # Select the quantized layer
>>> sampler.select_layer(model_quantized.layers[0])
>>> # Evaluate the Mean Squared Error
>>> m = keras.metrics.MeanSquaredError()
>>> mse = sampler.quantization_error(m)
- Parameters:
model (keras.Model) – a quantized Keras model
samples (np.ndarray) – a set of calibration samples
batch_size (int) – the batch size used for evaluation.
- cnn2snn.calibration.bias_correction(model, samples, batch_size=None)[source]
Apply a corrective bias to quantized layers.
This implements the Bias Correction algorithm described in:
Data-Free Quantization Through Weight Equalization and Bias Correction
Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling
https://arxiv.org/abs/1906.04721
It is empirically demonstrated in the original paper that the weight quantization can introduce a biased error in the activations that is quite significant for low bitwidth weights (i.e. lower than 8-bit). This algorithm simply estimates the quantization bias on a set of samples, and subtracts it from the layer bias variable.
If the accuracy of the quantized model suffers a huge drop as compared to the original model, this simple correction can recover the largest part of the drop, but not all of it.
When optimizing a model, nothing is required but a set of samples for calibration (typically from the training dataset). Results depend on the model and dataset, but it has been observed empirically that there is no significant difference between models corrected with very few samples (16) and those corrected with a higher number of samples (1024).
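The idea can be sketched on a single dense layer in plain Python. This is an illustration of the principle described above (estimate the mean output shift caused by weight quantization on a few samples, then subtract it from the bias), not the cnn2snn implementation; all function names and values are hypothetical.

```python
def dense(x, weights, bias):
    """One-sample dense layer: y_j = sum_i x_i * weights[i][j] + bias[j]."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]


def mean_outputs(samples, weights, bias):
    """Average each output unit over the calibration samples."""
    outs = [dense(x, weights, bias) for x in samples]
    return [sum(col) / len(outs) for col in zip(*outs)]


def bias_correct(samples, float_weights, quant_weights, bias):
    """Subtract the estimated quantization bias from the layer bias."""
    float_mean = mean_outputs(samples, float_weights, bias)
    quant_mean = mean_outputs(samples, quant_weights, bias)
    quant_bias = [q - f for q, f in zip(quant_mean, float_mean)]
    return [b - e for b, e in zip(bias, quant_bias)]


# Hypothetical layer: float weights and the same weights rounded to a 0.25 grid
float_weights = [[0.26, -0.41], [0.13, 0.38]]
quant_weights = [[0.25, -0.50], [0.25, 0.50]]
bias = [0.1, -0.2]
samples = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0], [0.5, 4.0]]

corrected_bias = bias_correct(samples, float_weights, quant_weights, bias)
float_mean = mean_outputs(samples, float_weights, bias)
corrected_mean = mean_outputs(samples, quant_weights, corrected_bias)
```

On the calibration samples, the mean output of the quantized layer with the corrected bias matches the mean output of the float layer; the per-sample error is reduced but not removed.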
- Parameters:
model (keras.Model) – a quantized Keras Model
samples (np.ndarray) – a set of samples used for calibration
batch_size (int) – the batch size used when evaluating samples
- Returns:
a quantized Keras model whose biases have been corrected
- Return type:
keras.Model
- cnn2snn.calibration.adaround(model, samples, optimizer, epochs, loss=<keras.src.losses.MeanSquaredError object>, batch_size=None, include_activation=False)[source]
Optimize the rounding of quantized weights.
This implements the Adaround algorithm described in:
Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel, Rana Ali Amjad, Mart van Baalen, Christos Louizos, Tijmen Blankevoort
https://arxiv.org/abs/2004.10568
Instead of rounding weights to the nearest quantized value, Adaround introduces a tensor of continuous variables representing the fractional parts of the float weights, and thus formulates the minimization of the quantization error as a Quadratic Unconstrained Binary Optimization problem, iteratively pushing these variables towards 0 or 1 so as to minimize the error.
After the optimization, the quantization scales are preserved, but each weight is closer or equal to a quantized value.
When optimizing a model, the following must be provided:
a set of samples (typically from the training dataset),
an optimizer,
the maximum number of epochs (the optimization of a layer stops when all weights have been rounded).
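To illustrate why rounding to nearest is not always the best choice, the sketch below solves the underlying binary problem by exhaustive search on a tiny layer: each weight is rounded either down or up, and the combination minimizing the squared output error on the samples is kept. This brute-force search is a stand-in chosen for clarity, not the continuous relaxation used by Adaround or by cnn2snn; all names and values are hypothetical.

```python
import itertools
import math


def output(x, weights):
    """Single-neuron output for one sample."""
    return sum(xi * wi for xi, wi in zip(x, weights))


def rounding_error(samples, float_w, quant_w):
    """Sum of squared output differences over the samples."""
    return sum((output(x, float_w) - output(x, quant_w)) ** 2 for x in samples)


def best_rounding(samples, float_w, qstep):
    """Try every round-down/round-up combination and keep the best one."""
    floors = [math.floor(w / qstep) for w in float_w]
    best_w, best_err = None, float("inf")
    for ups in itertools.product((0, 1), repeat=len(float_w)):
        candidate = [(f + u) * qstep for f, u in zip(floors, ups)]
        err = rounding_error(samples, float_w, candidate)
        if err < best_err:
            best_w, best_err = candidate, err
    return best_w, best_err


samples = [[1.0, 1.0, 1.0], [2.0, 1.0, 0.5]]
float_w = [0.4, 0.4, 0.4]
qstep = 1.0  # assumed quantization step

nearest = [round(w / qstep) * qstep for w in float_w]
nearest_err = rounding_error(samples, float_w, nearest)
adaptive, adaptive_err = best_rounding(samples, float_w, qstep)
```

Here rounding to nearest sends every weight to zero, whereas rounding one weight up gives a much smaller output error. The exhaustive search grows as 2^n with the number of weights, which is why a gradient-based relaxation is used in practice.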
- Parameters:
model (keras.Model) – a quantized Keras Model
samples (np.ndarray) – a set of samples used for calibration
optimizer (tensorflow.keras.optimizers.Optimizer) – an optimizer
epochs (int) – the maximum number of epochs
loss (tensorflow.keras.losses.Loss) – the error loss function
batch_size (int) – the batch size used when evaluating samples
include_activation (bool) – quantization error is evaluated after activation.
- Returns:
a quantized Keras model whose weights have been optimized
- Return type:
tf.keras.Model
Transforms
- cnn2snn.transforms.sequentialize(model)[source]
Transform a Model into Sequential sub-models and Concatenate layers.
This function returns an equivalent model where all linear branches are replaced by a Sequential sub-model.
- Parameters:
model (tf.keras.Model) – a Keras model
- Returns:
a Keras model with Sequential sub-models
- Return type:
tf.keras.Model
- cnn2snn.transforms.syncretize(model)[source]
Align all linear branches of a Model with akida layer sequences.
The input model must be composed of Sequential submodels and Concatenate layers. This function will apply transformations on every Sequential submodel and returns an equivalent functional model with akida compatible sequences of layers.
- Parameters:
model (tf.keras.Model) – a Keras model with Sequential submodels.
- Returns:
a Keras model with akida-compatible Sequential submodels.
- Return type:
tf.keras.Model
- cnn2snn.transforms.invert_batchnorm_pooling(model)[source]
Inverts pooling and BatchNormalization layers in a Sequential model to have BN layer before pooling.
Having pool->BN or BN->pool is equivalent only if BN layer has no negative gammas.
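The equivalence condition can be checked on a single max-pooling window. The sketch below is illustrative only (hypothetical helper names, inference-mode batch normalization with assumed statistics): with a positive gamma the normalization is increasing and commutes with the maximum, while a negative gamma reverses the ordering and the two orders give different results.

```python
import math


def batchnorm(x, gamma, beta, mean=0.0, var=1.0, eps=1e-3):
    """Inference-mode batch normalization of a scalar."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta


def pool_then_bn(window, gamma, beta):
    """Max pooling followed by batch normalization."""
    return batchnorm(max(window), gamma, beta)


def bn_then_pool(window, gamma, beta):
    """Batch normalization followed by max pooling."""
    return max(batchnorm(v, gamma, beta) for v in window)


window = [-2.0, 0.5, 3.0]

same_positive = math.isclose(pool_then_bn(window, 2.0, 1.0),
                             bn_then_pool(window, 2.0, 1.0))
same_negative = math.isclose(pool_then_bn(window, -2.0, 1.0),
                             bn_then_pool(window, -2.0, 1.0))
```

With these values `same_positive` is true and `same_negative` is false.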
- Parameters:
model (tf.keras.Model) – a Sequential Keras model.
- Returns:
a Sequential Keras model.
- Return type:
tf.keras.Model
- cnn2snn.transforms.fold_batchnorm(model)[source]
Folds BatchNormalization layers into the preceding neural layers of a Sequential model.
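As an illustration of what folding means, the sketch below applies the standard folding identity to a dense layer followed by an inference-mode batch normalization. It is a plain-Python sketch with hypothetical names and values, not the cnn2snn code: each output unit's weights are multiplied by gamma / sqrt(var + eps), and the bias becomes (bias - mean) * gamma / sqrt(var + eps) + beta.

```python
import math

EPS = 1e-3  # assumed batch normalization epsilon


def dense(x, weights, bias):
    """One-sample dense layer with weights[i][j] linking input i to unit j."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]


def batchnorm(y, gamma, beta, mean, var):
    """Inference-mode batch normalization applied per output unit."""
    return [g * (v - m) / math.sqrt(s + EPS) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]


def fold_batchnorm_dense(weights, bias, gamma, beta, mean, var):
    """Return weights and bias of a dense layer equivalent to dense + BN."""
    scale = [g / math.sqrt(s + EPS) for g, s in zip(gamma, var)]
    folded_w = [[w * sc for w, sc in zip(row, scale)] for row in weights]
    folded_b = [(b - m) * sc + be
                for b, m, sc, be in zip(bias, mean, scale, beta)]
    return folded_w, folded_b


weights = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.3]
gamma, beta = [1.5, 0.8], [0.2, -0.1]
mean, var = [0.4, -0.2], [2.0, 0.5]
x = [1.0, -2.0]

reference = batchnorm(dense(x, weights, bias), gamma, beta, mean, var)
folded_w, folded_b = fold_batchnorm_dense(weights, bias, gamma, beta, mean, var)
folded = dense(x, folded_w, folded_b)
```

The folded layer produces the same outputs as the original pair of layers, up to floating point rounding.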
- Parameters:
model (tf.keras.Model) – a Sequential Keras model.
- Returns:
a Sequential Keras model.
- Return type:
tf.keras.Model
- cnn2snn.transforms.weights_homogeneity(model)[source]
Gives an estimation of the homogeneity of layer weights.
For each Conv or Dense layer in the model, this compares the range of the weights of each filter with the range of the whole tensor. The score for each filter is expressed as a homogeneity rate (1 is the maximum), and the layer homogeneity rate is the mean of all filter rates.
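The exact formula is not given here. As a rough sketch under an assumed definition (the rate of a filter is its absolute maximum divided by the absolute maximum of the whole tensor), the hypothetical function below shows how such a rate behaves; the actual cnn2snn computation may differ.

```python
def homogeneity_rate(filters):
    """Mean of per-filter rates, where filters[f] lists the weights of filter f.

    Assumed definition: rate = max(|w| in filter) / max(|w| in tensor).
    """
    tensor_range = max(abs(w) for f in filters for w in f)
    rates = [max(abs(w) for w in f) / tensor_range for f in filters]
    return sum(rates) / len(rates)


# All filters span a similar range: the rate reaches its maximum of 1
homogeneous = [[0.5, -0.4], [0.45, -0.5], [-0.5, 0.3]]
# Two filters are much smaller than the first one: the rate drops
heterogeneous = [[0.5, -0.4], [0.05, -0.02], [0.01, 0.005]]

rate_homogeneous = homogeneity_rate(homogeneous)
rate_heterogeneous = homogeneity_rate(heterogeneous)
```

A low rate suggests that a per-tensor quantization would waste most of its levels on the few filters with large weights.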
- Parameters:
model (tf.keras.Model) – a Keras model.
- Returns:
rates indexed by layer names.
- Return type:
dict
- cnn2snn.transforms.normalize_separable_layer(layer)[source]
This normalizes the depthwise weights of a SeparableConv2D.
In order to limit the quantization error when using a per-tensor quantization of depthwise weights, this rescales all depthwise weights to fit within the [-1, 1] range. To preserve the output of the layer, each depthwise kernel is rescaled independently to the [-1, 1] interval by dividing all weights by the absolute maximum value, and inversely, all pointwise filters ‘looking’ at these kernels are multiplied by the same value.
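The rescaling can be sketched on one spatial position of a separable convolution, using plain lists. This is an illustration with hypothetical names and values, not the cnn2snn code: `depthwise[c]` is the kernel applied to input channel c, and `pointwise[c][f]` is the pointwise weight linking channel c to output filter f.

```python
def separable_output(patches, depthwise, pointwise):
    """Depthwise step per channel, then pointwise mix across channels."""
    dw_out = [sum(x * w for x, w in zip(patch, kernel))
              for patch, kernel in zip(patches, depthwise)]
    n_filters = len(pointwise[0])
    return [sum(dw_out[c] * pointwise[c][f] for c in range(len(dw_out)))
            for f in range(n_filters)]


def normalize_separable(depthwise, pointwise):
    """Scale each depthwise kernel into [-1, 1], compensating in pointwise."""
    new_dw, new_pw = [], []
    for kernel, pw_row in zip(depthwise, pointwise):
        max_abs = max(abs(w) for w in kernel)
        if max_abs == 0:
            # An all-zero kernel cannot be rescaled; keep it unchanged
            max_abs = 1.0
        new_dw.append([w / max_abs for w in kernel])
        new_pw.append([p * max_abs for p in pw_row])
    return new_dw, new_pw


patches = [[1.0, 2.0, -1.0], [0.5, -0.5, 2.0]]
depthwise = [[4.0, -2.0, 1.0], [0.1, 0.05, -0.2]]
pointwise = [[0.5, -1.0], [2.0, 3.0]]

before = separable_output(patches, depthwise, pointwise)
new_dw, new_pw = normalize_separable(depthwise, pointwise)
after = separable_output(patches, new_dw, new_pw)
```

After normalization both depthwise kernels have an absolute maximum of 1, so a single per-tensor scale factor suits them equally well, and the layer output is unchanged.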
- Parameters:
layer (tf.keras.layers.SeparableConv2D) – a Keras SeparableConv2D layer.
- cnn2snn.transforms.normalize_separable_model(model)[source]
This normalizes the depthwise weights of all SeparableConv2D in a Model.
- Parameters:
model (tf.keras.Model) – a Keras model.
- Returns:
a new Keras model with normalized depthwise weights in SeparableConv2D layers.
- Return type:
tf.keras.Model
- cnn2snn.transforms.reshape(model_keras, input_x, input_y)[source]
Rescales the model by changing its input size.
- Parameters:
model_keras (tf.keras.Model) – Keras model to rescale
input_x (int) – desired model input first dimension
input_y (int) – desired model input second dimension
- Returns:
the rescaled model
- Return type:
keras.Model
Constraint
Quantization
- cnn2snn.quantize(model, weight_quantization=0, activ_quantization=0, input_weight_quantization=None, fold_BN=True, quantizer_function=None)[source]
Converts a standard sequential Keras model to a CNN2SNN Keras quantized model, compatible for Akida conversion.
This function returns a Keras model where the standard neural layers (Conv2D, SeparableConv2D, Dense) and the ReLU activations are replaced with CNN2SNN quantized layers (QuantizedConv2D, QuantizedSeparableConv2D, QuantizedDense, QuantizedRelu).
Several transformations are applied to the model:
- the order of MaxPool and BatchNormalization layers is inverted so that BatchNormalization always happens first,
- the batch normalization layers are folded into the previous layers.
This new model can be either directly converted to akida, or first retrained for a few epochs to recover any accuracy loss.
- Parameters:
model (tf.keras.Model) – a standard Keras model
weight_quantization (int) – sets all weights in the model to have a particular quantization bitwidth except for the weights in the first layer.
  '0' implements floating point 32-bit weights.
  '2' through '8' implements n-bit weights where n is from 2-8 bits.
activ_quantization (int) – sets all activations in the model to have a particular activation quantization bitwidth.
  '0' implements floating point 32-bit activations.
  '1' through '8' implements n-bit activations where n is from 1-8 bits.
input_weight_quantization (int) – sets weight quantization in the first layer. Defaults to weight_quantization value.
  'None' implements the same bitwidth as the other weights.
  '0' implements floating point 32-bit weights.
  '2' through '8' implements n-bit weights where n is from 2-8 bits.
fold_BN (bool) – enable folding batch normalization layers with their corresponding neural layer.
quantizer_function (function) – callable that takes as argument the layer instance to be quantized and the corresponding default quantizer and returns the quantizer to use.
- Returns:
a quantized Keras model
- Return type:
tf.keras.Model
- cnn2snn.quantize_layer(model, target_layer, bitwidth, quantizer_function=None)[source]
Quantizes a specific layer with the given bitwidth.
This function returns a Keras model where the target layer is quantized. All other layers are preserved. If the target layer is a native Keras layer (Conv2D, SeparableConv2D, Dense, ReLU), it is replaced by a CNN2SNN quantized layer (QuantizedConv2D, QuantizedSeparableConv2D, QuantizedDense, ActivationDiscreteRelu). If the target layer is an already quantized layer, only the bitwidth is modified.
Examples
>>> # Quantize a layer of a native Keras model
>>> model = tf.keras.Sequential([
...     tf.keras.layers.Dense(5, input_shape=(3,)),
...     tf.keras.layers.Softmax()])
>>> model_quantized = cnn2snn.quantize_layer(model,
...                                          target_layer=0,
...                                          bitwidth=4)
>>> assert isinstance(model_quantized.layers[0], cnn2snn.QuantizedDense)
>>> print(model_quantized.layers[0].quantizer.bitwidth)
4
>>> # Quantize a layer of an already quantized model
>>> model_quantized = cnn2snn.quantize_layer(model_quantized,
...                                          target_layer=0, bitwidth=2)
>>> print(model_quantized.layers[0].quantizer.bitwidth)
2
- Parameters:
model (tf.keras.Model) – a standard Keras model
target_layer – a standard or quantized Keras layer to be converted, or the index or name of the target layer.
bitwidth (int) – the desired quantization bitwidth. If zero, no quantization will be applied.
quantizer_function (function) – callable that takes as argument the layer instance to be quantized and the corresponding default quantizer and returns the quantizer to use.
- Returns:
a quantized Keras model
- Return type:
tf.keras.Model
- Raises:
ValueError – In case of invalid target layer
ValueError – If bitwidth is not greater than zero
Quantizers
WeightQuantizer
- class cnn2snn.quantization_ops.WeightQuantizer(*args, **kwargs)[source]
Bases: Layer
The base class for all weight quantizers.
This base class must be subclassed, and the two functions quantize and scale_factor must be overridden.
Quantizers derived from this class must be symmetric uniform mid-tread quantizers, in order to be compatible with the conversion into an Akida model. Quantization is usually done in two steps:
The weights must be first quantized on integer values in the range imposed by the bitwidth, e.g. from -7 to 7 for a 4-bit quantization.
These integer weights are then reconstructed to float discretized values, in the range of the original weights. For example, 4-bit integer weights are reconstructed on a grid from -7*qstep to 7*qstep, where qstep is the quantization step size between two levels of the uniform grid.
For a full explanation about mid-tread uniform quantization, one can take a look at the Wikipedia page.
The quantize function takes as inputs the original weights and must return the reconstructed float values after quantization. The scale_factor function must return the factor used to transform the float reconstructed weights into the integer values obtained after step 1. In other words, given a set of float weights “w”:
quantize(w) * scale_factor(w) is a set of integer weights.
The bitwidth defines the number of quantization levels on which the weights will be quantized. For instance, a 4-bit quantization gives integer values between -7 and 7. More generally, for an n-bit quantization, values range from -kmax to kmax where kmax is (2^(n-1) - 1).
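The contract described above can be sketched in plain Python. This is a minimal illustration, not the cnn2snn implementation: the function names are hypothetical and the scale factor is simply assumed to be given. Weights are mapped to integer levels in [-kmax, kmax], then reconstructed as floats on the uniform grid, so that multiplying the result by the scale factor gives integers again.

```python
def kmax(bitwidth):
    """Largest integer level of a symmetric n-bit quantizer: 2^(n-1) - 1."""
    return 2 ** (bitwidth - 1) - 1


def quantize(weights, sf, bitwidth):
    """Round to the nearest level, clip to [-kmax, kmax], map back to floats."""
    k = kmax(bitwidth)
    return [max(-k, min(k, round(w * sf))) / sf for w in weights]


weights = [-0.70, -0.20, 0.04, 0.33, 0.95]
sf = 10.0  # assumed scale factor, i.e. a quantization step of 0.1

quantized = quantize(weights, sf, bitwidth=4)
# Multiplying the reconstructed floats by the scale factor recovers the levels
levels = [q * sf for q in quantized]
```

Zero is one of the levels (mid-tread), the grid is symmetric around it, and the value 0.95 falls outside the representable range and is clipped to the highest level.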
- Parameters:
bitwidth (int) – the quantization bitwidth.
Attributes:
bitwidth – Returns the bitwidth of the quantizer
Methods:
get_config() – Returns the config of the layer.
quantize(w) – Quantizes the specified weights Tensor.
scale_factor(w) – Evaluates the scale factor for the specified weights tf.Tensor.
- property bitwidth
Returns the bitwidth of the quantizer
- get_config()[source]
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
- Returns:
Python dictionary.
- quantize(w)[source]
Quantizes the specified weights Tensor.
This function must return a tf.Tensor containing float weights discretized on a uniform grid based on the scale factor “sf”. In other words, the discretized weights must be values among: -kmax/sf, …, -2/sf, -1/sf, 0, 1/sf, 2/sf, …, kmax/sf
- Parameters:
w (tensorflow.Tensor) – the weights Tensor to quantize.
- Returns:
a Tensor of quantized weights.
- Return type:
tensorflow.Tensor
- scale_factor(w)[source]
Evaluates the scale factor for the specified weights tf.Tensor.
This function returns the scale factor to get the quantized integer weights from the reconstructed float weights. It is equal to the inverse of the quantization step size.
The scale factor can be a scalar that is applied on the whole tensor of weights. It can also be a vector whose length is the number of filters, where each value applies to the weights of the corresponding output filter. This is called a per-axis quantization, as opposed to a per-tensor quantization. The number of filters is usually the last dimension of the weights tensor. More details are given here.
Note that the quantizer_dw of a depthwise convolution in a QuantizedSeparableConv2D layer must return a scalar scale factor.
- Parameters:
w (tensorflow.Tensor) – the weights Tensor to quantize.
- Returns:
a Tensor containing a list of scalar values (1 or more).
- Return type:
tensorflow.Tensor
LinearWeightQuantizer
- class cnn2snn.quantization_ops.LinearWeightQuantizer(*args, **kwargs)[source]
Bases: WeightQuantizer
An abstract linear weight quantizer
This abstract class provides a linear symmetric and uniform quantization function. The “linear” term here means that there is no non-linear transformation of the weights before the uniform quantization.
The scale_factor function must be overridden.
Methods:
quantize(w) – Linearly quantizes the input weights on a symmetric uniform grid based on the scale factor.
scale_factor(w) – Evaluates the scale factor for the specified weights tf.Tensor.
- quantize(w)[source]
Linearly quantizes the input weights on a symmetric uniform grid based on the scale factor.
The input weights are directly rounded to the closest discretized value, without any transformation on the input weights.
The gradient is estimated using the Straight-Through Estimator (STE), i.e. the gradient is computed as if there were no quantization.
- scale_factor(w)[source]
Evaluates the scale factor for the specified weights tf.Tensor.
This function returns the scale factor to get the quantized integer weights from the reconstructed float weights. It is equal to the inverse of the quantization step size.
The scale factor can be a scalar that is applied on the whole tensor of weights. It can also be a vector whose length is the number of filters, where each value applies to the weights of the corresponding output filter. This is called a per-axis quantization, as opposed to a per-tensor quantization. The number of filters is usually the last dimension of the weights tensor. More details are given here.
Note that the quantizer_dw of a depthwise convolution in a QuantizedSeparableConv2D layer must return a scalar scale factor.
- Parameters:
w (tensorflow.Tensor) – the weights Tensor to quantize.
- Returns:
a Tensor containing a list of scalar values (1 or more).
- Return type:
tensorflow.Tensor
StdWeightQuantizer
- class cnn2snn.StdWeightQuantizer(*args, **kwargs)[source]
Bases: LinearWeightQuantizer
A uniform quantizer based on weights standard deviation.
Quantizes the specified weights into 2^bitwidth-1 values centered on zero. E.g. with bitwidth = 4, 15 quantization levels: from -7 * qstep to 7 * qstep with qstep being the quantization step. The quantization step is defined by:
qstep = threshold * std(W) / max_value
with max_value being 2^(bitwidth-1) - 1. E.g. with bitwidth = 4, max_value = 7.
All values below -threshold * std(W) or above threshold * std(W) are automatically assigned to the min (resp. max) value.
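The step formula and the clipping of outliers can be sketched in plain Python. This is an illustration with hypothetical names, not the cnn2snn implementation; it assumes the population standard deviation (`statistics.pstdev`) and a single per-tensor step.

```python
import statistics


def std_quantize(weights, threshold, bitwidth):
    """Quantize with qstep = threshold * std(W) / max_value and clip outliers."""
    max_value = 2 ** (bitwidth - 1) - 1
    qstep = threshold * statistics.pstdev(weights) / max_value
    # Integer levels, clipped to [-max_value, max_value]
    levels = [max(-max_value, min(max_value, round(w / qstep)))
              for w in weights]
    # Reconstructed float weights on the uniform grid
    quantized = [level * qstep for level in levels]
    return levels, quantized


# The last weight is an outlier compared to the others
weights = [-1.0, -0.5, 0.0, 0.5, 1.0, 4.0]
threshold = 2

levels, quantized = std_quantize(weights, threshold, bitwidth=4)
limit = threshold * statistics.pstdev(weights)
```

The outlier lies beyond threshold * std(W), so it is assigned to the maximum level and reconstructed as `limit`, while the remaining weights keep a finer resolution than they would with a step derived from the absolute maximum.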
- Parameters:
threshold (int) – the standard deviation multiplier used to exclude outliers.
bitwidth (int) – the quantizer bitwidth defining the number of quantized values.
Methods:
get_config() – Returns the config of the layer.
scale_factor(w) – Evaluates the scale factor for the specified weights tf.Tensor.
sigma_(w) – Returns the standard deviation(s) of a set of weights
Attributes:
threshold – Returns the threshold of the std quantizer
- get_config()[source]
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
- Returns:
Python dictionary.
- scale_factor(w)[source]
Evaluates the scale factor for the specified weights tf.Tensor.
This function returns the scale factor to get the quantized integer weights from the reconstructed float weights. It is equal to the inverse of the quantization step size.
The scale factor can be a scalar that is applied on the whole tensor of weights. It can also be a vector whose length is the number of filters, where each value applies to the weights of the corresponding output filter. This is called a per-axis quantization, as opposed to a per-tensor quantization. The number of filters is usually the last dimension of the weights tensor. More details are given here.
Note that the quantizer_dw of a depthwise convolution in a QuantizedSeparableConv2D layer must return a scalar scale factor.
- Parameters:
w (tensorflow.Tensor) – the weights Tensor to quantize.
- Returns:
a Tensor containing a list of scalar values (1 or more).
- Return type:
tensorflow.Tensor
- property threshold
Returns the threshold of the std quantizer
StdPerAxisQuantizer
- class cnn2snn.StdPerAxisQuantizer(*args, **kwargs)[source]
Bases: StdWeightQuantizer
A quantizer that relies on weights standard deviation per axis.
Quantizes the specified weights into 2^bitwidth-1 values centered on zero. E.g. with bitwidth = 4, 15 quantization levels: from -7 * qstep to 7 * qstep with qstep being the quantization step. The quantization step is defined by:
qstep = max_range / max_value
with:
max_range = max(abs(W))
max_value = 2^(bitwidth-1) - 1. E.g. with bitwidth = 4, max_value = 7.
This is an evolution of the StdWeightQuantizer that defines the weights range per axis.
The last dimension is used as axis, meaning that the scaling factor is a vector with as many values as “filters”, or “neurons”.
Note: for a DepthwiseConv2D layer that has a single filter, this quantizer is strictly equivalent to the StdWeightQuantizer.
Methods:
sigma_(w) – Returns the standard deviation(s) of a set of weights
MaxQuantizer
- class cnn2snn.MaxQuantizer(*args, **kwargs)[source]
Bases: LinearWeightQuantizer
A quantizer that relies on maximum range.
Quantizes the specified weights into 2^bitwidth-1 values centered on zero. E.g. with bitwidth = 4, 15 quantization levels: from -7 * qstep to 7 * qstep with qstep being the quantization step. The quantization step is defined by:
qstep = max_range / max_value
with:
max_range = max(abs(W))
max_value = 2^(bitwidth-1) - 1. E.g. with bitwidth = 4, max_value = 7.
- Parameters:
bitwidth (int) – the quantizer bitwidth defining the number of quantized values.
Methods:
max_range_(w) – Get the range on which the weights are quantized.
scale_factor(w) – Evaluates the scale factor for the specified weights tf.Tensor.
- static max_range_(w)[source]
Get the range on which the weights are quantized. This quantizer discretizes weights in the range:
[-max(abs(weights)) ; max(abs(weights))]
- scale_factor(w)[source]
Evaluates the scale factor for the specified weights tf.Tensor.
This function returns the scale factor to get the quantized integer weights from the reconstructed float weights. It is equal to the inverse of the quantization step size.
The scale factor can be a scalar that is applied on the whole tensor of weights. It can also be a vector whose length is the number of filters, where each value applies to the weights of the corresponding output filter. This is called a per-axis quantization, as opposed to a per-tensor quantization. The number of filters is usually the last dimension of the weights tensor. More details are given here.
Note that the quantizer_dw of a depthwise convolution in a QuantizedSeparableConv2D layer must return a scalar scale factor.
- Parameters:
w (tensorflow.Tensor) – the weights Tensor to quantize.
- Returns:
a Tensor containing a list of scalar values (1 or more).
- Return type:
tensorflow.Tensor
MaxPerAxisQuantizer
- class cnn2snn.MaxPerAxisQuantizer(*args, **kwargs)[source]
Bases: MaxQuantizer
A quantizer that relies on maximum range per axis.
Quantizes the specified weights into 2^bitwidth-1 values centered on zero. E.g. with bitwidth = 4, 15 quantization levels: from -7 * qstep to 7 * qstep with qstep being the quantization step. The quantization step is defined by:
qstep = max_range / max_value
with:
max_range = max(abs(W))
max_value = 2^(bitwidth-1) - 1. E.g. with bitwidth = 4, max_value = 7.
This is an evolution of the MaxQuantizer that defines the max_range per axis.
The last dimension is used as axis, meaning that the scaling factor is a vector with as many values as “filters”, or “neurons”.
Note: for a DepthwiseConv2D layer that has a single filter, this quantizer is strictly equivalent to the MaxQuantizer.
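The difference between a per-tensor and a per-axis max range can be sketched in plain Python. This is an illustration with hypothetical names, not the cnn2snn implementation; for readability the weights are stored filter-first (`filters[f]` lists the weights of output filter f), whereas the quantizers described here use the last dimension of the tensor as the filter axis.

```python
def max_scale_factors(filters, bitwidth, per_axis):
    """Return one scale factor per filter, using qstep = max_range / max_value."""
    max_value = 2 ** (bitwidth - 1) - 1
    if per_axis:
        # One max_range per filter
        return [max_value / max(abs(w) for w in f) for f in filters]
    # A single max_range shared by the whole tensor
    max_range = max(abs(w) for f in filters for w in f)
    return [max_value / max_range] * len(filters)


def quantize(filters, bitwidth, per_axis):
    """Round each weight to its grid and reconstruct the float value."""
    sfs = max_scale_factors(filters, bitwidth, per_axis)
    return [[round(w * sf) / sf for w in f] for f, sf in zip(filters, sfs)]


def mse(a, b):
    """Mean squared error between two nested lists of the same shape."""
    diffs = [(x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb)]
    return sum(diffs) / len(diffs)


# The second filter has much smaller weights than the first one
filters = [[0.9, -0.6, 0.3], [0.02, -0.05, 0.04]]

per_tensor_error = mse(filters, quantize(filters, 4, per_axis=False))
per_axis_error = mse(filters, quantize(filters, 4, per_axis=True))
```

With a single range, the small filter is rounded entirely to zero; with one range per filter, each filter uses all of its levels and the reconstruction error is lower.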
Methods:
max_range_(w) – Get the range on which the weights are quantized.
Quantized layers
QuantizedConv2D
- class cnn2snn.QuantizedConv2D(*args, **kwargs)[source]
Bases: Conv2D
A quantization-aware Keras convolutional layer.
Inherits from Keras Conv2D layer, applying a quantization on weights during the forward pass.
- Parameters:
filters (int) – the number of filters.
kernel_size (tuple of integers) – the kernel spatial dimensions.
quantizer (cnn2snn.WeightQuantizer) – the quantizer to apply during the forward pass.
strides (integer, or tuple of integers, optional) – strides of the convolution along spatial dimensions.
padding (str, optional) – one of 'valid' or 'same'.
use_bias (boolean, optional) – whether the layer uses a bias vector.
kernel_initializer (str, or a tf.keras.initializer, optional) – initializer for the weights matrix.
bias_initializer (str, or a tf.keras.initializer, optional) – initializer for the bias vector.
kernel_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the weights.
bias_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the bias.
activity_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the output of the layer.
kernel_constraint (str, or a tf.keras.constraint, optional) – constraint applied to the weights.
bias_constraint (str, or a tf.keras.constraint, optional) – constraint applied to the bias.
Methods:
call(inputs) – Evaluates input Tensor.
get_config() – Returns the config of the layer.
- call(inputs)[source]
Evaluates input Tensor.
This applies the quantization on weights, then evaluates the input Tensor and produces the output Tensor.
- Parameters:
inputs (tensorflow.Tensor) – input Tensor.
- Returns:
output Tensor.
- Return type:
tensorflow.Tensor
- get_config()[source]
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
- Returns:
Python dictionary.
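The advice above about copying the returned dictionary can be illustrated with a small stand-in. `FakeLayer` is a hypothetical class written for this example, not part of cnn2snn or Keras; it deliberately returns its internal dict to show why a caller should copy before modifying.

```python
import copy


class FakeLayer:
    """Hypothetical stand-in whose get_config() returns a shared dict."""

    def __init__(self, **config):
        self._config = dict(config)

    def get_config(self):
        # Returns the internal dict itself, not a fresh copy.
        return self._config


layer = FakeLayer(filters=32, kernel_size=(3, 3), padding="same")

# Copy before editing so the layer's own configuration stays untouched.
config = copy.deepcopy(layer.get_config())
config["filters"] = 64

print(layer.get_config()["filters"])  # 32
print(config["filters"])              # 64
```

A deep copy is used rather than `dict(...)` so that nested values in the config are also independent of the original.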
QuantizedDense
- class cnn2snn.QuantizedDense(*args, **kwargs)[source]
Bases:
Dense
A quantization-aware Keras dense layer.
Inherits from Keras Dense layer, applying a quantization on weights during the forward pass.
- Parameters:
units (int) – the number of neurons.
use_bias (boolean, optional) – whether the layer uses a bias vector.
quantizer (cnn2snn.WeightQuantizer) – the quantizer to apply during the forward pass.
kernel_initializer (str, or a tf.keras.initializer, optional) – initializer for the weights matrix.
bias_initializer (str, or a tf.keras.initializer, optional) – initializer for the bias vector.
kernel_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the weights.
bias_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the bias.
activity_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the output of the layer.
kernel_constraint (str, or a tf.keras.constraint, optional) – constraint applied to the weights.
bias_constraint (str, or a tf.keras.constraint, optional) – constraint applied to the bias.
Methods:
call(inputs) – Evaluates input Tensor.
get_config() – Returns the config of the layer.
- call(inputs)[source]
Evaluates input Tensor.
This applies the quantization on weights, then evaluates the input Tensor and produces the output Tensor.
- Parameters:
inputs (tensorflow.Tensor) – input Tensor.
- Returns:
output Tensor.
- Return type:
tensorflow.Tensor
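The two steps of call described above (quantize the weights, then evaluate the input) can be sketched without TensorFlow. This is a rough pure-Python illustration under stated assumptions: a per-tensor max quantizer with round-to-nearest stands in for the layer's quantizer, the bias is left unquantized, and both function names are hypothetical.

```python
def max_quantize(weights, bitwidth):
    """Per-tensor max quantization (round-to-nearest is an assumption)."""
    max_value = 2 ** (bitwidth - 1) - 1
    max_range = max(abs(w) for row in weights for w in row)
    if max_range == 0:
        return [[0.0 for _ in row] for row in weights]
    qstep = max_range / max_value
    return [[round(w / qstep) * qstep for w in row] for row in weights]


def quantized_dense_forward(inputs, kernel, bias, bitwidth):
    """Quantize the kernel, then compute inputs @ kernel + bias.

    inputs: list of floats; kernel: 2-D list shaped [inputs][units];
    bias: one float per unit (left unquantized in this sketch).
    """
    q_kernel = max_quantize(kernel, bitwidth)
    units = len(bias)
    return [
        sum(x * q_kernel[i][u] for i, x in enumerate(inputs)) + bias[u]
        for u in range(units)
    ]


kernel = [[3.5, -1.2], [0.3, 1.0]]
q_kernel = max_quantize(kernel, bitwidth=4)
outputs = quantized_dense_forward([1.0, 2.0], kernel, bias=[0.0, 0.5], bitwidth=4)
print(q_kernel)  # [[3.5, -1.0], [0.5, 1.0]]
print(outputs)   # [4.5, 1.5]
```

The stored kernel is left in floating point and only the values used in the product are quantized, which matches the description of quantization being applied during the forward pass.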
- get_config()[source]
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
- Returns:
Python dictionary.
QuantizedSeparableConv2D
- class cnn2snn.QuantizedSeparableConv2D(*args, **kwargs)[source]
Bases:
SeparableConv2D
A quantization-aware Keras separable convolutional layer.
Inherits from Keras SeparableConv2D layer, applying a quantization on weights during the forward pass.
Creates a quantization-aware separable convolutional layer.
- Parameters:
filters (int) – the number of filters.
kernel_size (tuple of integer) – the kernel spatial dimensions.
quantizer (cnn2snn.WeightQuantizer) – the quantizer to apply during the forward pass.
quantizer_dw (cnn2snn.WeightQuantizer, optional) – the depthwise quantizer to apply during the forward pass.
strides (integer, or tuple of integers, optional) – strides of the convolution along spatial dimensions.
padding (str, optional) – one of ‘valid’ or ‘same’.
use_bias (boolean, optional) – whether the layer uses a bias vector.
depthwise_initializer (str, or a tf.keras.initializer, optional) – initializer for the depthwise kernel.
pointwise_initializer (str, or a tf.keras.initializer, optional) – initializer for the pointwise kernel.
bias_initializer (str, or a tf.keras.initializer, optional) – initializer for the bias vector.
depthwise_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the depthwise kernel.
pointwise_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the pointwise kernel.
bias_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the bias.
activity_regularizer (str, or a tf.keras.regularizer, optional) – regularization applied to the output of the layer.
depthwise_constraint (str, or a tf.keras.constraint, optional) – constraint applied to the depthwise kernel.
pointwise_constraint (str, or a tf.keras.constraint, optional) – constraint applied to the pointwise kernel.
bias_constraint (str, or a tf.keras.constraint, optional) – constraint applied to the bias.
Methods:
call(inputs) – Evaluates input Tensor.
get_config() – Returns the config of the layer.
- call(inputs)[source]
Evaluates input Tensor.
This applies the quantization on weights, then evaluates the input Tensor and produces the output Tensor.
- Parameters:
inputs (tensorflow.Tensor) – input Tensor.
- Returns:
a Tensor.
- Return type:
tensorflow.Tensor
- get_config()[source]
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
- Returns:
Python dictionary.
QuantizedActivation
- class cnn2snn.QuantizedActivation(*args, **kwargs)[source]
Bases:
Layer
Base class for quantized activation layers.
This base class must be subclassed, and the subclass must override the step property.
The step property must return a TensorFlow object (e.g. tf.Tensor or tf.Variable) holding scalar values, on which the .numpy() method must be callable. These values can be fixed at initialization or can be trainable variables.
The CNN2SNN toolkit only supports linear quantized activations, as defined in the quantized_activation function.
The bitwidth defines the number of quantization levels on which the activation will be quantized. For instance, a 4-bit quantization gives 15 activation levels. More generally, an n-bit quantization gives 2^n-1 levels.
- Parameters:
bitwidth (int) – the quantization bitwidth
Attributes:
bitwidth – Returns the bitwidth of the quantized activation
step – Returns the interval between two quantized activation values
threshold – The quantization threshold is equal to half the quantization step to better approximate the ReLU.
Methods:
call(inputs, *args, **kwargs) – Evaluates the quantized activations for the specified input Tensor.
get_config() – Returns the config of the layer.
quantized_activation(x) – Evaluates the quantized activations for the specified input Tensor.
- property bitwidth
Returns the bitwidth of the quantized activation
- call(inputs, *args, **kwargs)[source]
Evaluates the quantized activations for the specified input Tensor.
- get_config()[source]
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
- Returns:
Python dictionary.
- quantized_activation(x)[source]
Evaluates the quantized activations for the specified input Tensor.
Activations will be clipped to a quantization range, and quantized to a number of values defined by the bitwidth: N = (2^bitwidth - 1) values plus zero.
The quantization is defined by a single step parameter, that defines the interval between two quantized values.
A quantization threshold set to half the quantization step is used to evaluate the quantization intervals, to make sure that each quantized value is exactly in the middle of its quantization interval, thus minimizing the quantization error.
For any potential x, the activation output is as follows, where levels = 2^bitwidth - 1 and n is an integer from 1 to levels:
if x <= threshold, activation is zero
if threshold + (n - 1) * step < x <= threshold + n * step, activation is n * step
if x > threshold + levels * step, activation is levels * step
- Parameters:
x (tensorflow.Tensor) – the input values.
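The piecewise rule above can be written as a scalar function. This is a minimal pure-Python sketch of that rule only; the name `quantized_activation_sketch` and the scalar (rather than Tensor) interface are assumptions for the illustration, and the threshold is taken as half the step as stated above.

```python
import math


def quantized_activation_sketch(x, step, bitwidth):
    """Scalar version of the piecewise quantized activation rule."""
    levels = 2 ** bitwidth - 1   # number of non-zero activation levels
    threshold = step / 2         # half the quantization step
    if x <= threshold:
        return 0.0
    # Smallest integer n such that x <= threshold + n * step.
    n = math.ceil((x - threshold) / step)
    # Values beyond the last interval saturate at levels * step.
    return min(n, levels) * step


# With step = 1.0 and bitwidth = 2: threshold = 0.5, levels = 3.
samples = [-1.0, 0.5, 0.6, 1.5, 1.6, 10.0]
results = [quantized_activation_sketch(x, step=1.0, bitwidth=2) for x in samples]
print(results)  # [0.0, 0.0, 1.0, 1.0, 2.0, 3.0]
```

Because the intervals are closed on the right, an input exactly on a boundary (0.5 or 1.5 here) maps to the lower of the two neighbouring levels.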
- property step
Returns the interval between two quantized activation values
- property threshold
The quantization threshold is equal to half the quantization step to better approximate the ReLU.
ActivationDiscreteRelu
- class cnn2snn.ActivationDiscreteRelu(*args, **kwargs)[source]
Bases:
QuantizedActivation
A discrete ReLU Keras Activation.
For bitwidth 1 or 2:
threshold is 0.5 and step is 1
For bitwidth > 2, with N = 2^bitwidth - 1:
threshold is 3 / N and step is 6 / N
- Parameters:
bitwidth (int) – the activation bitwidth.
Attributes:
step – Returns the interval between two quantized activation values
- property step
Returns the interval between two quantized activation values
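The threshold and step values listed above for ActivationDiscreteRelu can be tabulated with a short helper. The function name `discrete_relu_params` is hypothetical and the code only restates the two cases given in the class description.

```python
def discrete_relu_params(bitwidth):
    """Return (threshold, step) for a given activation bitwidth."""
    if bitwidth <= 2:
        # For bitwidth 1 or 2: threshold is 0.5 and step is 1.
        return 0.5, 1.0
    n = 2 ** bitwidth - 1
    # For bitwidth > 2: threshold is 3 / N and step is 6 / N.
    return 3 / n, 6 / n


for bitwidth in (1, 2, 4, 8):
    threshold, step = discrete_relu_params(bitwidth)
    top_level = (2 ** bitwidth - 1) * step
    print(bitwidth, threshold, step, top_level)
```

In both cases the threshold is half the step, and for bitwidth > 2 the highest level, N * step, works out to 6 regardless of the bitwidth.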
QuantizedReLU
- class cnn2snn.QuantizedReLU(*args, **kwargs)[source]
Bases:
QuantizedActivation
A configurable Quantized ReLU Keras Activation.
In addition to the quantization bitwidth, this class can be initialized with a max_value parameter corresponding to the ReLU maximum value.
- Parameters:
bitwidth (int) – the activation bitwidth.
max_value (float) – the initial max_value
Methods:
get_config() – Returns the config of the layer.
Attributes:
step – Returns the interval between two quantized activation values
- get_config()[source]
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
- Returns:
Python dictionary.
- property step
Returns the interval between two quantized activation values