#### Warning

Please note that CNN2SNN quantization is now deprecated and should no longer be used. It has been replaced by the [QuantizeML](../../user_guide/quantizeml.html) tool. However, some CNN2SNN quantization examples are kept here to preserve continuity of support for Akida 1.0 IP based hardware.

This tutorial gives insights into CNN2SNN for users who want to go deeper into the quantization possibilities of Keras models. We recommend first looking at the [user guide](../../user_guide/cnn2snn.html#legacy-quantization-api) to get started with CNN2SNN.

The CNN2SNN toolkit offers an easy-to-use set of functions to get a quantized model from a native Keras model and to convert it to an Akida model compatible with the Akida NSoC. The [quantize](../../api_reference/cnn2snn_apis.html#cnn2snn.quantize) and [quantize_layer](../../api_reference/cnn2snn_apis.html#cnn2snn.quantize_layer) high-level functions replace native Keras layers with custom CNN2SNN quantized layers derived from their Keras equivalents. However, these functions do not let you choose how the weights and activations are quantized. This tutorial presents an alternative low-level method to define models with customizable quantization of weights and activations.
## 1. Design a CNN2SNN quantized model

Unlike the standard CNN2SNN flow, where a native Keras model is quantized using the ``quantize`` and ``quantize_layer`` functions, a customizable quantized model must be built directly from quantized layers.

The CNN2SNN toolkit supplies custom quantized layers to replace native Keras neural layers (Conv2D, SeparableConv2D and Dense) and activations (ReLU).

### Quantized neural layers

The CNN2SNN quantized neural layers are:

* **QuantizedConv2D**, derived from ``keras.Conv2D``
* **QuantizedSeparableConv2D**, derived from ``keras.SeparableConv2D``
* **QuantizedDense**, derived from ``keras.Dense``

They are similar to their Keras counterparts, but take an additional argument: ``quantizer``. This parameter expects a *WeightQuantizer* object that defines how the weights are discretized for a given bitwidth. Several quantizers are provided in the CNN2SNN API:

* **StdWeightQuantizer** and **StdPerAxisQuantizer**: these two quantizers use the standard deviation of the weights to compute the range on which weights are discretized. The *StdWeightQuantizer* uses a range equal to a fixed number of standard deviations to discretize all weights within a layer, whereas the *StdPerAxisQuantizer* discretizes each feature kernel independently.
* **MaxQuantizer** and **MaxPerAxisQuantizer**: these discretize on a range based on the maximum absolute value of the weights. The *MaxQuantizer* discretizes all weights within a layer based on their global maximum, whereas the *MaxPerAxisQuantizer* discretizes each feature kernel (in practice, the last dimension of the weights tensor) independently, based on its local maximum.

If those quantizers do not fit your specific needs, you can create your own (see the weight quantizer section below).

> **Note:** The ``QuantizedSeparableConv2D`` layer can accept two quantizers: a ``quantizer`` for the pointwise convolution and a ``quantizer_dw`` for the depthwise convolution. If the latter is not defined, it defaults to the same value as ``quantizer``.
>
> For Akida compatibility, the depthwise quantizer must be a per-tensor quantizer (i.e. all weights within the depthwise kernel are quantized together) and not a per-axis quantizer (i.e. each feature kernel is quantized independently). See more details [here](https://www.tensorflow.org/lite/performance/quantization_spec#per-axis_vs_per-tensor).

### Quantized activation layers

Similarly, a quantized activation layer returns values that are discretized on a uniform grid. Two quantized activation layers are provided to replace the native ReLU layers:

* **ActivationDiscreteRelu**: a linear quantizer for ReLU, clipped at value 6.
* **QuantizedReLU**: a configurable activation layer where the max clipping value is a parameter.

It is also possible to define a custom quantized activation layer; details are given in the activation section below.

> **Note:** The ``quantize`` function is a high-level helper that automatically replaces the neural layers with their quantized counterparts, using [MaxPerAxisQuantizer](../../api_reference/cnn2snn_apis.html#maxperaxisquantizer). The ReLU layers are substituted with [QuantizedReLU](../../api_reference/cnn2snn_apis.html#cnn2snn.QuantizedReLU) layers.

### Create a quantized model

Here, we illustrate how to create a quantized model equivalent to a native Keras model, using the weight quantizers and quantized activation layers available in the CNN2SNN package. Although we present only one weight quantizer and one quantized activation, a quantized model can mix any quantizers and activations. For instance, every neural layer can have a different weight quantizer with different parameters.
```python
from tensorflow.keras import Sequential, Input, layers

# Create a native Keras toy model
model_keras = Sequential([

    # Input layer
    Input(shape=(28, 28, 1)),

    # Conv + MaxPool + BatchNorm + ReLU
    layers.Conv2D(8, 3),
    layers.MaxPool2D(),
    layers.BatchNormalization(),
    layers.ReLU(),

    # Flatten + Dense + Softmax
    layers.Flatten(),
    layers.Dense(10),
    layers.Softmax()
])

model_keras.summary()
```
```python
from cnn2snn import quantization_layers as qlayers
from cnn2snn import quantization_ops as qops

# Prepare weight quantizers
q1 = qops.MaxQuantizer(bitwidth=8)
q2 = qops.MaxQuantizer(bitwidth=4)

# Get layer names to set them in the quantized model
names = [layer.name for layer in model_keras.layers]

# Create a quantized model, equivalent to the native Keras model
model_quantized = Sequential([

    # Input layer
    Input(shape=(28, 28, 1)),

    # Conv + MaxPool + BatchNorm + ReLU
    qlayers.QuantizedConv2D(8, 3, quantizer=q1, name=names[0]),
    layers.MaxPool2D(name=names[1]),
    layers.BatchNormalization(name=names[2]),
    qlayers.QuantizedReLU(bitwidth=4, name=names[3]),

    # Flatten + Dense + Softmax
    layers.Flatten(name=names[4]),
    qlayers.QuantizedDense(10, quantizer=q2, name=names[5]),
    layers.Softmax(name=names[6]),
])

model_quantized.summary()
```
## 2. Weight Quantizer Details

### How a weight quantizer works

The purpose of a weight quantizer is to compute a tensor of discretized weights. It can be split into two operations:

- an optional transformation applied to the weights, e.g. a shift or a non-linear transformation;
- the quantization of the weights.

For Akida compatibility, the weights must be discretized on a symmetric grid defined by two parameters:

- The **bitwidth** defines the number of unique values the weights can take. We define *kmax = 2^(bitwidth-1) - 1* as the maximum integer value of the symmetric quantization scheme. For instance, a 4-bit quantizer must return weights on a grid of 15 values, between -7 and 7; here, *kmax = 7*.
- The symmetric range on which the weights are discretized, say between *-lim* and *lim*. Instead of working with the range directly, we use the **scale factor**, defined by *sf = kmax / lim*. For instance, with a 4-bit quantizer, the discretized weights will lie on the grid [*-7/sf, -6/sf, ..., -1/sf, 0, 1/sf, ..., 6/sf, 7/sf*]. The maximum discrete value *7/sf* is equal to *lim*, the limit of the range (see figure below).
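The relationship between the bitwidth, *kmax*, the range limit and the scale factor can be checked numerically. Here is a small sketch of the 4-bit case described above; the value chosen for *lim* is an arbitrary assumption standing in for, e.g., the maximum absolute weight of a layer.

```python
import numpy as np

bitwidth = 4
kmax = 2 ** (bitwidth - 1) - 1  # 7, the maximum integer of the scheme
lim = 0.35                      # assumed symmetric range limit, e.g. max |w|
sf = kmax / lim                 # scale factor

# The 15-value symmetric grid [-7/sf, ..., -1/sf, 0, 1/sf, ..., 7/sf]
grid = np.arange(-kmax, kmax + 1) / sf

# A float weight is snapped to the nearest grid point:
w = 0.1
w_q = np.round(w * sf) / sf
```

The largest grid value, *kmax/sf*, recovers *lim* exactly, confirming that the scale factor is just another way of expressing the discretization range.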