1 Scope
This document specifies Neural Network Coding (NNC) as a compressed representation
of the parameters/weights of a trained neural network and a decoding process for the
compressed representation, complementing the description of the network topology in
existing (exchange) formats for neural networks. It establishes a toolbox of compression
methods, specifying (where applicable) the resulting elements of the compressed bitstream.
Most of these tools can be applied to the compression of entire neural networks, and
some of them can also be applied to the compression of differential updates of neural
networks with respect to a base network. Such differential updates are for example
useful when models are redistributed after fine-tuning or transfer learning, or when
providing versions of a neural network with different compression ratios.
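The differential-update use case described above can be illustrated with a minimal sketch: a receiver that already holds the base network only needs the element-wise difference of the parameters, which is typically much sparser and therefore more compressible than the full fine-tuned model. The values below are invented for illustration; NNC itself specifies only the compressed representation, not this computation.

```python
import numpy as np

# Hypothetical base-model weights and a fine-tuned version of the same layer.
base = np.array([0.50, -1.20, 0.75, 0.00, 2.10], dtype=np.float32)
finetuned = np.array([0.50, -1.18, 0.75, 0.05, 2.10], dtype=np.float32)

# The differential update is the element-wise difference. Most entries are
# exactly zero, so the delta compresses far better than retransmitting the
# full fine-tuned parameter tensor.
delta = finetuned - base

# A receiver holding the base network reconstructs the update by addition.
reconstructed = base + delta
assert np.allclose(reconstructed, finetuned)
print(np.count_nonzero(delta))  # → 2 changed parameters out of 5
```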
This document does not specify a complete protocol for the transmission of neural
networks, but focuses on compression of network parameters. Only the syntax format,
semantics, associated decoding process requirements, parameter sparsification, parameter
transformation methods, parameter quantization, entropy coding method and integration/signalling
within existing exchange formats are specified, while other matters such as pre-processing,
system signalling and multiplexing, data loss recovery and post-processing are considered
to be outside the scope of this document. Additionally, the internal processing steps
performed within a decoder are considered to be outside the scope of this document;
only the externally observable output behaviour is required to conform to the specifications
of this document.
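The toolbox stages named above (sparsification, quantization, entropy coding) can be sketched as a simple encode/decode round trip. This is an illustrative approximation only: the threshold and step size are arbitrary, and `zlib` stands in for the standard's actual entropy coder, which this sketch does not implement.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=4096).astype(np.float32)

# 1. Sparsification: zero out small-magnitude parameters (illustrative threshold).
sparse = np.where(np.abs(weights) < 0.05, 0.0, weights).astype(np.float32)

# 2. Quantization: uniform scalar quantization to integer levels.
step = 0.02
q = np.round(sparse / step).astype(np.int8)

# 3. Entropy coding: zlib serves here as a generic stand-in coder.
bitstream = zlib.compress(q.tobytes())

# Decoding inverts entropy coding and quantization; sparsification is lossy
# and is not inverted.
dq = np.frombuffer(zlib.decompress(bitstream), dtype=np.int8).astype(np.float32) * step

# The compressed stream is much smaller than the raw float32 parameters,
# and reconstruction error is bounded by half the quantization step.
print(len(bitstream), "<", weights.nbytes)
```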