In a new paper, University of Oxford researchers introduce a novel image compression approach that outperforms the JPEG standard at low bitrates, even without entropy coding or learning a distribution over weights.
Recently developed autoencoder methods for lossy image compression have attracted attention in the machine learning and image processing communities. Such autoencoders operate on a simple principle: an image, typically modelled as a vector of pixel intensities, is mapped to a compact latent code, which is quantized (and usually entropy-coded), reducing the amount of information required to store or transmit it.
Instead of storing the RGB values for each pixel of an image, the proposed approach stores the weights of a neural network overfitted to the image. The researchers call their method ‘COIN’ (COmpression with Implicit Neural representations).
COIN encodes an image by overfitting it with a small multilayer perceptron (MLP) — a type of feedforward artificial neural network. The approach maps pixel locations to RGB values (often referred to as an implicit neural representation), then transmits the weights of this MLP. At decoding time, the transmitted MLP is evaluated at all pixel locations to reconstruct the image.
The most difficult part of the entire pipeline is overfitting the MLPs, because natural images contain high-frequency information that standard MLPs struggle to represent. Recent approaches have employed sinusoidal encodings and activations to mitigate this problem, and the new study finds that MLPs with sine activations can fit large images (393k pixels) with surprisingly small networks (8k parameters).
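The core idea can be sketched in a few lines of NumPy. The snippet below is a minimal illustration, not the paper's implementation: it builds a small MLP with sine activations (in the style of SIREN) that maps normalized (x, y) pixel coordinates to RGB values, then "decodes" by evaluating the network at every pixel location of a 768×512 image. The layer widths, initialization bounds, and the `w0` frequency factor are illustrative assumptions; in COIN the weights would be obtained by overfitting to the target image.

```python
import numpy as np

def siren_layer(x, w, b, w0=30.0):
    # Sine activation: sin(w0 * (xW + b)); w0 is an assumed frequency factor.
    return np.sin(w0 * (x @ w + b))

def init_mlp(widths, rng):
    # widths, e.g. [2, 28, 28, 3]: (x, y) coordinates in, RGB out.
    params = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        bound = np.sqrt(6.0 / d_in) / 30.0  # SIREN-style init bound (illustrative)
        params.append((rng.uniform(-bound, bound, (d_in, d_out)),
                       np.zeros(d_out)))
    return params

def mlp_forward(coords, params):
    h = coords
    for w, b in params[:-1]:
        h = siren_layer(h, w, b)
    w, b = params[-1]
    return h @ w + b  # final layer is linear -> RGB

# "Decoding": evaluate the MLP at every pixel location of a 768x512 image.
H, W = 512, 768
ys, xs = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W), indexing="ij")
coords = np.stack([xs.ravel(), ys.ravel()], axis=1)  # shape (H*W, 2)

rng = np.random.default_rng(0)
params = init_mlp([2, 28, 28, 3], rng)
rgb = mlp_forward(coords, params).reshape(H, W, 3)
```

Encoding would then amount to minimizing the mean squared error between `rgb` and the target image with gradient descent, which is omitted here for brevity.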
To reduce model size, the researchers apply architecture search and weight quantization: they perform a hyperparameter sweep over the width and number of MLP layers, and quantize the weights from 32-bit to 16-bit precision, which is enough to beat the JPEG standard at low bitrates.
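The arithmetic behind the bitrate is straightforward: the compressed size is just the number of MLP weights times the bits per weight, spread over the image's pixels. The sketch below (illustrative, not from the paper's code) shows the bits-per-pixel calculation for the figures quoted above, and that the quantization step is simply a cast from float32 to float16.

```python
import numpy as np

def bits_per_pixel(n_params, n_pixels, bits_per_weight=16):
    # Storage cost of the MLP weights, amortized over the image's pixels.
    return n_params * bits_per_weight / n_pixels

# A 768x512 Kodak image has ~393k pixels; an ~8k-parameter MLP stored
# at 16-bit precision then costs roughly 0.33 bpp.
n_pixels = 768 * 512   # 393,216 pixels
n_params = 8_000       # illustrative parameter count from the article
bpp = bits_per_pixel(n_params, n_pixels)  # ~0.326

# Quantization from 32-bit to 16-bit precision is a dtype cast:
weights32 = np.random.default_rng(0).standard_normal(100).astype(np.float32)
weights16 = weights32.astype(np.float16)
```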
The other highlight of COIN is the extra flexibility of its decoding approach. Because the image is a function of pixel coordinates, it can be decoded progressively, simply by evaluating the MLP at whichever pixel locations are needed. Such partial decoding is difficult for previous autoencoder-based methods.
To test COIN’s performance, the researchers performed experiments on the Kodak image dataset comprising 24 images of size 768×512. The model was compared with three autoencoder-based neural compression models (BMS, MBT and CST). They also compared COIN against the JPEG, JPEG2000, BPG and VTM image codecs.
The researchers first identified valid combinations of depth and width for the MLPs representing an image, to determine the best architecture for a given parameter budget, measured in bits per pixel (bpp) — e.g. 0.3 bpp using 16-bit weights. The results show that at low bitrates the proposed model improves on the JPEG standard even without the use of entropy coding.
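Such a sweep can be sketched as follows. This is an assumed reconstruction, not the authors' search code: it counts the parameters of an MLP with a given number of hidden layers and width (2 coordinate inputs, 3 RGB outputs, weights plus biases), then enumerates the (depth, width) pairs whose 16-bit storage fits a 0.3 bpp budget on a 768×512 image. The candidate ranges are illustrative.

```python
def mlp_param_count(depth, width, d_in=2, d_out=3):
    # depth = number of hidden layers; count weights and biases per layer.
    n = d_in * width + width                    # input -> first hidden layer
    n += (depth - 1) * (width * width + width)  # hidden -> hidden layers
    n += width * d_out + d_out                  # last hidden -> RGB output
    return n

budget_bpp, bits, n_pixels = 0.3, 16, 768 * 512
max_params = budget_bpp * n_pixels / bits  # ~7,372 parameters at 0.3 bpp

# Enumerate candidate architectures that fit the budget (ranges illustrative).
candidates = [(d, w) for d in range(2, 11) for w in range(10, 60, 2)
              if mlp_param_count(d, w) <= max_params]
```

Each candidate would then be trained on the image, and the (depth, width) pair with the best reconstruction quality kept for that bpp value.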
The experiments demonstrate that COIN outperforms the JPEG standard after 15k iterations and then continues to improve steadily; and that compression quality depends on the architecture choice, with different optimal architectures for different bpp values. The Oxford team says it hopes further work in this area will lead to a novel class of methods for neural data compression.
The paper COIN: COmpression with Implicit Neural representations is on arXiv.
Author: Hecate He | Editor: Michael Sarazen
We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.