Scalable image coding based on epitomes

contact: M. Alain, C. Guillemot, D. Thoreau, P. Guillotel

Context and Goal

The concept of epitome was first introduced by Jojic et al. as a condensed representation of an image (its size is only a fraction of the original size) containing the essence of its textural properties. This original epitomic model is based on a patch-based probabilistic approach, and has found applications in segmentation, denoising, recognition, indexing, and texture synthesis.
Several epitomic models have since been proposed, such as the factorized representation of Wang et al. dedicated to texture mapping, or its extension designed for image coding purposes by Chérigui et al. In the latter case, the epitome is the union of epitome charts, which are pieces of repeatable textures found in the image. The search for self-similar or repeatable texture patterns, based on the KLT or a block matching (BM) algorithm, is known to be memory- and time-consuming.
In this work, we propose a clustering-based technique to reduce the complexity of this self-similarity search.
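To illustrate the idea, the sketch below (a simplification, not the exact algorithm used in this work) assumes the image blocks are first grouped with k-means; the similarity search for each block is then restricted to blocks of the same cluster instead of the whole image. The block size, number of clusters, and RMSE threshold are illustrative parameters.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

def extract_blocks(img, B):
    """Collect non-overlapping BxB blocks and their top-left coordinates."""
    H, W = img.shape
    coords, blocks = [], []
    for y in range(0, H - B + 1, B):
        for x in range(0, W - B + 1, B):
            coords.append((y, x))
            blocks.append(img[y:y + B, x:x + B].ravel().astype(np.float64))
    return np.array(coords), np.array(blocks)

def clustered_match_lists(img, B=8, n_clusters=32, tol=3.0):
    """For each block, list the similar blocks found inside its own cluster only,
    instead of running an exhaustive full-image block-matching search."""
    coords, blocks = extract_blocks(img, B)
    labels = KMeans(n_clusters=n_clusters, n_init=4, random_state=0).fit_predict(blocks)
    matches = [[] for _ in range(len(blocks))]
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        # pairwise RMSE restricted to the cluster: the cost is quadratic in the
        # cluster size rather than in the total number of blocks
        rmse = cdist(blocks[idx], blocks[idx]) / np.sqrt(blocks.shape[1])
        for i, gi in enumerate(idx):
            matches[gi] = idx[(rmse[i] <= tol) & (idx != gi)].tolist()
    return coords, matches
```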

Approach

The main steps of the proposed scheme for scalable image coding are depicted in Fig. 1.

In the proposed scheme, the enhancement layer (EL) consists of an epitome of the input image. Consequently, at the decoder side, the EL patches not contained in the epitome are missing, but the corresponding base layer (BL) patches are known. We thus propose to restore the full enhancement layer by taking advantage of the known representative texture patches available in the EL epitome charts. (More details on the epitome generation can be found in the papers listed at the end of this page.)

The epitomes are encoded with a scalable scheme as an enhancement layer. The blocks not belonging to the epitome are directly copied from the decoded base layer, so their rate cost is practically negligible.
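As a rough illustration, the sketch below (our assumption of how this copy step can be emulated outside the codec, not the SHVC inter-layer mechanism itself) assembles an initial enhancement layer: blocks flagged as epitome come from the decoded EL epitome, all other blocks are filled from an upsampled version of the decoded base layer.

```python
import numpy as np
from scipy.ndimage import zoom

def assemble_enhancement_layer(el_epitome, epitome_mask, bl_decoded, B=8):
    """Build an initial EL: epitome blocks from the decoded EL epitome,
    the remaining blocks from the upsampled decoded base layer."""
    # stand-in for SHVC inter-layer prediction: cubic 2x upsampling of the BL
    H, W = epitome_mask.shape
    bl_up = zoom(bl_decoded.astype(np.float64), 2.0, order=3)[:H, :W]
    el = np.empty((H, W), dtype=np.float64)
    for y in range(0, H, B):
        for x in range(0, W, B):
            if epitome_mask[y:y + B, x:x + B].any():  # block belongs to an epitome chart
                el[y:y + B, x:x + B] = el_epitome[y:y + B, x:x + B]
            else:                                     # block copied from the base layer
                el[y:y + B, x:x + B] = bl_up[y:y + B, x:x + B]
    return el
```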

The non-epitome part of the enhancement layer is restored using methods derived from local learning-based super-resolution, which can be summarized in three steps: K-NN search, learning step, and processing step. These steps are shown in Fig. 2. A first method, denoted E-LLE, relies on Locally Linear Embedding. A second technique, called E-LLM and based on Local Linear Mapping, is also studied.
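The sketch below shows the neighbor-embedding principle behind the E-LLE variant, under simplifying assumptions (patches stored as row vectors, no overlapping or averaging): for each BL patch whose EL version is missing, its K nearest neighbors are searched among the BL patches co-located with the epitome, LLE weights are learned in the BL domain, and the same weights are applied to the corresponding EL epitome patches. The function and variable names are illustrative, not those of the actual implementation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lle_weights(target, neighbors, reg=1e-3):
    """Weights w (summing to 1) that best reconstruct `target` from the rows
    of `neighbors` (classic LLE local reconstruction with regularization)."""
    D = neighbors - target                                # K x d differences
    G = D @ D.T                                           # local Gram matrix
    G += reg * np.trace(G) * np.eye(len(G)) / len(G)      # conditioning
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()

def restore_el_patches(missing_bl, dict_bl, dict_el, K=8):
    """E-LLE-like restoration: K-NN search in the BL domain, LLE weight
    learning, then weight transfer to the co-located EL epitome patches."""
    nn = NearestNeighbors(n_neighbors=K).fit(dict_bl)
    _, idx = nn.kneighbors(missing_bl)
    restored = np.empty((len(missing_bl), dict_el.shape[1]))
    for i, (patch, nb) in enumerate(zip(missing_bl, idx)):
        w = lle_weights(patch, dict_bl[nb])
        restored[i] = w @ dict_el[nb]   # apply BL-domain weights to EL patches
    return restored
```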

Fig. 1 - Proposed scheme for scalable image coding.

Fig. 2 - Main steps of the epitome-based restoration.

Experimental Results

The experiments are performed on the test images listed in Table 1, obtained from the HEVC test sequences. The base layer images are obtained by downsampling the input images by a factor of 2 in each dimension, using the SHVC downsampling filter available with the SHM software (ver. 9.0). The BL images are encoded with HEVC, using the HM software (ver. 15.0). We then use the SHM software (ver. 9.0) to encode the corresponding enhancement layers. Both layers are encoded with the quantization parameters QP = 22, 27, 32, 37.

For each input image, 3 to 4 epitomes of different sizes are generated, ranging from roughly 30% to 90% of the input image size.


Table 1 - Test images
Class   Image             Size
B       BasketballDrive   1920x1080
B       Cactus            1920x1080
B       Ducks             1920x1080
B       Kimono            1920x1080
B       ParkScene         1920x1080
B       Tennis            1920x1080
B       Terrace           1920x1080
C       BasketballDrill   832x480
C       Keiba             832x480
C       Mall              832x480
C       PartyScene        832x480
D       BasketballPass    416x240
D       BlowingBubbles    416x240
D       RaceHorses        416x240
D       Square            416x240
E       City              1280x720

We show in Fig. 3 the Bjontegaard rate gains averaged over all sequences as a function of the epitome size. The complete results are given in Table 2. We show in Fig. 4 the RD curves of the City image, whose behavior is representative of the set of test images. We first show (left) the RD curves for both the E-LLE and E-LLM methods with the largest epitome size (best RD performance). We then show (right) the RD curves for E-LLE with different epitome sizes.
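For reference, the Bjontegaard rate gains reported here follow the usual BD-rate computation from four (rate, PSNR) points per codec. A compact sketch is given below; the example numbers in the comment are made up for illustration only.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta-rate (%) of the test codec vs. the anchor,
    from four (rate, PSNR) points per RD curve."""
    lr_a, lr_t = np.log(rate_anchor), np.log(rate_test)
    # cubic fit of log-rate as a function of PSNR for each curve
    pa = np.polyfit(psnr_anchor, lr_a, 3)
    pt = np.polyfit(psnr_test, lr_t, 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    # integrate both fits over the common PSNR interval
    ia = np.polyval(np.polyint(pa), hi) - np.polyval(np.polyint(pa), lo)
    it = np.polyval(np.polyint(pt), hi) - np.polyval(np.polyint(pt), lo)
    avg_diff = (it - ia) / (hi - lo)
    return (np.exp(avg_diff) - 1) * 100   # negative value = average rate saving

# Illustrative call with made-up numbers (four QPs per curve):
# bd_rate([1000, 1800, 3000, 5000], [32.0, 34.5, 36.8, 38.9],
#         [ 900, 1600, 2700, 4600], [32.1, 34.6, 36.9, 39.0])
```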

Fig. 3 - Average RD performances of the different restoration methods against SHVC depending on the epitome size.


(Left) E-LLE and E-LLM methods, epitome size = 91.59% of the input image. (Right) E-LLE method with different epitome sizes.

Fig. 4 - RD performances of the City image.

We show in Fig. 5 the running time of the different methods for each image class (i.e. size) depending on the epitome size. On the left, we show the running time of the epitome generation at the encoder side. On the right, we show the running time of the restoration step at the decoder side. Note that the epitome generation algorithm was implemented in C++ while the restoration methods were implemented in Matlab.

(Left) Epitome generation running time depending on the epitome size, for different image classes. (Right) Restoration (post-processing) running time of the different methods depending on the epitome size, for different image classes.

Fig. 5 - Running time of the different methods.

The table below lists the complete Bjontegaard rate gains obtained with the proposed methods against SHVC.

Table 2 - Bjontegaard rate gains against SHVC depending on the epitome size.
Image             Epitome size (% of input image)   E-LLE (%)   E-LLM (%)
BasketballDrive   90.62                              -20.07      -19.61
                  64.10                              -15.94      -13.70
                  49.33                              -13.66       -8.89
                  32.34                               -9.70       -0.08
Cactus            79.85                              -18.19      -16.46
                  71.24                              -17.67      -15.14
                  60.66                              -16.33      -13.01
                  48.33                              -11.63       -7.42
Ducks             89.63                              -19.52      -19.07
                  77.41                              -16.71      -14.21
                  48.28                                2.88       10.28
Kimono            90.13                              -21.75      -21.42
                  75.53                              -15.98      -12.63
                  59.36                              -17.37      -15.08
                  35.34                              -15.82      -12.31
ParkScene         86.58                              -16.88      -16.45
                  73.55                              -15.10      -13.84
                  61.99                              -10.69       -7.50
                  47.18                               -3.94        2.89
Tennis            64.49                              -23.04      -21.91
                  50.44                              -22.03      -19.61
                  43.12                              -19.90      -16.59
                  32.22                              -18.42      -13.13
Terrace           78.46                              -13.27      -12.49
                  66.39                              -11.32       -9.57
                  53.31                               -6.81       -3.01
                  43.50                               -0.50        9.03
City              91.59                              -10.05       -8.76
                  82.44                               -6.24       -1.59
                  66.81                                3.27       17.75
                  39.52                               28.00       59.96
BasketballDrill   87.05                               -6.52       -5.50
                  59.94                               -2.82        1.08
                  42.63                               -1.62        4.44
                  28.53                                3.18       13.08
Keiba             93.59                               -6.71       -6.42
                  81.28                               -3.69       -1.99
                  63.53                                3.24        7.75
                  40.77                               16.06       23.52
Mall              92.95                              -18.20      -16.76
                  76.28                               -0.50       -2.13
                  66.15                               -4.13        3.04
                  50.26                                6.75       27.54
PartyScene        94.82                               -5.44       -4.29
                  81.12                               -1.18        7.20
                  67.56                                8.83       25.96
                  49.13                               26.89       57.15
BasketballPass    77.76                              -16.17      -13.15
                  66.60                              -14.07       -7.25
                  56.41                               -0.53       10.48
                  42.31                                5.72       28.21
BlowingBubbles    87.56                               -6.33       -3.29
                  73.33                               -2.73        3.65
                  58.85                                3.27       13.95
                  36.92                               16.81       44.03
RaceHorses        91.03                              -16.08      -15.67
                  79.23                               -4.49       -3.03
                  58.14                                6.69       20.64
                  36.67                               23.45       63.23
Square            80.77                               -6.45       -2.97
                  71.41                               -5.88        1.08
                  61.09                               -2.00        4.75
                  48.72                                9.80       31.68

References