**Post: #1**

VLSI Architecture of Arithmetic Coder Used in SPIHT

1VLSI Architecture.pdf (Size: 1.21 MB / Downloads: 72)

Abstract

A high-throughput memory-efficient arithmetic

coder architecture for the set partitioning in hierarchical trees

(SPIHT) image compression is proposed based on a simple context

model in this paper. The architecture benefits from various optimizations

performed at different levels of arithmetic coding from

higher algorithm abstraction to lower circuits’ implementations.

First, the complex context model used by software is mitigated

by designing a simple context model, which just uses the brother

nodes’ states in the coding zerotree of SPIHT to form context symbols

for the arithmetic coding. The simple context model results

in a regular access pattern during reading the wavelet transform

coefficients, which is convenient to the hardware implementation,

but at a cost of slight performance loss. Second, in order to avoid

rescanning the wavelet transform coefficients, a breadth first

search SPIHT without lists algorithm is used instead of SPIHT

with lists algorithm. Especially, the coding bit-planes of each zero

tree are processed in parallel. Third, an out-of-order execution

mechanism for different types of context is proposed that can

allocate the context symbol to the idle arithmetic coding core with

a different order that of the input. For the balance of the input rate

of the wavelet coefficients, eight arithmetic coders are replicated in

the compression system. And in one arithmetic coder, there exists

four cores to process different contexts. Fourth, several dedicated

circuits are designed to further improve the throughput of the

architecture.

INTRODUCTION

AS arithmetic coding (AC) [1], [2] method can obtain optimal

performance for its ability to generate codes with

fractional bits, it is widely used by various image compression

algorithms, such as the QM in JPEG [3], the MQ in JPEG2000

[4]–[6], and the context-based adaptive binary arithmetic coder

(CABAC) [7], [8] in H.264. Especially, the set partitioning in

hierarchical trees (SPIHT) [9] uses an AC method to improve

its peak signal-to-noise ratio (PSNR) about 0.5 dB. Although

the theory and program code of AC are mature, the complicated

internal operations of AC limit its application for some real time

fields, such as satellite image and high speed camera image compressions.

In order to achieve performance gains, high speed architecture

of AC in compression scenarios must be designed to

meet the throughput requirement.

Thus both industrial and academic research groups have put

their efforts to AC hardware architectures for various image

compression systems. However, there are two main challenges

in hardware architecture design for high speed applications. One

is data dependencies in AC which require the result of iteration

before next run can commence during the adaptive model update

and internal loops. The other one is that AC requires increasingly

greater precision as more data arrive. In order to deal

with such difficulties, several architectures are proposed in the

past years.

SPIHT Image Compression

SPIHT with lists algorithm uses three different lists to store

significant information of wavelet coefficients for image coding

purpose. Three lists are list of insignificant sets (LIS), list of

insignificant pixels (LIP), and list of significant pixels (LSP).

At first, SPIHT combines nodes of a coefficient tree in wavelet

domain and its successor nodes into one set which is denoted

as insignificant. With traveling each tree node, sets in the LIS

are partitioned into four different subsets which are tested for

significant state.

Challenges With SPIHT Arithmetic Coding Architecture

As far as hardware architecture of arithmetic coder in SPIHT

is concerned, there are three main challenges for designers to

solve during real-time implementation. First, a large amount of

coding symbols is supplied to arithmetic coder which can be

a bottleneck for high speed real-time applications. Because the

scheme of SPIHT is a bit-plane based method, which codes each

bit-plane from the most significant bit-plane (MSB) to the least

significant bit-plane (LSB) sequentially, the quantity of context

symbols for arithmetic coder will be proportional to the coded

planes that are determined by the maximal wavelet coefficient.

For the 9/7 wavelet filter, the precision of wavelet coefficient

will be increased compared to the pixel precision after transformation

stage. Therefore in order to keep speed balance between

the wavelet transformation and the arithmetic coder, the

throughput of arithmetic coder must match the input rate of the

wavelet stage. For the design of arithmetic coder, we test some

typical images with different pixel precision and compression

ratio using the QccPackSPIHT software.

EXPERIMENTAL RESULTS

Software Results

The experimental results come in two folds, i.e., software and

hardware. First, PSNR results for typical images using different

SPIHT methods are recorded, including SPIHT with arithmetic,

SPIHT without arithmetic, SPIHT without lists and arithmetic

and our SPIHT prototype. Table VI lists the detailed data. From

the results, SPIHT-HW is slightly lower than SPIHT-AC as the

precision is limited during the wavelet transform and a simple

context model is involved.

CONCLUSION

Arithmetic coding makes itself a standard technique for its

high efficiency. However, as far as hardware implementation is

concerned, the complexity of calculation limits AC in the filed

of high speed real-time coding. For improvement of throughput

purpose, we propose a high speed architecture of AC used in

SPIHT without lists algorithm. In the architecture, a simple context

scheme is used first to reduce the memory size. Then high

speed calculation units are employed for speedup purpose. Especially,

a power control module can reduce the power dissipation

efficiently. It is a high parallelism and calculation device

that makes the speed of context processing fast. From the simulation

results, our AC architecture can meet many high speed

image compression requirements. And the degradation of performance

incurred by the fixed point calculation is slight.