www.ijlemr.com || Volume 02 - Issue 06 || June 2017 || PP. 34-36

# A Review on Parallel LDPC Decoder Architecture

B. Swapna (ECE, GNITS / JNTUH, INDIA)

**Abstract:** Low-Density Parity-Check (LDPC) codes offer excellent error-correcting performance and are being widely considered for next-generation industry standards. The main challenge in implementing a parallel decoder architecture for LDPC codes is the interconnection of the functional units at the top level. For applications that require high throughput and low power dissipation and can tolerate a fixed code format and large area, the parallel architecture is very suitable. The reviewed parallel decoder architectures support high throughput with improved bit-error performance and provide girth optimization with simple interconnection.

**Keywords:** BER performance, Error Correcting Code (ECC), Girth Optimization, Low-Density Parity-Check (LDPC) Code, Parallel Architectures, Throughput.

### I. Introduction

LDPC codes were first invented by Robert Gallager in 1960. They were long ignored because of their high computational complexity, the introduction of Reed-Solomon (RS) codes, and the view that concatenated RS and convolutional codes were perfectly suitable for error control coding. They were rediscovered by Richardson and Urbanke in 1998 and by MacKay in 1999, who showed that their performance is close to the Shannon limit. Turbo codes and LDPC codes are the two most popular error-correcting codes (ECC) with near-Shannon-limit performance. It has been shown that LDPC codes are asymptotically superior to turbo codes with respect to coding gain. A satisfactory LDPC decoder usually offers good error-correction performance, low hardware complexity and high throughput.

**Throughput** is the average rate of successful message delivery over a communication channel. **Girth** is an important property of LDPC codes: it is the length of the shortest cycle in the bipartite (Tanner) graph corresponding to a code. Larger girths are generally preferred.
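The girth definition can be made concrete with a short Python sketch. It is only an illustration: the function name `tanner_girth` and the two small example matrices are assumptions introduced here and are not taken from any of the reviewed papers. The function builds the bipartite graph of a parity-check matrix and finds its shortest cycle by breadth-first search from every node.

```python
from collections import deque

def tanner_girth(H):
    """Return the length of the shortest cycle in the Tanner graph of H,
    or None if the graph contains no cycle."""
    m, n = len(H), len(H[0])
    # Nodes 0..n-1 are bit nodes, nodes n..n+m-1 are check nodes.
    adj = [[] for _ in range(n + m)]
    for r in range(m):
        for c in range(n):
            if H[r][c]:
                adj[c].append(n + r)
                adj[n + r].append(c)
    best = None
    for root in range(n + m):
        dist = {root: 0}
        parent = {root: -1}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # A non-tree edge closes a cycle through the root's BFS tree.
                    cycle = dist[u] + dist[v] + 1
                    if best is None or cycle < best:
                        best = cycle
    return best

# Two rows sharing two columns form a 4-cycle.
H_short = [[1, 1, 0],
           [1, 1, 1]]
# Rows that pairwise share exactly one column form a single 6-cycle.
H_long = [[1, 1, 0],
          [0, 1, 1],
          [1, 0, 1]]
print(tanner_girth(H_short), tanner_girth(H_long))
```

The exhaustive search is adequate for small illustrative matrices; practical code design tools would use more efficient cycle-counting methods.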

**Bit Error Rate** (BER) is the number of received bits that are in error divided by the total number of bits received in a transmission.
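As a worked illustration, with $N_{\text{err}}$ the number of erroneous bits and $N_{\text{bits}}$ the total number of received bits (both symbols and the numbers below are hypothetical and not taken from the reviewed papers):

$$
\mathrm{BER} = \frac{N_{\text{err}}}{N_{\text{bits}}}, \qquad \text{e.g.}\quad \frac{3}{3 \times 10^{6}} = 10^{-6}.
$$

A BER of $10^{-6}$ therefore corresponds to one bit error per million received bits on average.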

LDPC codes are used in wireless, wired and optical communications, home networking, and digital subscriber lines. They are also used in multimedia, mobile communications, reliable communication over noisy channels, and magnetic recording channels. They are further used where codes must work with multi-level modulation (e.g. QAM or M-PSK).

### **Applications of Parallel LDPC Decoder:**

The Parallel Decoder Architecture is used to increase the throughput, to optimize the girth and to improve the Bit-Error Rate (BER) performance.

### II. Throughput

Tong Zhang and Keshab K. Parhi (2002) proposed a parallel decoder architecture, applying a joint design methodology to improve the decoding throughput. The decoder contains $k^2$ Variable Node processor Units (VNUs) and $3k$ Check Node processor Units (CNUs). To simplify the control logic design and reduce the memory bandwidth requirement, the decoder completes each iteration in $2L$ clock cycles, in which the CNUs and VNUs work in the first and second $L$ clock cycles, respectively [6]. The proposed methodology achieves a throughput of 54 Mbps with a maximum of 18 decoding iterations.
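The relation between the $2L$-cycle schedule, the iteration count and the throughput can be sketched with a back-of-the-envelope model. This is a simplified illustration, not the timing analysis of [6]: the function name, the value of $L$ and the clock frequency are assumptions, and input/output overhead is ignored.

```python
def symbol_throughput_bps(block_bits, clock_hz, cycles_per_iteration, iterations):
    """Coded-bit throughput of a decoder that handles one block at a time
    and spends a fixed number of clock cycles on every iteration."""
    cycles_per_block = cycles_per_iteration * iterations
    return block_bits * clock_hz / cycles_per_block

# Hypothetical parameters, chosen only for illustration.
L = 256              # assumed value; each iteration takes 2L clock cycles
block_bits = 9216    # block size listed for the joint method in Table-1
clock_hz = 50e6      # assumed clock frequency, not taken from [6]
iterations = 18      # maximum number of iterations reported in [6]

rate = symbol_throughput_bps(block_bits, clock_hz, 2 * L, iterations)
print(f"{rate / 1e6:.1f} Mbps")
```

Under these assumptions the model shows that throughput scales linearly with the clock frequency and inversely with the number of iterations, which is why the iteration limit is reported alongside each throughput figure.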

Andrew J. Blanksby and Chris J. Howland (2002) showed that the message passing algorithm maps extremely well to a parallel decoder architecture in which the graph is directly instantiated in hardware: each variable node and each check node is realized once in hardware, and the nodes are routed together as defined by the bipartite graph.

To perform each decoder iteration, the variable and check node functional units are used once, and messages are exchanged between them along the routed message wires. All variable nodes or all check nodes are thus updated in a block-parallel manner, which allows a large number of decoder iterations to be performed during a block period. The decoder achieves a high throughput of 1 Gbps while performing 64 decoder iterations [1].
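The block-parallel schedule described above, in which all check nodes update and then all variable nodes update, can be sketched in software. The sketch below uses the min-sum approximation for the check-node update purely for brevity; this is an assumption and not necessarily the update rule implemented in [1]. The function name, the small (7,4) Hamming parity-check matrix and the channel values are likewise illustrative.

```python
def min_sum_decode(H, llr, max_iters=20):
    """Flooding-schedule min-sum decoding.

    H   : parity-check matrix as a list of 0/1 rows (every check needs degree >= 2)
    llr : channel log-likelihood ratios (positive means bit 0 is more likely)
    Returns (hard_decisions, iterations_used).
    """
    m, n = len(H), len(H[0])
    edges = [(c, v) for c in range(m) for v in range(n) if H[c][v]]
    # Variable-to-check messages start as the channel LLRs.
    v2c = {(c, v): llr[v] for (c, v) in edges}
    c2v = {e: 0.0 for e in edges}
    bits = [1 if x < 0 else 0 for x in llr]
    for it in range(1, max_iters + 1):
        # Check-node phase: every check node reads only the old v2c messages,
        # so all check nodes could be updated simultaneously in hardware.
        for c in range(m):
            nbrs = [v for v in range(n) if H[c][v]]
            for v in nbrs:
                others = [v2c[(c, u)] for u in nbrs if u != v]
                sign = 1.0
                for x in others:
                    if x < 0:
                        sign = -sign
                c2v[(c, v)] = sign * min(abs(x) for x in others)
        # Variable-node phase: every variable node reads only the new c2v messages.
        total = list(llr)
        for (c, v) in edges:
            total[v] += c2v[(c, v)]
        for (c, v) in edges:
            v2c[(c, v)] = total[v] - c2v[(c, v)]
        bits = [1 if t < 0 else 0 for t in total]
        # Stop as soon as all parity checks are satisfied.
        if all(sum(H[c][v] * bits[v] for v in range(n)) % 2 == 0 for c in range(m)):
            return bits, it
    return bits, max_iters

# (7,4) Hamming code; the all-zero codeword is sent and bit 3 is received unreliably.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
llr = [2.0, 2.0, 2.0, -0.5, 2.0, 2.0, 2.0]
decoded, iterations = min_sum_decode(H, llr)
print(decoded, iterations)
```

In a fully parallel hardware decoder each loop body over `c` or over the edges corresponds to a dedicated functional unit, so one iteration costs a fixed, small number of clock cycles regardless of the block size.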

Junho Cho, Jonghong Kim et al. (2009) proposed a parallel decoder architecture that implements a cyclic projective-geometry LDPC (PG-LDPC) code using the Soft Bit Flipping (SBF) algorithm. Denoting the parallel factor by p, a parallel SBF decoder uses p times as many variable-node and check-node processing units (VPUs and CPUs) as a serial decoder, while using the same number of node registers, threshold adaptation modules, and interconnection networks. The number of shift registers storing the variable and check nodes does not change no matter how much p increases [3].

ISSN: 2455-4847

The Shared Node Processing Unit (SNPU) architecture proposed in [3] performs within 0.6 dB of the Sum Product Algorithm (SPA). Pipelining is also employed to further increase the decoding throughput. Combined with pipelining, the SNPU architecture achieves a throughput of 6.5 Gbps with 87 decoding iterations.

### III. Girth Optimization

Jin Lu and Jose M. F. Moura (2004) introduced a method to design structured LDPC codes with large girth and flexible code rates. The method divides the nodes of the Tanner graph into groups and connects the nodes in these groups according to a set of parameters called shifts. The resulting class of codes is called grouping-and-shifting based LDPC (GS-LDPC) codes.

Let $V_c$ be the set of all check nodes and $V_b$ the set of all bit nodes. Divide $V_c$ into $N_c$ disjoint subsets of equal size $p$, where $p$ is a natural number. Each subset is called a group, and the check nodes in each group are indexed from 0 to $p - 1$. Similarly, partition $V_b$ into $N_b$ disjoint groups of equal size $p$, so that the code block length is $n = N_b \cdot p$, and index the bit nodes in each group from 0 to $p - 1$. GS-LDPC codes should satisfy the following conditions:

Condition 1: Each check node is connected to k bit nodes that belong to k different groups.

Condition 2: Each bit node is connected to j check nodes that belong to j different groups.

Condition 3: The check node indexed by $X$ in the $y$-th group in $V_c$ is connected to the bit node indexed by $X \oplus S_{y,z}$ in the $z$-th group in $V_b$, where $0 \leq S_{y,z} \leq p-1$. The parameters $S_{y,z}$ are named shifts and $\oplus$ represents modulo-$p$ addition.

The proposed method provides a girth of 8 with a BER of $5 \times 10^{-8}$ at 0.6 dB SNR [2].
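The three conditions can be turned into a small construction sketch. The code below is an illustration under stated assumptions, not the design procedure of [2]: the function name `gs_ldpc_parity_check`, the group size and the shift table are hypothetical, and every check group is assumed to be connected to every bit group (so $k = N_b$ and $j = N_c$).

```python
def gs_ldpc_parity_check(shifts, p):
    """Build a parity-check matrix H from a table of shifts.

    shifts[y][z] is S_{y,z}: check node x of check group y is connected to
    bit node (x + S_{y,z}) mod p of bit group z. Every check group is assumed
    to connect to every bit group, which is a simplification.
    """
    n_c, n_b = len(shifts), len(shifts[0])
    H = [[0] * (n_b * p) for _ in range(n_c * p)]
    for y in range(n_c):
        for z in range(n_b):
            s = shifts[y][z]
            if not 0 <= s <= p - 1:
                raise ValueError("each shift must lie in 0..p-1")
            for x in range(p):
                # Condition 3: modulo-p addition of the index and the shift.
                H[y * p + x][z * p + (x + s) % p] = 1
    return H

# Hypothetical shifts for a small example (not taken from [2]).
p = 5
shifts = [[0, 0, 0],
          [0, 1, 2]]
H = gs_ldpc_parity_check(shifts, p)
row_weights = {sum(row) for row in H}
col_weights = {sum(col) for col in zip(*H)}
print(len(H), len(H[0]), row_weights, col_weights)
```

Each (check group, bit group) pair contributes a $p \times p$ shifted identity block, so every check node has one neighbour in each bit group (Condition 1) and every bit node has one neighbour in each check group (Condition 2). The girth of the resulting code depends on how the shifts are chosen, which is the design problem addressed in [2].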

J. Lu and J. M. F. Moura (2005) proposed a class of high-rate LDPC codes named Partition-and-Shift LDPC (PS-LDPC) codes. When decoded by the iterative sum-product algorithm, they show performance close to the Shannon capacity. Large girth is desirable because it improves the bit-error performance of the codes, leads to more efficient decoding, and guarantees a large minimum distance.

The method to construct structured regular LDPC codes is based on balanced incomplete block designs (BIBD). BIBD-based codes are well structured, are free of 4-cycles (i.e., their girth is 6), and achieve a very high code rate with good error-correcting performance [5].
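The absence of 4-cycles can be checked directly on a parity-check matrix: a 4-cycle exists exactly when two rows share ones in two or more columns. The sketch below applies this test to the incidence matrix of the Fano plane, the (7, 3, 1) BIBD. The Fano plane is chosen here only as the smallest convenient example and is an assumption, not the design used in [5]; the function name is also hypothetical.

```python
from itertools import combinations

def has_four_cycle(H):
    """True if two rows of H share ones in two or more columns, which is
    exactly when the Tanner graph contains a cycle of length 4."""
    for row_a, row_b in combinations(H, 2):
        if sum(a & b for a, b in zip(row_a, row_b)) >= 2:
            return True
    return False

# Fano plane: 7 points, 7 blocks of size 3,
# and every pair of points lies in exactly one block.
blocks = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
          (1, 4, 6), (2, 3, 6), (2, 4, 5)]
# Point-by-block incidence matrix: H[i][j] = 1 if point i is in block j.
H = [[1 if point in block else 0 for block in blocks] for point in range(7)]
print(has_four_cycle(H))
```

Because any two points of a BIBD with $\lambda = 1$ appear together in exactly one block, any two rows of the incidence matrix overlap in exactly one column, which rules out cycles of length 4.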

Lingyan Sun, Hongwei Song and B. V. K. Vijaya Kumar (2005) evaluated the performance of disjoint difference set (DDS)-based LDPC codes (with column weights 3, 4 and 5) in an additive white Gaussian noise (AWGN) channel using a high-speed field programmable gate array (FPGA) simulation platform. The proposed method provides a girth of 8 with a BER performance of $10^{-11}$ [4].

### IV. BER Performance

Andrew J. Blanksby and Chris J. Howland (2002) explored the performance and implementation issues of the parallel decoder architecture. The number of variable nodes is set by the block size, while the number of check nodes is determined by the code rate and the block size. The interconnection of the variable and check nodes is determined by the LDPC code itself: for each edge in the code graph, physical nets must be instantiated to carry messages between the variable and check nodes. The decoder architecture also requires a method to load new data packets into the decoder and to write out packets once they have been decoded. The decoder achieves a BER of 10<sup>-6</sup> at 0.6 dB SNR [1].

Tong Zhang and Keshab K. Parhi (2002) reported that, based on post-routing static timing analysis and with a maximum of 18 decoding iterations, their decoder supports a maximum symbol throughput of 54 Mbps and achieves a BER of $10^{-6}$ at 2 dB over an AWGN channel [6].

Junho Cho, Jonghong Kim et al. (2009) proposed parallel Soft Bit Flipping (SBF) decoders implemented using a cyclic PG-LDPC code. To reduce the area redundancy that appears in the parallel SBF decoder architecture, a common processing unit is used instead of two different node processing units. Since the cyclic SBF decoder requires very little area overhead for parallelization, a parallel decoder with a large parallel factor can be easily implemented for high throughput, with a BER of 10<sup>-4</sup> at 0.6 dB SNR [3].

The proposed method achieves a throughput of 6.5 Gbps with an area of only 2.5 mm<sup>2</sup> when implemented in a 0.18 µm process, while performing within 0.6 dB of the floating-point SPA at a bit error rate of 10<sup>-4</sup>.


### V. Results And Discussion

The performance of the reviewed parallel decoder architectures in terms of throughput, girth optimization and Bit-Error Rate with respect to the Signal-to-Noise Ratio can be evaluated from Table-1 and Table-2.

From Table-1, the Soft Bit Flipping decoder for PG-LDPC codes proposed by Junho Cho et al. (2009) is the best of the three methods in terms of throughput, which reaches 6.5 Gbps.

The message passing algorithm decoder of Blanksby and Howland (2002) achieves a throughput of 1 Gbps with good BER performance using 64 iterations. The joint method proposed by Zhang and Parhi (2002) achieves a lower throughput of 54 Mbps, but with a good BER performance of 10<sup>-6</sup> using 18 iterations.

| Decoding method | Block size | Throughput | BER              | SNR    | No. of iterations |
|-----------------|------------|------------|------------------|--------|-------------------|
| Joint           | 9216-bit   | 54 Mbps    | 10<sup>-6</sup>  | 2 dB   | 18                |
| MPA             | 1024-bit   | 1 Gbps     | 10<sup>-6</sup>  | 0.6 dB | 64                |
| SBF             | 1057-bit   | 6.5 Gbps   | 10<sup>-4</sup>  | 0.6 dB | 87                |

Table – 1: Comparison of Block Size, Throughput, Bit-Error Rate (BER), Signal-to-Noise Ratio (SNR) and No. of Iterations

| Technique   | Code     | Girth | BER               |
|-------------|----------|-------|-------------------|
| Structured  | GS-LDPC  | 8     | 5×10<sup>-8</sup> |
| BIBD        | PS-LDPC  | 6     | 10<sup>-7</sup>   |
| Error Floor | DDS-LDPC | 8     | 10<sup>-11</sup>  |

Table – 2: Comparison of different Codes with Girth and Bit-Error Rate (BER)

Therefore, in terms of throughput, the parallel decoder proposed by Junho Cho, Jonghong Kim et al. (2009) is the best of the three decoders compared in Table-1.

From Table-2, the technique proposed by Lingyan Sun et al. (2005) is the best of the three techniques, having a girth of 8 and an improved Bit Error Rate (BER) performance of 10<sup>-11</sup>. The joint method proposed by Zhang and Parhi (2002) can be considered the best choice when a good BER of 10<sup>-6</sup> at an SNR of 2 dB must be reached with the fewest decoding iterations.

### VI. Conclusion

The parallel decoder architectures proposed by three different groups of authors have been compared in terms of throughput, girth optimization, Bit-Error Rate (BER) and Signal-to-Noise Ratio (SNR). For applications that require high throughput and low power dissipation and can tolerate a fixed code format and large area, the parallel architecture is very suitable. The highest throughput among the reviewed decoders is 6.5 Gbps, reported with a BER of $10^{-4}$ at an SNR of 0.6 dB.

The decoder of Junho Cho, Jonghong Kim et al. (2009) is the best of the three techniques for increasing the throughput of LDPC decoding. The technique proposed by Lingyan Sun et al. (2005) is the best of the three code constructions, with a girth of 8 and an improved Bit Error Rate (BER) of $10^{-11}$ for DDS-LDPC codes.

### References

- [1] A. J. Blanksby and C. J. Howland, "A 690-mW 1-Gbps 1024-b, rate-1/2 low-density parity-check code decoder," IEEE J. Solid-State Circuits, vol. 37, no. 3, pp. 404–412, Mar. 2002.
- [2] J. Lu and J. M. F. Moura, "Grouping-and-shifting designs for structured LDPC codes with large girth," p. 238, July 2004.
- [3] J. Cho, J. Kim, H. Ji, and W. Sung, "VLSI implementation of a soft bit-flipping decoder for PG-LDPC codes," IEEE, pp. 908–911, 2009.
- [4] L. Sun, H. Song, and B. V. K. Vijaya Kumar, "Error floor investigation and girth optimization for certain types of low-density parity check codes," IEEE, pp. 1101–1104, 2005.
- [5] J. Lu and J. M. F. Moura, "Partition-and-shift LDPC codes," IEEE Trans. Magn., vol. 41, no. 10, pp. 2977–2979, Oct. 2005.
- [6] T. Zhang and K. Parhi, "A 54 Mbps (3, 6)-regular FPGA LDPC decoder," in Proc. IEEE SiPS, pp. 127–132, 2002.