# Design of Low Complexity Encoder for Capacitively Coupled VLSI Interconnects

Deepika Agarwal, G. Nagendra Babu, B. K. Kaushik and S. K. Manhas Microelectronics and VLSI Group, Department of Electronics and Computer Engineering Indian Institute of Technology, Roorkee Roorkee, INDIA

## ABSTRACT

In the current Deep Submicron (DSM) era, interconnects play an important role in the overall performance of a chip. Power dissipation and crosstalk in *RC* modeled interconnects substantially affect the operation of the entire chip. Therefore, to enhance performance, the coupling between interconnects must be minimized or eliminated. The bus-invert method is an effective technique that can simultaneously reduce the coupling and the power consumption of interconnects. This paper focuses on designing a low complexity encoder for 4, 8 and 16 bit *RC* coupled lines. The proposed encoder occupies 37% less area than the most popular encoder. Its power consumption is 68% lower and its overall delay is 57% lower than those of existing encoders for *RC* modeled interconnects.

#### Keywords

Crosstalk, bus-invert, power consumption, overall delay.

# **1. INTRODUCTION**

As feature sizes continue to shrink and clock frequencies increase, interconnects play a significant role in determining the overall circuit performance in Deep Submicron (DSM) VLSI design [1], [2]. Shrinking feature size decreases not only the channel length but also the interconnect pitch and the device threshold voltages. As the interconnect pitch is scaled down, the spacing between interconnects is also scaled, which increases coupling. The main consequences are increased power dissipation and crosstalk. When one wire switches, it tends to affect its neighbor through capacitive coupling; this effect is called crosstalk, and it degrades the performance of the circuit. As technology scales down to deep submicron, the crosstalk energy becomes the major component of bus energy dissipation. The parasitic capacitances which contribute to power consumption are gate capacitance, diffusion capacitance and interconnect capacitance. The interconnect capacitance is the dominant one, accounting for 51% of the total power dissipation, whereas gate capacitance and diffusion capacitance account for only 34% and 15% respectively. The dynamic power due to the interconnect capacitance can thus exceed 50% of the total dynamic power. Hence minimizing power consumption and crosstalk in interconnects is the most important design criterion in on-chip bus design [3]-[5].

There are different methods for the reduction of crosstalk, such as repeater insertion [6], shielding line ($V_{dd}$/GND) insertion between two adjacent wires [7], optimal spacing between signal lines and, most effective of all, bus encoding [8]-[11]. The bus-invert method decreases the activity factor by reducing the number of transitions, which in turn reduces power dissipation. It assumes a random distribution of the data sequence and uses one extra line for redundancy. It considers all possible next values on the bus. When the number of transitions is more than half of the bus width, the original data is inverted and the control line is set to 'high'; otherwise the original data is transmitted and the control line is set to 'low'.
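The conventional bus-invert rule described above can be sketched in a few lines of Python. This is only an illustration of the classic scheme of [9], not of the encoder proposed in this paper; the function names and the 8-bit default width are assumptions, and the transition of the invert line itself is not counted.

```python
def bus_invert_encode(data: int, prev_bus: int, width: int = 8) -> tuple[int, int]:
    """Return (encoded_word, invert_line) for one bus cycle (illustrative sketch)."""
    mask = (1 << width) - 1
    # Number of data lines that would toggle if `data` were sent unchanged.
    transitions = bin((data ^ prev_bus) & mask).count("1")
    if transitions > width // 2:
        # More than half of the lines would toggle: send the complement instead.
        return ~data & mask, 1
    return data & mask, 0


def bus_invert_decode(encoded: int, invert: int, width: int = 8) -> int:
    """Recover the original word from the encoded word and the invert line."""
    mask = (1 << width) - 1
    return (encoded ^ mask) if invert else encoded


# Seven of eight lines would toggle, so the complemented word is transmitted.
word, inv = bus_invert_encode(0b11111011, 0b00000000)
print(f"{word:08b}", inv)  # 00000100 1
```

With the complemented word only one data line (plus the invert line) toggles instead of seven, which is the reduction in activity factor that the method relies on.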

This paper focuses on the reduction of power dissipation, crosstalk and area of the codec system. The proposed method reduces these parameters by decreasing the switching activity and, in turn, the coupling. The proposed method completely eliminates Type-4 coupling, whereas Type-3 and Type-2 couplings are significantly reduced.

The rest of the paper is organized as follows. Section 2 describes the classification of crosstalk. Section 3 presents the power dissipation expression and its dependence on different parameters. The working of the proposed method is explained in section 4. Section 5 discusses the experimental results for the *RC* model. Finally, section 6 draws some important conclusions.

# **2. CLASSIFICATION OF CROSSTALK**

There are three parasitic capacitances associated with an interconnect, *viz.*, ground capacitance ($C_G$), fringe capacitance ($C_F$) and coupling capacitance ($C_C$). Coupling capacitance becomes dominant in *RC* modeled interconnects when two adjacent lines switch in opposite directions; the resulting crosstalk causes a delay penalty called crosstalk delay. Crosstalk has two important effects: noise on non-switching wires and increased delay on switching wires.

Consider two lines, A and B, and their associated capacitances as shown in Table 1. The effective capacitance ($C_{eff}$) is evaluated according to the behavior of the neighboring wire. Table 1 shows the effective capacitance of line A ($C_{eff(A)}$) depending on the switching of line B (line A is assumed to be switching). Based on the switching conditions, the Miller coupling factor (MCF) is evaluated in Table 1. When both lines switch in the same direction, the MCF is '0' and $C_{eff(A)}$ is $C_G$, whereas if only line A switches, the MCF is '1' and $C_{eff(A)}$ is $C_C+C_G$. Finally, when both lines switch in opposite directions, the MCF is '2' and $C_{eff(A)}$ is $2C_C+C_G$, which is the worst case.

Table 1. Effective capacitance of line 'A'

| Line 'B'                      | ΔV         | $C_{eff(A)}$ | Miller Coupling Factor (MCF) |
|-------------------------------|------------|--------------|------------------------------|
| Switching with 'A'            | 0          | $C_G$        | 0                            |
| Constant                      | $V_{dd}$   | $C_C + C_G$  | 1                            |
| Switching oppositely with 'A' | $2 V_{dd}$ | $2C_C + C_G$ | 2                            |

In a data bus each line has neighbors on both its left and right sides, so the coupling capacitances of a 3-bit configuration must be considered. All possible switching configurations can be classified as Type-0, Type-1, Type-2, Type-3 and Type-4, as summarized in Table 2.

Table 2. Classification of crosstalk

| Type-0 | Type-1 | Type-2 | Type-3 | Type-4 |
|--------|--------|--------|--------|--------|
| - - -  | - - ↑  | - ↑ -  | - ↑↓   | ↑↓↑    |
| ↑↑↑    | - ↑↑   | ↑ - ↑  | - ↓↑   | ↓↑↓    |
| ↓↓↓    | ↑ - -  | ↑ - ↓  | ↑↓ -   |        |
|        | ↑↑ -   | ↑↑↓    | ↓↑ -   |        |
|        | - - ↓  | ↑↓↓    |        |        |
|        | - ↓↓   | - ↓ -  |        |        |
|        | ↓ - -  | ↓ - ↓  |        |        |
|        | ↓↓ -   | ↓ - ↑  |        |        |
|        |        | ↓↓↑    |        |        |
|        |        | ↓↑↑    |        |        |

↑ : switching from 0 to 1, ↓ : switching from 1 to 0, - : no transition
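The grouping in Table 2 can be reproduced with a short enumeration. The sketch below rests on an interpretation that the text does not state explicitly: the type number of a 3-bit pattern is taken to be the sum of the Miller coupling factors of its two adjacent wire pairs, with the factor extended to 0 when neither wire switches. All names in the code are illustrative.

```python
from itertools import product

RISE, FALL, HOLD = 1, -1, 0
SYMBOL = {RISE: "↑", FALL: "↓", HOLD: "-"}


def miller_factor(a: int, b: int) -> int:
    """Coupling factor between two adjacent wires: 0, 1 or 2 (cf. Table 1)."""
    return abs(a - b)


def crosstalk_type(pattern: tuple[int, int, int]) -> int:
    """Assumed rule: sum of the coupling factors of the two adjacent pairs."""
    left, mid, right = pattern
    return miller_factor(left, mid) + miller_factor(mid, right)


# Enumerate all 3**3 = 27 transition patterns and group them by type.
by_type: dict[int, list[str]] = {}
for pattern in product((RISE, FALL, HOLD), repeat=3):
    label = "".join(SYMBOL[t] for t in pattern)
    by_type.setdefault(crosstalk_type(pattern), []).append(label)

for k in sorted(by_type):
    print(f"Type-{k}: {len(by_type[k])} patterns: {' '.join(by_type[k])}")
```

Under this assumption the 27 patterns split into 3, 8, 10, 4 and 2 patterns for Type-0 to Type-4 respectively.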

# **3. POWER DISSIPATION EXPRESSION**

In reality, signal propagation through an interconnect takes time, consumes power, and might be unreliable. Energy is dissipated while transmitting information over interconnect through driver. The power dissipation is due to charging/discharging the interconnect in transitions between logic 0 and logic 1. The power dissipated can be expressed as

$$P = \alpha \cdot V_{dd}^2 \cdot f \cdot C_{int} \tag{1}$$

where  $C_{int}$  is the interconnect capacitance,  $V_{dd}$  is supply voltage, *f* is the clock frequency,  $\alpha$  is the average activity factor whose value lies between 0 and 1. Generally, the parameters such as  $V_{dd}$ ,  $C_{int}$  can be optimized for low power dissipation. The power dissipation can be further reduced by reducing switching activity ( $\alpha$ ). Bus encoding technique relies on reducing switching activity. Symbols used throughout this paper are as follows:

B(t): Bus value at the input of encoder.

B(t)\_ENC : Encoder output which is transmitted.

B(t-1)\_ENC : Encoder output which was latched up.

inv(t): Invert line at the input of encoder, which is preset to '0'.

INV(t): Invert line for the encoded data sent at time *t*.

INV(t-1): Invert line for the encoded data sent at time *t*-1.
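As a numerical illustration of equation (1), the following sketch evaluates the dynamic power for one set of values. The numbers (activity factor, supply voltage, clock frequency and interconnect capacitance) are assumptions chosen only for the example and are not taken from the experiments in this paper.

```python
def dynamic_power(alpha: float, vdd: float, freq: float, c_int: float) -> float:
    """Equation (1): P = alpha * Vdd^2 * f * C_int, in watts."""
    return alpha * vdd**2 * freq * c_int


# Assumed example values: alpha = 0.5, Vdd = 1.8 V, f = 1 GHz, C_int = 100 fF.
p_plain = dynamic_power(alpha=0.5, vdd=1.8, freq=1e9, c_int=100e-15)
# Halving the switching activity, as an encoder might, halves the power.
p_coded = dynamic_power(alpha=0.25, vdd=1.8, freq=1e9, c_int=100e-15)

print(f"{p_plain * 1e6:.1f} uW")  # 162.0 uW
print(f"{p_coded * 1e6:.1f} uW")  # 81.0 uW
```

Because the power is linear in $\alpha$, any reduction in switching activity obtained by encoding translates directly into the same relative reduction in interconnect power.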

# **4. PROPOSED METHOD**

An encoder is proposed which reduces crosstalk and power dissipation of *RC* modeled interconnects using the bus-invert method. The bus-invert method [9] uses an extra line, called the invert pin INV(t), to distinguish between the transmission of original and inverted data. The data bus chosen is 8 bits wide and is divided into two clusters. Each cluster is 4 bits wide and has one additional control line.

The encoder detects the crosstalk condition by comparing the present data with the previous data and, depending on the transitions of the data bits, decides whether the input data is to be inverted or not. On the receiving side, the original data is recovered by the decoder using the invert line INV(t). However, the architectures of the encoder and decoder should be of low complexity so that their power and delay overheads can be compensated.

The block diagram of proposed encoder is shown in Fig. 1. It consists of five major blocks which are Transition detector, Type-A detector, Type-B detector, latch and XOR stack.



Fig 1. Block diagram of proposed encoder

The first block is the transition detector, which detects transitions by comparing the present data with the previously transmitted data. The next step is to detect the crosstalk effect of these transitions. The proposed method employs two detectors: the Type-A detector detects some of the Type-4 and Type-3 couplings and the Type-B detector detects the remaining Type-4 and Type-3 couplings. The latch stores the encoded data for one clock cycle. After one clock cycle, the stored data $(B(t-1)\_ENC, INV(t-1))$ and the data to be transmitted (B(t), inv(t)) are fed to the transition detector. Here the value of inv(t) in the data to be transmitted is assumed to be at logic 'low' initially. After the detection of the crosstalk condition, INV(t) is generated by an OR gate from the outputs of the Type-A and Type-B detectors. The *INV(t)* pin goes 'high' only if there are one or more couplings (Type-4, Type-3). The encoded data is present at the output of the XOR stack, whose inputs are the original data and INV(t).

## **4.1 Transition Detector**

The transition detector detects the occurrence of a transition by comparing the present data with the previous data using simple AND gates. The top 5 AND gates detect a 'high to low' transition $(\downarrow)$ whereas the bottom 5 AND gates detect a 'low to high' transition $(\uparrow)$, as shown in Fig. 2. An output of the transition detector becomes 'high' if the corresponding transition occurs and is 'low' otherwise. The detector generates output signals such as $S_a$, $S_b$, etc., which act as inputs to the Type-A and Type-B detectors. These outputs are combined logically using NAND gates to detect crosstalk.



Fig 2. Circuit diagram of transition detector
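The behavior of the transition detector can be modeled with bitwise operations. The sketch below is a behavioral approximation, not the gate-level circuit of Fig. 2: it assumes a 5-bit word (four data lines plus the invert line), assumes that each AND gate combines one line with the complement of its other sample, and returns bit masks instead of the individually named signals $S_a$, $S_b$, etc.

```python
WIDTH = 5  # assumed: four data lines plus the invert line of one cluster


def detect_transitions(prev: int, cur: int, width: int = WIDTH) -> tuple[int, int]:
    """Return (falling, rising) bit masks for the lines of one cluster."""
    mask = (1 << width) - 1
    falling = prev & ~cur & mask  # AND(prev, NOT cur): 'high to low' transition
    rising = ~prev & cur & mask   # AND(NOT prev, cur): 'low to high' transition
    return falling, rising


falling, rising = detect_transitions(prev=0b10110, cur=0b01100)
print(f"{falling:05b} {rising:05b}")  # 10010 01000
```

A line that keeps its value sets neither mask bit, which corresponds to the 'low' output of both of its AND gates.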

## **4.2 Type-4 and Type-3 Detector**

As discussed in section 2, there are two cases of Type-4 coupling and four cases of Type-3 coupling (Table 2). The circuit diagram of type-A detector is shown in Fig.3.



Fig 3. Circuit diagram of Type-A detector

The top three NAND gates detect the second case $(-\downarrow\uparrow)$ of Type-3 and the first case $(\uparrow\downarrow\uparrow)$ of Type-4 coupling. In these cases, the transition detector (Fig. 2) sets one or more of the three combinations $(S_a, S_b, S_h)$, $(S_b, S_c, S_i)$ and $(S_c, S_d, S_j)$ to 'high'. Similarly, the bottom three NAND gates detect the first case $(-\uparrow\downarrow)$ of Type-3 and the second case $(\downarrow\uparrow\downarrow)$ of Type-4 coupling, for which one or more of the three combinations $(S_f, S_g, S_c)$, $(S_g, S_h, S_d)$ and $(S_h, S_i, S_e)$ are set to 'high'. The output of the final NAND gate, $N_4$, goes 'high' if any of the switching conditions $\{(-\uparrow\downarrow), (-\downarrow\uparrow), (\downarrow\uparrow\downarrow), (\uparrow\downarrow\uparrow)\}$ is detected by the Type-A detector.



Fig 4. Circuit diagram of Type-B detector

Fig. 4 shows the Type-B detector, which detects two cases of Type-3 coupling and all the cases of Type-4 coupling. The top three NAND gates detect the fourth case $(\downarrow\uparrow -)$ of Type-3 and the second case $(\downarrow\uparrow\downarrow)$ of Type-4 coupling. In these cases, the transition detector (Fig. 2) sets one or more of the three combinations $(S_f, S_b, S_c)$, $(S_g, S_c, S_d)$ and $(S_h, S_d, S_e)$ to 'high'. Similarly, the bottom three NAND gates detect the third case $(\uparrow\downarrow -)$ of Type-3 and the first case $(\uparrow\downarrow\uparrow)$ of Type-4 coupling, for which one or more of the three combinations $(S_a, S_g, S_h)$, $(S_b, S_h, S_i)$ and $(S_c, S_i, S_j)$ are set to 'high'. The output of the final NAND gate, $N_3$, goes 'high' if any of the switching conditions $\{(\downarrow\uparrow -), (\uparrow\downarrow -), (\downarrow\uparrow\downarrow), (\uparrow\downarrow\uparrow)\}$ is detected by the Type-B detector.

## **4.3 XOR Stack**

When either of  $N_4$  or  $N_3$  is 'high', the output of OR gate i.e., INV(t) becomes 'high' indicating a Type-4 or Type-3 coupling. The truth table of XOR stack is shown in Table 3. The original data is transmitted only if both  $N_4$  and  $N_3$  are 'low'. For all other cases inverted data is transmitted to eliminate crosstalk.

Table 3. Truth table of XOR stack

| N_4 | N_3 | OR gate<br>output<br>(INV(t)) | Encoded Data<br>(B(t)_ENC,INV(t)) |
|-----|-----|-------------------------------|-----------------------------------|
| 0   | 0   | 0                             | (B(t), 0)                         |
| 1   | 0   | 1                             | $(\overline{B(t)}, 1)$            |
| 0   | 1   | 1                             | $(\overline{B(t)}, 1)$            |
| 1   | 1   | 1                             | $(\overline{B(t)}, 1)$            |

Both the XOR stack and the decoder use XOR gates, which pass one input unchanged if the other input is 'low' and pass its complement if the other input is 'high'. Each data bit is fed to one input of a 2-input XOR gate and the control line INV(t) to the other. If the INV(t) line is 'high', the inverted data is transmitted to avoid crosstalk; if it is 'low', the original data is transmitted.
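The decision flow of one cluster (transition detection, Type-3/Type-4 detection, OR gate and XOR stack) can be summarized by a behavioral model. This is a sketch under assumptions rather than the gate-level design: the invert line is assumed to sit next to the last data line, the Type-A and Type-B detectors are merged into a single check on every window of three adjacent lines, and a window is flagged when the sum of its two pairwise coupling factors is at least 3. All function names are illustrative.

```python
def transitions(prev_lines: list[int], cur_lines: list[int]) -> list[int]:
    """+1 for a rising line, -1 for a falling line, 0 for no transition."""
    return [cur - prev for prev, cur in zip(prev_lines, cur_lines)]


def has_type3_or_type4(trans: list[int]) -> bool:
    """Behavioral stand-in for the Type-A/Type-B detectors and the OR gate."""
    for i in range(len(trans) - 2):
        a, b, c = trans[i:i + 3]
        if abs(a - b) + abs(b - c) >= 3:  # assumed coupling-sum criterion
            return True
    return False


def encode_cluster(data: list[int], prev_encoded: list[int], prev_inv: int):
    """Return (B(t)_ENC, INV(t)) for one 4-bit cluster."""
    prev_lines = prev_encoded + [prev_inv]  # (B(t-1)_ENC, INV(t-1)) from the latch
    cur_lines = data + [0]                  # (B(t), inv(t)) with inv(t) preset to 0
    inv = int(has_type3_or_type4(transitions(prev_lines, cur_lines)))
    encoded = [bit ^ inv for bit in data]   # XOR stack
    return encoded, inv


# Sending 1010 after 0101 would cause Type-4 coupling, so the data is inverted.
print(encode_cluster([1, 0, 1, 0], [0, 1, 0, 1], 0))  # ([0, 1, 0, 1], 1)
# A single falling line causes at most Type-2 coupling, so the data is sent as is.
print(encode_cluster([0, 1, 0, 0], [0, 1, 0, 1], 0))  # ([0, 1, 0, 0], 0)
```

In the first case the data lines do not toggle at all after inversion and only the invert line rises, which corresponds to the second row group of Table 3 where INV(t) is 'high'.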

## **4.4 Decoder**

The function of decoder is to decode the encoded data. The internal circuit of decoder is shown in Fig. 5.



Fig 5. Decoder

The encoded data $(B(t)\_ENC)$ is fed to one input of a 2-input XOR gate and the control line INV(t) to the other. If the INV(t) line is 'high', inverted data has been received and $(B(t)\_ENC)$ must therefore be inverted to retrieve the original data. If INV(t) is 'low', the original data has been received and the decoder input is not inverted.
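The XOR-based decoding can be checked exhaustively for a 4-bit cluster. The sketch below models only the logical behavior described above; the helper names are illustrative and the bit ordering is an assumption.

```python
from itertools import product


def xor_stack(data: list[int], inv: int) -> list[int]:
    """Encoder side: each data bit is XORed with INV(t)."""
    return [bit ^ inv for bit in data]


def decode(encoded: list[int], inv: int) -> list[int]:
    """Decoder side: each received bit is XORed with INV(t) again."""
    return [bit ^ inv for bit in encoded]


# XORing twice with the same INV(t) restores every 4-bit word.
round_trip_ok = all(
    decode(xor_stack(list(word), inv), inv) == list(word)
    for word in product((0, 1), repeat=4)
    for inv in (0, 1)
)
print(round_trip_ok)  # True
```

The round trip succeeds because XOR with the same control bit is its own inverse, so the decoder needs no knowledge of why the encoder chose to invert.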

# **5. RESULTS**

The proposed method has been simulated in 180, 130 and 90 nm CMOS technologies using H-SPICE with pulse stimuli having rise and fall times of 4 ps. The length, width, thickness and spacing of the signal wires are 1300, 0.99, 0.53 and 1.37 $\mu$m respectively. The proposed model shows a significant reduction in power consumption, propagation delay, crosstalk and chip area compared to Fan *et al.* [12].

## **5.1 Chip Area Reduction**

The proposed method significantly reduces the chip area by reducing the number of transistors (by 57%) compared to Fan *et al.* [12], as shown in Table 4.

Table 4. Percentage saving in terms of components

| Components            | Proposed Method | Fan *et al.* [12] | % of saving |
|-----------------------|-----------------|-------------------|-------------|
| AND gate              | 2-input         | 4-input           | 50%         |
| 6-bit adder           | 0               | 2                 |             |
| XOR gate              | 8               | 18                | 55%         |
| Number of transistors | 284             | 664               | 57%         |
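The savings listed in Table 4 are consistent with the usual relative-reduction formula, (reference - proposed) / reference. The short sketch below applies it to the XOR gate and transistor counts from the table; the function name is illustrative.

```python
def percent_saving(reference: float, proposed: float) -> float:
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference


print(f"{percent_saving(664, 284):.1f}")  # 57.2 (transistors, listed as 57%)
print(f"{percent_saving(18, 8):.1f}")     # 55.6 (XOR gates, listed as 55%)
```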

Table 5. Savings in terms of area w.r.t. Fan *et al.* [12]

| Technology (nm) | Coding method | Area, 4 bit (µm<sup>2</sup>) | Area, 8 bit (µm<sup>2</sup>) | Area, 16 bit (µm<sup>2</sup>) | % of area saved (average) |
|-----------------|---------------|------------------------------|------------------------------|-------------------------------|---------------------------|
| 180             | Fan *et al.*  | 39.12                        | 82.17                        | 172.56                        | 36.35%                    |
|                 | Proposed      | 24.90                        | 52.29                        | 109.80                        |                           |
| 130             | Fan *et al.*  | 20.41                        | 44.89                        | 103.25                        | 35.84%                    |
|                 | Proposed      | 12.99                        | 28.57                        | 62.86                         |                           |
| 90              | Fan *et al.*  | 9.78                         | 20.54                        | 43.13                         | 36.12%                    |
|                 | Proposed      | 6.22                         | 14.32                        | 31.49                         |                           |

Table 5 compares the chip area of encoders for different bus widths. In 180-nm technology, the area of the proposed design is 24.90 $\mu$m<sup>2</sup> for the 4-bit encoder. Although the chip area increases with the width of the bus, a substantial decrease in crosstalk is observed. Similarly, in 130-nm and 90-nm technologies the area of the 4-bit bus codec design is 12.99 and 6.22 $\mu$m<sup>2</sup> respectively.

## **5.2 Total Power Reduction**

The total power dissipation of the system includes the power dissipated by encoder, interconnects and the decoder. Fig. 6 shows comparison of power dissipation for different technologies.



Fig 6. Power dissipation of system for different technologies

for the 180, 130 and 90nm technologies is 1.8, 1.5 and 1.2 respectively.

 
Table 6. Comparison of power dissipation of proposed method for different technologies

| Technology (nm) | Coding method | Power, 4 bit (µW) | Power, 8 bit (µW) | Power, 16 bit (µW) | % of power saved (average) |
|-----------------|---------------|-------------------|-------------------|--------------------|----------------------------|
| 180             | Fan *et al.*  | 28.32             | 77.40             | 99.14              | 67.33%                     |
|                 | Proposed      | 9.26              | 21.24             | 32.69              |                            |
| 130             | Fan *et al.*  | 9.17              | 24.24             | 32.12              | 70.40%                     |
|                 | Proposed      | 2.55              | 7.26              | 10.19              |                            |
| 90              | Fan *et al.*  | 5.39              | 13.81             | 19.25              | 68.12%                     |
|                 | Proposed      | 1.81              | 4.26              | 6.31               |                            |

The proposed design is implemented for 4, 8 and 16 bit data. For a given technology node, the power dissipation increases as the data bus width is increased, as shown in Table 6. However, in all combinations the power dissipation is still considerably lower than that of the designs proposed in [12].

## **5.3 Total Propagation Delay**

The propagation delay in a victim line increases due to crosstalk. The encoder used for reducing crosstalk itself introduces some delay. Although the propagation delay is reduced when crosstalk is reduced, the overhead delay of the encoder must also be considered, so an encoder with low propagation delay is desirable. The proposed design introduces less overhead delay than existing encoders. The reduction of delay compared to Fan *et al.* [12] in 180, 130 and 90 nm technologies is 57.33%, 54.76% and 56.47% respectively.

# **6. CONCLUSION**

This paper demonstrated the reduction in crosstalk, power dissipation and total propagation delay obtained by using the bus-invert method. The area occupied by the proposed encoder is much smaller than that of existing ones. The results show a reduction in circuit area, power dissipation and propagation delay of 37.24%, 68.76% and 56.78% respectively compared to recently reported encoders for capacitively modeled interconnects. The proposed method considers only Type-3 and Type-4 couplings because of their dominance in *RC* coupled interconnects. The reduction in Type-4 and Type-3 coupling is 100% and 76.8% respectively.

# **7. ACKNOWLEDGMENTS**

The authors would like to thank the Special Manpower Development Project (SMDP-II). Without its support, this research would have been a far more painful experience than it already was.

# **8. REFERENCES**

- [1] Cong, J., He, L., Khoo, K. Y., Koh, C. K., and Pan, D. Z. "Interconnect design for deep submicron ICs," in Proc. Int. Conf. Computer-Aided Design, pp. 478-485, Nov. 1997.
- [2] International Technology Roadmap for Semiconductors 2007.

- [3] Victor, B., and Keutzer, K. "Bus encoding to prevent crosstalk delay," in Proc. Int. Conf. on Computer-Aided Design, pp. 57-63, 2001.
- [4] Benini, L., Micheli, G. D., Macii, E., Sciuto, D., and Silvano, C. "Asymptotic zero-transition activity encoding for address busses in low-power microprocessor-based systems," 7<sup>th</sup> Great Lakes Symp. on VLSI, Urbana, IL, USA, pp. 77-82, March 1997.
- [5] Lyuh, C. G., and Kim, T. "Low-power bus encoding with crosstalk delay elimination," IEEE Proceedings-Computers and Digital Techniques, vol. 153, no. 2, pp. 93-100, March 2006.
- [6] Chandel, R., Sarkar, S., and Agarwal, R. P. "Repeater insertion in global interconnects in VLSI circuits," Microelectronics International, pp. 43-50, 2005.
- [7] Ghoneima, M., Ismail, Y. I., Khellah, M. M., Tschanz, J. W., and De, V. "Formal derivation of optimal active shielding for low-power on-chip buses," IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 5, pp. 821-836, May 2006.
- [8] Khan, Z., Arslan, T., and Erdogan, A. T. "Low power system on chip bus encoding scheme with crosstalk noise reduction capability," IEEE Proceedings-Computers and Digital Techniques, vol. 153, no. 2, pp. 101 -108, March 2006.
- [9] Stan, M. R., and Burleson, W. P. Bus-invert coding for low-power I/O. IEEE Trans. on Very Large Scale Integration Systems, vol. 3, no. 1, pp. 49-58, March 1995.
- [10] Shin, Y., Chae, S. I., and Choi, K. Reduction of bus transitions with partial bus-invert coding. Electronics Letters, vol. 34, no. 7, pp.642-643, April 1998.
- [11] Shin, Y., Chae, S. I., and Choi, K. Partial bus-invert coding for power optimization of application-specific systems. IEEE Trans. on Very Large Scale Integration Systems, vol. 9, no. 2, pp. 377-383, April 2001.
- [12] Fan, C. P., and Fang, C. H. Efficient RC low-power bus encoding methods for crosstalk reduction. Integration VLSI Journal, Elsevier, vol. 44, no. 1, pp. 75-86, Jan. 2011.
- [13] Rabaey, J. M., Chandrakasan, A., and Nikolic, B. Digital Integrated Circuits. Prentice-Hall 2003.