# CMOS Layout Design and Performance Analysis for Synchronization Failures using 50nm Technology 

Ambresh Patel<br>IV Sem M.Tech VLSI, SSSCE, RGPV, Bhopal (M.P.)

Anand Kumar Singh<br>Asst. Prof., EC Deptt., SSSCE, RGPV, Bhopal (M.P.)

Sachin Bandewar<br>Asst. Prof., EC Deptt., SSSCE, RGPV, Bhopal (M.P.)


#### Abstract

The synchronizer is constrained such that its state does not change when a latching operation fails. Therefore, any failed latching attempt is automatically retried in the subsequent cycles. To evaluate this, we simulate an 8-bit multiplier, a 4-bit 16-state finite state machine, a 16-slot 8-bit first-in first-out (FIFO) register, and related circuits. In a multi-clock system, synchronizers are required wherever on-chip data cross clock-domain boundaries; they guard against synchronization failures but introduce latency in processing the asynchronous input. We use a method that hides synchronization latency by overlapping it with computation cycles. Synchronous logic is designed such that state-bit transitions have sufficient time to propagate to subsequent flip-flops by the time of the following clock edge. If a flip-flop k becomes metastable and produces a transition whose clock-to-Q delay is longer than expected, this transition may not have sufficient time to reach all destination flip-flops.


## Keywords:

Synchronization failure, Setup time, Hold time, Soft error rate, Flip-flops, Metastability.

## 1. INTRODUCTION

Recent sequential circuits contain a large number of transistors, operate at different clock frequencies, and communicate through asynchronous interfaces. Proper synchronization prevents synchronization failures but introduces latency in processing the asynchronous input. We propose a synchronization circuit that hides synchronization latency by overlapping it with computation cycles. Synchronization failure, i.e. metastability, may occur in any synchronous circuit where an input signal can change arbitrarily with respect to a reference signal, i.e. the clock signal.

### 1.1 Transmission Gate Base Design:

Transmission gates are generally used to implement XORs and MUXs with a minimum number of transistors.


Fig: 1 Transmission Gate base FIFO.


Fig: 2 Transmission Gate base Finite state machine with 16 states.

Fig. 1 and Fig. 2 show the transmission-gate-based first-in first-out shifter logic and a 16-state finite state machine. In the first-in first-out shifter logic, the output of each flip-flop is connected to the input of the next flip-flop. For the finite state machine, each D flip-flop is converted to a T flip-flop by feeding its complemented output back to its own D input, and each flip-flop is asynchronously clocked for its timing simulation.

| Present State ($\mathrm{Q}_{3} \mathrm{Q}_{2} \mathrm{Q}_{1} \mathrm{Q}_{0}$) | At CLK | Next State ($\mathrm{Q}_{3} \mathrm{Q}_{2} \mathrm{Q}_{1} \mathrm{Q}_{0}$) | $\mathrm{V}_{\text {out }}$ (Z) |
| :---: | :---: | :---: | :---: |
| 0000 | -Ve edge | 0001 | 0 |
| 0001 | -Ve edge | 0010 | 1 |
| 0010 | -Ve edge | 0011 | 0 |
| 0011 | -Ve edge | 0100 | 1 |
| 0100 | -Ve edge | 0101 | 1 |
| 0101 | -Ve edge | 0110 | 0 |
| 0110 | -Ve edge | 0111 | 1 |
| 0111 | -Ve edge | 1000 | 0 |
| 1000 | -Ve edge | 1001 | 0 |
| 1001 | -Ve edge | 1010 | 1 |
| 1010 | -Ve edge | 1011 | 0 |
| 1011 | -Ve edge | 1100 | 1 |
| 1100 | -Ve edge | 1101 | 1 |
| 1101 | -Ve edge | 1110 | 0 |
| 1110 | -Ve edge | 1111 | 1 |
| 1111 | -Ve edge | 0000 | 0 |

Fig: 3 Transmission Gate Base Mealy state Table with 16 states.
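
The state table can be exercised with a small behavioural model. The following is a minimal sketch, not the authors' circuit: it assumes the machine simply counts up by one on each negative clock edge and reads Z from a lookup indexed by the present state, exactly as the rows of the table list it. The names `Z_TABLE`, `next_state`, `output_z`, and `run` are illustrative.

```python
# Output Z for present states 0000..1111, copied row by row from the state table.
Z_TABLE = [0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0]


def next_state(state: int) -> int:
    """Advance the 4-bit state by one on a negative clock edge (1111 wraps to 0000)."""
    return (state + 1) & 0xF


def output_z(state: int) -> int:
    """Look up the output Z listed in the table for the given present state."""
    return Z_TABLE[state]


def run(cycles: int, state: int = 0):
    """Return (present state as a bit string, Z) for each of the requested clock cycles."""
    trace = []
    for _ in range(cycles):
        trace.append((format(state, "04b"), output_z(state)))
        state = next_state(state)
    return trace


if __name__ == "__main__":
    for bits, z in run(16):
        print(bits, z)
```

As listed in the table, the Z column coincides with the XOR of bits Q0 and Q2 of the present state; this is an observation about the tabulated values rather than a statement about how the output logic is implemented in the layout.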


Fig: 4 Transmission Gate Base Moore with 16 states.

### 1.2 Timing Metrics:

The propagation delay of a sequential circuit is the sum of the propagation delays of the master and slave logic during transparency. When a metastable failure occurs in the master latch, its propagation delay increases. If one flip-flop in a sequential circuit becomes metastable and produces a transition whose propagation delay is longer than expected, this transition may not have sufficient time to reach all destination flip-flops. A successive flip-flop may capture the new value while other flip-flops capture the old value, which causes wrong data to propagate through the circuit. If such errors occur in sequential circuits like counters, the system may transition into an unknown and possibly unrecoverable state.

If this synchronization failure occurs because of a setup or hold time violation in one flip-flop, the next flip-flop may become metastable in the following cycle and exhibit a prolonged propagation delay of its own. Metastability is a condition in which a bistable element such as a flip-flop enters an unwanted third state, with the output at an intermediate level between logic "0" and "1". The basic flip-flop timing parameters are the clock-to-output (Clk-Q) delay, the data-to-output delay, and the setup and hold times [4, 5].

These are as follows:

1) The Clk-to-Q delay is the propagation delay measured from the active clock edge to the output.
2) The input-to-output delay, $T_{DQ,\min}$, is the propagation delay from the input of the flip-flop to its output.
3) The setup time $t_{\text{setup}}$ is the optimum data-to-clock delay at which the input is stable.
4) The hold time $t_{\text{hold}}$ is the clock-to-data delay that leads to a $5 \%$ increase of the clock-to-output delay $T_{CQ}$ with respect to $T_{CQ,\min}$.
5) The worst race conditions occur when there is no logic between two flip-flops. The internal race immunity R of a flip-flop is given by

$$
\mathrm{R}=\mathrm{t}_{\mathrm{CLK}-\mathrm{Q}}-\mathrm{t}_{\text {hold }}>\mathrm{t}_{\text {skew }}
$$
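
The race-immunity condition can be checked numerically. The sketch below is illustrative only: the function names are hypothetical and the picosecond values in the usage section are placeholders, not measurements from this work.

```python
def race_immunity(t_clk_q: float, t_hold: float) -> float:
    """Internal race immunity R = t_CLK-Q - t_hold (same time unit for both arguments)."""
    return t_clk_q - t_hold


def is_race_free(t_clk_q: float, t_hold: float, t_skew: float) -> bool:
    """True when R exceeds the clock skew between two back-to-back flip-flops."""
    return race_immunity(t_clk_q, t_hold) > t_skew


if __name__ == "__main__":
    # Hypothetical values in picoseconds, chosen only to exercise the inequality.
    print(is_race_free(t_clk_q=60, t_hold=20, t_skew=30))  # skew below R
    print(is_race_free(t_clk_q=60, t_hold=20, t_skew=45))  # skew above R
```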



Fig: 5 Flip-flop waveforms in the case of stable and metastable outputs.

### 1.3 Propagation Delay:

The propagation delay of a flip-flop is defined as the time required to charge or discharge the output load capacitor halfway, i.e. to 50% of the output swing. In general, the propagation delay differs between the low-to-high and the high-to-low transition.

### 1.4 Setup Time:

Synchronous machines are clocked at a fixed rate. At every clock edge, the time at which the data changes depends on the propagation delay of the combinational circuits through which the data propagates. The setup time is defined as the time for which the flip-flop input must remain stable before the active clock edge so that the flip-flop captures the proper value.

### 1.5 Hold Time:

After the clock signal has changed, the input must be held for a period of time to allow the signal to propagate through the flip-flop and ensure a stable output. This period is called the hold time.

### 1.6 Avoid Synchronization Failure:

In a sequential circuit with more than one flip-flop in series, all flip-flops synchronized to one clock should be triggered on the same edge of the system clock as the rest of the circuit. This confines the problem to one path instead of several and reduces the possibility of a synchronization failure entering the circuit. Synchronization failure is avoided by making the clock period long enough to allow the output to resolve to a stable value and to cover the delay of whatever logic lies in the path to the next flip-flop. It can be further minimized by ensuring that the interval between input transitions is at least equal to the propagation delay of the entire circuit.
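
The clock-period requirement described above can be written as a simple budget. This is a sketch under stated assumptions: the setup time of the receiving flip-flop is added to the budget as an assumption of the sketch (the paragraph names only the resolution time and the logic delay), and all names and values are hypothetical.

```python
def min_clock_period(t_resolve: float, t_logic: float, t_setup: float = 0.0) -> float:
    """Smallest clock period that covers output resolution, path logic, and setup."""
    return t_resolve + t_logic + t_setup


def meets_timing(t_clk: float, t_resolve: float, t_logic: float, t_setup: float = 0.0) -> bool:
    """True when the clock period is at least as long as the required budget."""
    return t_clk >= min_clock_period(t_resolve, t_logic, t_setup)


if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    budget = min_clock_period(t_resolve=120, t_logic=300, t_setup=40)
    print("required period:", budget)
    print("1000 ps clock ok:", meets_timing(1000, 120, 300, 40))
    print("400 ps clock ok:", meets_timing(400, 120, 300, 40))
```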


Fig: 6 Sequential circuit synchronization


Fig: 7 Sequential element base circuits.


Fig: 8 Sequential element base circuits in the case of stable and metastable outputs.

The two flip-flops shown in the figures above are asynchronous with respect to the clock pulse. Combinational logic is placed between the sequencing elements to enforce sequence, i.e. to distinguish the current output from the previous or next output. The output of the first flip-flop synchronizes the output of the second flip-flop. The timing simulation shows a synchronization failure because the clock period is not long enough to allow for the resolution of a stable output and for the delay of the combinational circuit connected between the two flip-flops; that is, it is not equal to or greater than the propagation delay of the entire circuit.

## 2. PERFORMANCE METRICS:

The performance of a sequential circuit is measured by the power-delay product (PDP). The PDP is the product of the propagation delay and the power dissipation, and therefore takes both metrics into account. It is calculated as

$$
PDP = t_{\text{delay}} \cdot P
$$

where $t_{\text{delay}}$ is the delay and $P$ is the power consumption. The propagation delay of a single MOSFET is calculated as

$$
t_{\text{delay}} = \frac{K\, C_{L}}{\mu C_{ox}\,(W / L)\, V_{dd}}
$$

where $K$ is a constant, $C_{L}$ is the load capacitance, $\mu C_{ox}$ is the product of the charge-carrier mobility and the gate-oxide capacitance per unit area, $W/L$ is the ratio of channel width to channel length, and $V_{dd}$ is the supply voltage.
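
The two expressions above can be evaluated together. The following is a minimal sketch that reads the delay expression as $K C_L / (\mu C_{ox} \cdot (W/L) \cdot V_{dd})$; the function names and all numeric inputs are assumptions made for illustration and are not parameters of the 50 nm process used in this work.

```python
def propagation_delay(k: float, c_load: float, mu_cox: float,
                      w_over_l: float, vdd: float) -> float:
    """t_delay = K * C_L / (mu*Cox * (W/L) * Vdd), in seconds for SI inputs."""
    return k * c_load / (mu_cox * w_over_l * vdd)


def power_delay_product(t_delay: float, power: float) -> float:
    """PDP = t_delay * P, in joules when t_delay is in seconds and P in watts."""
    return t_delay * power


if __name__ == "__main__":
    # Hypothetical device values, chosen only to show the calculation.
    t_d = propagation_delay(k=1.0, c_load=2e-15, mu_cox=200e-6, w_over_l=4.0, vdd=1.0)
    pdp = power_delay_product(t_d, power=2.6e-6)
    print(f"t_delay = {t_d:.3e} s, PDP = {pdp:.3e} J")
```

Doubling W/L in this model halves the delay, and with it the PDP at fixed power, which is the trade-off the PDP metric is meant to expose.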

## 3. DESIGN METHODOLOGY:

One design methodology for avoiding synchronization failure is to add one or more series-connected synchronizing flip-flops to the synchronizer. This does, however, increase the latency with which the synchronous logic observes input changes. Using faster flip-flops decreases the setup and hold times, which in turn narrows the time window in which the flip-flop is vulnerable to synchronization failure. When the input frequency is decreased, the chance of the input changing during the setup and hold window also decreases [6, 7, 8].
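
These trends can be illustrated with the commonly used textbook estimate of synchronizer mean time between failures, $MTBF = e^{t_r/\tau} / (T_w f_c f_d)$. This model is not taken from this paper; it is brought in here only as an assumption to make the qualitative statements concrete. In the sketch, `t_resolve` is the time allowed for metastability to resolve, `tau` the flip-flop resolution time constant, `t_window` the vulnerable window around the clock edge, and `f_clk` and `f_data` the clock and data transition rates. All values are hypothetical.

```python
import math


def synchronizer_mtbf(t_resolve: float, tau: float, t_window: float,
                      f_clk: float, f_data: float) -> float:
    """Textbook MTBF estimate in seconds: exp(t_resolve / tau) / (t_window * f_clk * f_data)."""
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)


if __name__ == "__main__":
    base = synchronizer_mtbf(1e-9, 20e-12, 30e-12, 1e9, 1e8)
    slower_input = synchronizer_mtbf(1e-9, 20e-12, 30e-12, 1e9, 1e7)
    faster_ff = synchronizer_mtbf(1e-9, 10e-12, 30e-12, 1e9, 1e8)
    extra_stage = synchronizer_mtbf(2e-9, 20e-12, 30e-12, 1e9, 1e8)
    print(f"baseline      : {base:.3e} s")
    print(f"slower input  : {slower_input:.3e} s")
    print(f"faster FF     : {faster_ff:.3e} s")
    print(f"extra FF stage: {extra_stage:.3e} s")
```

In this model an extra synchronizer stage appears as a longer `t_resolve`, which is why it improves reliability at the cost of one more cycle of latency.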

Synchronization failure can be reduced by using faster flip-flops. The latches and flip-flops are designed using transmission gates, which require fewer transistors. A transmission gate acts as a high-quality switch with small resistance and capacitance, and it is part of our design module. The delay of the transmission gate can be modelled by a linearized RC network in which the on-resistance is represented by a resistor and the diffusion capacitance by a capacitor. For the purpose of delay analysis, each transistor is modelled as a resistor in series with an ideal switch. The value of the resistance depends on the power supply voltage and on an equivalent large-signal resistance, scaled by the ratio of device width to length. The propagation delay of the network excited by a step function is proportional to the time constant of the network, which in this case is the product of the resistance and the load capacitance. Hence the propagation delay to the $50 \%$ point, written here for the high-to-low transition, is

$$
t_{PHL} = \ln (2)\, \tau = 0.69\, \tau = 0.69\, R_{on} C_{L}
$$

where $R_{onn}$ and $R_{onp}$ are the on-resistances of the NMOS and PMOS transistors:

$$
\begin{aligned}
R_{onp} &= \frac{1}{\beta\,(V_{gs}-V_{t})_{p}} \\
R_{onn} &= \frac{1}{\beta\,(V_{gs}-V_{t})_{n}}
\end{aligned}
$$

The overall propagation delay of the inverter is

$$
\begin{aligned}
t_{p} &= \frac{t_{PHL}+t_{PLH}}{2} \\
&= 0.69\, C_{L}\, \frac{R_{onn}+R_{onp}}{2}
\end{aligned}
$$
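
The RC delay model above can be evaluated as follows. This is a sketch with hypothetical names and values: it assumes the NMOS on-resistance governs the high-to-low transition and the PMOS on-resistance the low-to-high transition, and it uses `math.log(2)` (about 0.693) where the text rounds to 0.69.

```python
import math


def r_on(beta: float, vgs: float, vt: float) -> float:
    """On-resistance 1 / (beta * (Vgs - Vt)); the caller uses |Vgs| and |Vt| for PMOS."""
    return 1.0 / (beta * (vgs - vt))


def t_step(r: float, c_load: float) -> float:
    """50% propagation delay of a first-order RC network: ln(2) * R * C_L."""
    return math.log(2) * r * c_load


def t_p(r_onn: float, r_onp: float, c_load: float) -> float:
    """Average of t_PHL (through R_onn) and t_PLH (through R_onp)."""
    return (t_step(r_onn, c_load) + t_step(r_onp, c_load)) / 2.0


if __name__ == "__main__":
    # Hypothetical device parameters, not extracted from the layout.
    r_n = r_on(beta=1.0e-3, vgs=1.0, vt=0.5)
    r_p = r_on(beta=0.5e-3, vgs=1.0, vt=0.5)
    print(f"R_onn = {r_n:.1f} ohm, R_onp = {r_p:.1f} ohm")
    print(f"t_p = {t_p(r_n, r_p, c_load=2e-15):.3e} s")
```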

### 3.1 Design simulation:

The Microwind layout simulator is used to design the latches and flip-flops and to carry out the parametric analysis, including power, switching delays, number of transistors, and data and clock frequencies. The central idea of our technique is to prevent corrupt values from proceeding through the pipeline by preventing late transitions of any synchronizer flip-flop from being captured by its successor. This is done by inserting delay elements between the synchronizer flip-flops. Sufficient delays guarantee that a late transition of any synchronizer flip-flop $S_{i}$, which may not meet the setup condition of the register $R_{i}$, will necessarily fail to be captured by $S_{i+1}$. Therefore, if the setup condition of $R_{i}$ is not met, $S_{i+1}$ will not transition to logic high in the following cycle. This "stalls" the pipeline for one cycle, in which $R_{i}$ relatches its input correctly, before the pipeline latching sequence proceeds. Synchronization time can be reduced by using faster (lower $\tau$) flip-flops. When the input frequency is decreased, the chance of the input changing during the setup and hold window also decreases [1, 2, 3].

We use the synchronizer as a state machine to sequence a series of latching operations. The synchronizer is constrained such that its state does not change when a latching operation fails. Therefore, any failed latching attempt is automatically retried in the subsequent cycles. To evaluate this, we simulate an 8-bit multiplier, a 4-bit 16-state finite state machine, a 16-slot 8-bit first-in first-out register, and related circuits.
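
The stall-and-retry behaviour can be pictured with a purely behavioural model. The sketch below abstracts away all circuit detail: it assumes the synchronizer state is an index $i$ that advances only when the latching attempt in the current cycle succeeds, and it takes the set of failing cycles as an input rather than deriving it from timing. The function name and arguments are hypothetical.

```python
def run_sequencer(num_stages: int, failing_cycles: set) -> list:
    """Return the synchronizer state at the start of every cycle until all stages have latched.

    The state stays unchanged in any cycle listed in failing_cycles, so the same
    latching operation is retried in the next cycle.
    """
    state = 0
    cycle = 0
    history = []
    while state < num_stages:
        history.append(state)
        if cycle not in failing_cycles:
            state += 1  # latching succeeded: move on to the next register
        cycle += 1
    return history


if __name__ == "__main__":
    print(run_sequencer(3, set()))  # no failures: one cycle per stage
    print(run_sequencer(3, {1}))    # a failure in cycle 1 stalls the sequence once
```

Each failed attempt adds exactly one cycle to the sequence in this model, which corresponds to the one-cycle stall described above.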

Fig. 9 below shows the multiplexing logic circuit used in our latch circuit. The circuit contains two transmission gates: the first passes the input to the output when the clock is active; otherwise the second holds (latches) the previous output.


Fig: 9 Transmission gate base latch cell.

Microwind simulations were performed to quantify the delay, power, and metastability performance of several flip-flop-based modules. A CMOS layout is also implemented in the Microwind layout editor to represent the delay degradation due to metastability, which affects circuit performance including timing and power dissipation. We also try to improve the switching time of the MOSFET device [4, 5].


Fig: 10 Layout design for T Flip-flop.

In Fig. 10, a delay flip-flop (D flip-flop) is used to build the T flip-flop by connecting the complemented Q output to the D input, as shown in the layout. A D flip-flop can be interpreted as a primitive delay line or zero-order hold, since the data is posted at the output one clock cycle after it arrives at the input. The layout for the T flip-flop is designed using a master-slave latch arrangement. Depending on the state of the clock signal, only one latch is active at a time.
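
The master-slave arrangement with the complemented output fed back to D can be sketched behaviourally. This is an illustrative logic-level model, not a description of the layout: it assumes the master latch is transparent while the clock is high and the slave while it is low, giving a negative-edge-triggered toggle. The class names `Latch` and `TFlipFlop` are hypothetical.

```python
class Latch:
    """Level-sensitive latch: passes d to q while enabled, otherwise holds q."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, d: int, enable: bool) -> int:
        if enable:
            self.q = d
        return self.q


class TFlipFlop:
    """Negative-edge toggle flip-flop built from two latches with D tied to not-Q."""

    def __init__(self) -> None:
        self.master = Latch()
        self.slave = Latch()

    @property
    def q(self) -> int:
        return self.slave.q

    def reset(self) -> None:
        self.master.q = 0
        self.slave.q = 0

    def apply_clock(self, clk: int) -> int:
        d = 1 - self.slave.q                                 # complemented Q fed back to D
        self.master.update(d, enable=(clk == 1))             # master follows D while clk is high
        self.slave.update(self.master.q, enable=(clk == 0))  # slave follows master while clk is low
        return self.q


if __name__ == "__main__":
    ff = TFlipFlop()
    for clk in [1, 0, 1, 0, 1, 0]:
        print(clk, ff.apply_clock(clk))
```

Only one of the two latches is transparent for any clock level, so Q changes once per falling edge in this model.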


Fig: 11 Timing simulation for T Flip-flop.

The timing simulation in Fig. 11 shows the synchronized edge-triggered operation of the toggle flip-flop. When reset is active, the output Q is at logic '0'; otherwise the output responds to the negative edge of the clock signal and toggles from its previous value at every negative edge. For a positive-edge-triggered master-slave T flip-flop, when the clock signal is low (logical 0), the "enable" seen by the first or "master" T latch (the inverted clock signal) is high (logical 1). This allows the "master" latch to store the input value when the clock signal transitions from low to high. As the clock signal goes high (0 to 1), the inverted "enable" of the first latch goes low (1 to 0) and the value seen at the input to the master latch is "locked". This allows the signal captured at the rising edge of the clock by the now "locked" master latch to pass through the "slave" latch. When the clock signal returns to low (1 to 0), the output of the "slave" latch is "locked", and the value seen at the last rising edge of the clock is held while the "master" latch begins to accept new values in preparation for the next rising clock edge.


Fig: 12 CMOS layout for synchronize Counter.


Fig: 13 Timing simulation of Counter 8 bit.

This 8-bit asynchronous counter is designed using four T registers. All eight toggle flip-flops shown in the figure above are designed with 184 transistors, comprising 96 NMOS and 88 PMOS transistors. The voltage versus time simulation is done on 20 nm scale.
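
The counting behaviour of an asynchronous (ripple) counter built from toggle flip-flops can be sketched at the logic level. The model below is an assumption-laden illustration: each stage toggles on a falling edge of its clock, the first stage is clocked by the external clock, and every later stage is clocked by the output of the stage before it. Propagation delays between stages are ignored, and the function names are hypothetical.

```python
def ripple_counter_step(bits: list) -> list:
    """Apply one falling edge of the external clock; bits[0] is the least significant bit."""
    out = list(bits)
    for i in range(len(out)):
        out[i] ^= 1        # this stage saw a falling edge on its clock input and toggles
        if out[i] == 1:    # its output rose (0 -> 1), so the next stage sees no falling edge
            break
    return out


def count(n_bits: int, edges: int) -> list:
    """Return the counter value after each of the given number of falling clock edges."""
    bits = [0] * n_bits
    values = []
    for _ in range(edges):
        bits = ripple_counter_step(bits)
        values.append(sum(b << i for i, b in enumerate(bits)))
    return values


if __name__ == "__main__":
    print(count(8, 5))
    print(count(8, 256)[-1])  # value after a full 8-bit cycle
```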


Fig: 14 Number of transistors and power overhead comparison.

Table 1 Comparative analysis with reference.

| Data path | Description | Power dissipation, Ghaith Tarawneh, Alex Yakovlev, and Terrence Mak [1] (µW) | Power dissipation, our work (µW) |
| :---: | :---: | :---: | :---: |
| Counter | 8-b binary counter | 322 | 2.6 |
| Multi | 8-b by 8-b multiplier | 496.8 | 4.1 to 4.8 |
| CRC8 | 4-b CRC, 8-b data item | 187.2 | 1.4 to 1.9 |
| CRC16 | 8-b CRC, 16-b data item | 349.7 | 2.6 to 3.6 |
| fsm16 | finite state machine with 16 states | 32.9 | 1.6 |
| fifo16 | 16-slot FIFO, 8-b data item | 325.7 | 45 to 56 |

The graphical analysis in Fig. 14 compares the number of transistors and the average power dissipation with the related work in [1] and [2]. The transmission-gate-based latch consumes less power than the related work, at the cost of a delay trade-off.
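
The size of the power reduction in Table 1 can be computed directly from its entries. The sketch below assumes both power columns are in µW and, where our work reports a range, uses the upper (worst-case) end; the dictionary and function names are illustrative.

```python
# (reference power, our worst-case power), both taken from Table 1 and assumed to be in uW.
TABLE_1 = {
    "Counter": (322.0, 2.6),
    "Multi": (496.8, 4.8),
    "CRC8": (187.2, 1.9),
    "CRC16": (349.7, 3.6),
    "fsm16": (32.9, 1.6),
    "fifo16": (325.7, 56.0),
}


def reduction_factor(reference_uw: float, ours_uw: float) -> float:
    """How many times lower our power dissipation is than the reference value."""
    return reference_uw / ours_uw


REDUCTION = {name: reduction_factor(ref, ours) for name, (ref, ours) in TABLE_1.items()}

if __name__ == "__main__":
    for name, factor in REDUCTION.items():
        print(f"{name:8s} {factor:7.1f}x")
```

The comparison covers power only; it does not account for the delay trade-off noted above.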

## 4. CONCLUSION

It is impossible to completely prevent synchronization failures, but their probability can be reduced to an acceptable level by re-sampling the input signal through a cascade of flip-flops known as a synchronizer. This gives metastability enough time to resolve to a valid logic state before the signal is interpreted by other circuits. Under a setup or hold time violation, the latch may have no initial voltage to amplify, so the output of the flip-flop may become unpredictable and take an unbounded amount of time to settle to a stable level. The power consumption of the designed circuits is reduced to 2.6 µW for the 8-bit binary counter, compared with the design in the reference paper. The power dissipation of the 16-state finite state machine and the first-in first-out register is reduced from 32.9 µW and 325.7 µW to 1.6 µW and 45 µW, respectively. The proposed solution uses transmission-gate-based circuits to design the latches and flip-flops, which can reduce the number of transistors and stray capacitances, improve delay performance, and optimize the synchronizer chain length dynamically.

## 5. REFERENCES

[1] Ghaith Tarawneh, Alex Yakovlev, and Terrence Mak, "Eliminating Synchronization Latency Using Sequenced Latching," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 22, No. 2, February 2014, pp. 408-419.
[2] Pedro M. Figueiredo, "Comparator Metastability in the Presence of Noise," IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol. 60, No. 5, May 2012, pp. 1286-1299.
[3] David Rennie, David Li, Manoj Sachdev, Bharat L. Bhuva, Srikanth Jagannathan, ShiJie Wen, and Richard Wong, "Performance, Metastability, and Soft-Error Robustness Trade-offs for Flip-Flops in 40 nm CMOS," IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol. 59, No. 8, August 2012, pp. 1626-1634.
[4] Haiqing Nan and Ken Choi, "High Performance, Low Cost, and Robust Soft Error Tolerant Latch Designs for Nanoscale CMOS Technology," IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol. 59, No. 7, July 2012, pp. 1445-1457.
[5] David J. Rennie and Manoj Sachdev, "Novel Soft Error Robust Flip-Flops in 65nm CMOS," IEEE Transactions on Nuclear Science, Vol. 58, No. 5, October 2011, pp. 2470-2476.
[6] Jun Zhou, David J. Kinniment, Charles E. Dike, and Gordon Russell, "On-Chip Measurement of Deep Metastability in Synchronizers," IEEE Journal of Solid-State Circuits, Vol. 43, No. 2, February 2008, pp. 550-557.
[7] Keith A. Bowman, James W. Tschanz, Nam Sung Kim, Janice C. Lee, Chris B. Wilkerson, Shih-Lien L. Lu, Tanay Karnik, and Vivek K. De, "Energy-Efficient and Metastability-Immune Resilient Circuits for Dynamic Variation Tolerance," IEEE Journal of Solid-State Circuits, Vol. 44, No. 1, January 2009, pp. 49-63.
[8] Antonio Cantoni, Jacqueline Walker, and Toby-Daniel Tomlin, "Characterization of a Flip-Flop Metastability Measurement Method," IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol. 54, No. 5, May 2007, pp. 1032-1040.
[9] David J. Kinniment, Charles E. Dike, Keith Heron, Gordon Russell, and Alexandre V. Yakovlev, "Measuring Deep Metastability and Its Effect on Synchronizer Performance," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 15, No. 9, September 2007, pp. 1028-1039.
[10] Stephen E. Paynter, Neil Henderson, and James M. Armstrong, "Metastability in Asynchronous Wait-Free Protocols," IEEE Transactions on Computers, Vol. 55, No. 3, March 2006, pp. 292-303.
[11] Pradeep Verma, B. S. Panwar, and K. N. Ramganesh, "Brief Contributions," IEEE Transactions on Computers, Vol. 53, No. 9, September 2004, pp. 1200-1204.

