Asynchronous reset synchronization and distribution – ASICs and FPGAs

Lack of coordination between asynchronous resets and synchronous logic clocks leads to intermittent failures on power up. In this series of articles, we discuss the requirements and challenges of asynchronous reset and explore advanced solutions for ASIC vs FPGA designs.

Asynchronous resets are traditionally employed in VLSI designs to bring synchronous circuitry to a known state after power up. The release of an asynchronous reset must be coordinated with the synchronous logic clock to eliminate synchronization failures caused by possible contention between the reset and the clock. A lack of such coordination leads to intermittent failures on power up. The problem is exacerbated in large designs with multiple clock domains. In addition to the synchronization issues, distributing an asynchronous reset to millions of flip-flops is challenging, calling for techniques similar to CTS (Clock Tree Synthesis) and requiring similar area and routing resources.

The requirements and challenges of asynchronous reset are reviewed, focusing on synchronization and distribution issues. The drawbacks of classic solutions for reset synchronization (reset tree source synchronization) and distribution (reset tree synthesis) are discussed. Advanced solutions for faster and simpler timing convergence and more reliable reset synchronization and distribution are presented. Different approaches for ASIC versus FPGA designs are detailed.

Part 1 describes the issues surrounding asynchronous resets and outlines approaches for resolving those issues. Part 2 (this article) discusses additional solutions for correct asynchronous reset in ASIC and FPGA. Some useful special cases are discussed in Part 3.

2. Asynchronous reset timing convergence techniques

One of the main issues discussed in Part 1 was the complexity of reset release for large designs (with a high latency reset distribution network), especially when a short clock cycle is employed. The timing convergence based on standard STA optimization leads to an expensive design and in some cases is even impossible. Here we discuss two techniques that mitigate this timing issue. Both techniques are applicable for ASIC and for FPGA designs.

2.1. Asynchronous reset pipelining

One way to deal with the timing issue of asynchronous reset release is to trade off reset release latency for more relaxed timing. This can be achieved by pipelining the reset tree in the following way. After each synchronizer, an additional asynchronous-set flip-flop stage P1 is included on the reset line (Figure 6a). Both the SET and D inputs of the flip-flop are connected to the active-high reset RSTO coming from the reset synchronizer. On RSTO release, the setup and hold conditions are satisfied for the P1 D and SET inputs, since these paths are constrained as regular synchronous paths.

Figure 6: Asynchronous reset with pipelining (Source: vSync Circuits)

The functional operation of the new scheme is similar to the regular one described in Part 1 (Figure 3d), except for an additional single cycle latency on the reset release. The higher reset latency incurred by this technique is usually acceptable for most applications, as it is incurred only once per power up.
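
The extra cycle can be illustrated with a small cycle-level model. This is only a sketch: it assumes a two-flip-flop reset synchronizer (the names S1 and S2 are hypothetical, not taken from the figures) followed by the P1 stage, and it ignores metastability and analog delays entirely.

```python
def simulate_reset_release(rsti_by_cycle):
    """Return per-cycle (RSTO, P1_Q) values for an active-high reset RSTI.

    Assumed structure: S1 and S2 are asynchronous-set flip-flops forming the
    synchronizer (S1.D = 0, S2.D = S1.Q, RSTO = S2.Q). P1 has both SET and D
    driven by RSTO, as described in the text.
    """
    s1 = s2 = p1 = 1  # power-up state with reset asserted
    trace = []
    for rsti in rsti_by_cycle:
        if rsti:
            # Asynchronous set overrides the clock.
            s1 = s2 = p1 = 1
        else:
            # Rising clock edge: every flop samples the pre-edge value.
            s2_old = s2
            s1, s2 = 0, s1
            # P1 stays set while RSTO is high, otherwise it holds the
            # RSTO value sampled at the edge.
            p1 = 1 if s2 else s2_old
        trace.append((s2, p1))
    return trace


def release_cycle(trace, column):
    """Index of the first cycle in which the given signal is low."""
    return next(i for i, row in enumerate(trace) if row[column] == 0)


trace = simulate_reset_release([1, 1, 0, 0, 0, 0])
print("RSTO released in cycle", release_cycle(trace, 0))
print("P1 released in cycle", release_cycle(trace, 1))
```

In this model the P1 output is released exactly one clock cycle after RSTO, which is the latency cost described above.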

A complementary part of this technique covers design constraints. While the synchronizer flip-flops must be constrained against duplication in order to prevent re-convergence path issues, as described in Part 1, the pipeline stage P1 is subject to a MAX_FANOUT constraint. An example of a maximal fanout constraint is shown in Figure 6b. The P1 flip-flop is automatically duplicated by the synthesis tool, creating four sub-networks for the reset distribution. Each sub-network has a lower latency than the original network, meeting the timing requirement for the reset release. In addition, the output of the reset synchronizer easily meets a fanout of eight.

This asynchronous reset pipelining technique scales to any design size and requires no changes as the design evolves, since the synthesis tool automatically duplicates the P1 stage, keeping each reset sub-net bounded and its latency moderate. The fanout that the duplicated P1 stage presents to the synchronizer output is usually small and does not cause timing violations. However, when a single pipeline stage does not lead to timing convergence, additional pipeline stages P2 – PN can be included and constrained with different MAX_FANOUT constraints.
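
The effect of the fanout constraints can be estimated with simple arithmetic. The following is a rough sketch, not a model of any particular synthesis tool: the function name and all flip-flop counts and fanout limits are illustrative assumptions.

```python
import math


def pipeline_copies(leaf_flops, max_fanouts):
    """Estimate how many copies of each pipeline stage are needed.

    leaf_flops  -- number of flip-flops driven by the reset (assumed value)
    max_fanouts -- MAX_FANOUT limit per stage, ordered from the stage that
                   drives the leaves back toward the reset synchronizer
    Returns one copy count per stage, in the same order. The last count is
    the fanout seen by the stage (or synchronizer) upstream of it.
    """
    copies = []
    loads = leaf_flops
    for limit in max_fanouts:
        stage_copies = math.ceil(loads / limit)
        copies.append(stage_copies)
        loads = stage_copies  # the copies become the loads of the next stage
    return copies


# One stage: 10,000 flops with a limit of 256 loads per copy.
print(pipeline_copies(10_000, [256]))         # [40]
# Two stages: the first stage's copies are themselves fanned out.
print(pipeline_copies(1_000_000, [256, 64]))  # [3907, 62]
```

If the last count is still too large for the synchronizer output to drive within one cycle, that suggests adding another stage.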

An example of this technique applied to a real design is shown in Figure 7. The P1 stage register, named PORT6, was automatically duplicated about 40 times by the synthesis tool [6] to meet the MAX_FANOUT constraint. Each of the 40 sub-nets met timing for its local fanout.

Figure 7: Example of asynchronous reset pipelining (Source: vSync Circuits)

2.2. Asynchronous reset clock-gating

Another technique for solving the high-fanout reset network timing issue employs clock-gating. This technique also trades reset latency for easier timing convergence.

Asynchronous reset clock-gating is shown in Figure 8a. The incoming asynchronous reset, RSTI, is first synchronized by a standard Reset Synchronizer, generating a reset whose release is synchronized to the clock, named RST_CLK. RST_CLK is connected as an asynchronous reset to a small “Reset FSM”, which is responsible for gating the clock to the entire design (except for the FSM itself). Clock gating is done by means of an ICG (integrated clock gating) cell, producing the gated clock, CLK_G. Since the FSM contains very few flip-flops, the release of RST_CLK, generated by the Reset Synchronizer, incurs no timing violation.

Figure 8: Asynchronous reset with clock-gating. (a) architecture (b) FSM for reset and clock control (c) wave diagram of the operation (Source: vSync Circuits)

The operation is described by means of the state machine shown in Figure 8b and the waveform diagram in Figure 8c. At the beginning, RSTI is asserted, asserting the asynchronous reset for the entire chip, including the Reset FSM. Note that at this stage the clock CLK may be inactive. Due to the long reset network latency, it may take time for the reset to spread through the chip, but eventually all flip-flops are reset. The Reset FSM is reset to the RST_ST state.

Upon RSTI release, the Reset Synchronizer aligns the release of RST_CLK to the CLK rising edge, preventing a timing violation in the Reset FSM. On the next cycle, the FSM moves to the COUNT_ST state, synchronously releasing the RST_CLK_G reset to the design. Since the CLK_G clock is gated in this state, there is no timing violation during this event. The counter threshold is set to cover the worst-case reset propagation time over the reset distribution network. Once the counter reaches its threshold, the FSM moves to the next state, FINISH_ST, in which CE is set high, releasing the clock to the rest of the design. Since at this stage RST_CLK_G is stable at all its leaves (RST_CLK_G_LEAF), no timing violation is incurred.
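
The sequence can be sketched as a behavioral model of the Reset FSM. This is an illustration only: the state names follow the text, while the class, the counter implementation and the threshold value are assumptions, and the output decoding (RST_CLK_G high only in RST_ST, CE high only in FINISH_ST) is inferred from the description rather than taken from Figure 8b.

```python
from enum import Enum


class State(Enum):
    RST_ST = 0
    COUNT_ST = 1
    FINISH_ST = 2


class ResetFsm:
    def __init__(self, threshold):
        self.threshold = threshold  # cycles to wait before enabling CLK_G
        self.async_reset()

    def async_reset(self):
        """RST_CLK asserted: force the FSM to RST_ST regardless of the clock."""
        self.state = State.RST_ST
        self.count = 0

    def clock(self):
        """One rising edge of the ungated clock CLK with RST_CLK released."""
        if self.state is State.RST_ST:
            self.state = State.COUNT_ST
        elif self.state is State.COUNT_ST:
            self.count += 1
            if self.count >= self.threshold:
                self.state = State.FINISH_ST

    @property
    def rst_clk_g(self):
        return 1 if self.state is State.RST_ST else 0

    @property
    def ce(self):
        return 1 if self.state is State.FINISH_ST else 0


def run(threshold, rst_clk_by_cycle):
    """Return per-cycle (state name, RST_CLK_G, CE)."""
    fsm = ResetFsm(threshold)
    trace = []
    for rst_clk in rst_clk_by_cycle:
        if rst_clk:
            fsm.async_reset()
        else:
            fsm.clock()
        trace.append((fsm.state.name, fsm.rst_clk_g, fsm.ce))
    return trace


for row in run(3, [1, 1, 0, 0, 0, 0, 0, 0]):
    print(row)
```

In the printed trace, RST_CLK_G falls while CE is still low, and CE rises only after the counter has run for the configured number of cycles, so no leaf flip-flop sees a clock edge while its reset input may still be changing.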

This technique minimizes the resources needed for the reset distribution network. The network can be imbalanced, employ small buffers and have a long latency. A good engineering practice is to set a multi-cycle constraint on the path from the RST_CLK_G port to all the driven flip-flops, matching the counter threshold of the FSM. Since the threshold can be arbitrarily high, this constraint is easily met.
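
As a rough illustration of how the counter threshold (and the matching multi-cycle value) might be chosen, the worst-case reset network delay can be rounded up to whole clock cycles with some margin added. The function name, the margin and the delay figures below are assumptions for the sake of the example, not values from the article.

```python
def counter_threshold(reset_delay_ps, clk_period_ps, margin_cycles=1):
    """Clock cycles the FSM should wait before enabling CLK_G.

    reset_delay_ps -- assumed worst-case propagation delay of the reset
                      network, in picoseconds
    clk_period_ps  -- clock period, in picoseconds
    margin_cycles  -- extra cycles of safety margin (hypothetical choice)
    """
    cycles = -(-reset_delay_ps // clk_period_ps)  # integer ceiling division
    return cycles + margin_cycles


# A 12 ns reset network with a 2.5 ns clock needs 5 cycles, plus 1 of margin.
print(counter_threshold(12_000, 2_500))  # 6
```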

In addition to mitigating the timing issues, this technique can reduce the maximal current drawn during reset network switching. The reset network can be designed intentionally imbalanced, reducing concurrent switching of reset network branches and thus minimizing total current.

A similar approach can be employed for high-fanout synchronous resets, as described in the next section 2.3.

2.3. Synchronous reset clock-gating

A synchronous reset distribution network has the same fanout as its asynchronous counterpart. Thus, it suffers from the same timing convergence problem, and in the synchronous reset case correct timing must be ensured both on reset assertion and on reset release. As in section 2.2, clock gating is proposed to solve the problem for synchronous reset.

The architecture is shown in Figure 9a. Reset is performed upon a request RST_REQ, which can be a synchronous derivation of asynchronous external reset (synchronized by a Reset Synchronizer for both assertion and release). The Reset FSM state diagram is shown in Figure 9b, as follows:

  1. Upon reset request RST_REQ, clock CLK_G is gated (CDIS1_ST). A few wait cycles may be employed to ensure that the clock has stopped for the entire design.

  2. In state RSTS_ST, reset RST_CLK_G is asserted. Since CLK_G is gated off, this event does not violate timing. The reset tree is assumed to be imbalanced and of a high delay, thus it takes time for the reset assertion edge to reach all the flip-flops (RST_CLK_G_LEAF). During this time (and slightly after) the clock is still gated, ensuring no timing violations (see also Figure 9c).

  3. On state CEN1_ST, CLK_G is enabled. Since RST_CLK_G_LEAF signal is already stable, all flip-flops are synchronously reset without timing violations. A few cycles might be needed, as some synchronous logic may require more than one cycle for synchronous reset.

  4. Once the design is reset, the FSM gates the clock again in state CDIS2_ST.

  5. Since the clock is not ticking, it is safe now to release the reset on RST_CLK_G. As previously for reset assertion, the FSM waits for a few cycles to ensure that the reset release has reached all flip-flops in the design.

  6. At the last state, CEN2_ST, CLK_G is enabled and normal operation is started.

As previously, a weak, imbalanced reset network may be employed, with a multi-cycle constraint for P&R.
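
The six steps can be sketched as a table-driven sequence with a check that the reset only ever toggles while the clock is gated. This is a simplified illustration: the state names come from the list above, except RSTR_ST, which is a hypothetical name for the unnamed state in step 5, and the wait counts are arbitrary.

```python
# (state name, CE, RST_CLK_G) in the order the FSM visits them.
# RSTR_ST is a hypothetical name for the reset-release state of step 5.
STATES = [
    ("CDIS1_ST", 0, 0),  # step 1: gate the clock
    ("RSTS_ST", 0, 1),   # step 2: assert reset while the clock is stopped
    ("CEN1_ST", 1, 1),   # step 3: clock runs, flops are synchronously reset
    ("CDIS2_ST", 0, 1),  # step 4: gate the clock again
    ("RSTR_ST", 0, 0),   # step 5: release reset while the clock is stopped
    ("CEN2_ST", 1, 0),   # step 6: resume normal operation
]


def reset_sequence(wait_cycles):
    """Expand the state table into a per-cycle trace.

    wait_cycles maps a state name to the number of cycles spent in it
    (default 1); the values are design-specific assumptions.
    """
    trace = []
    for name, ce, rst in STATES:
        trace.extend([(name, ce, rst)] * wait_cycles.get(name, 1))
    return trace


def reset_edges_are_gated(trace):
    """True if RST_CLK_G only changes between cycles where CE is low."""
    return all(
        prev[1] == 0 and cur[1] == 0
        for prev, cur in zip(trace, trace[1:])
        if prev[2] != cur[2]
    )


trace = reset_sequence({"RSTS_ST": 3, "RSTR_ST": 3})
print(len(trace), reset_edges_are_gated(trace))  # 10 True
```

The wait counts for the assertion and release states play the same role as the counter threshold in section 2.2: they should cover the worst-case delay of the reset network.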

Figure 9: Synchronous reset with clock-gating. (a) architecture (b) FSM for reset and clock control (c) wave diagram of the operation (Source: vSync Circuits)

Part 3 of this series discusses some useful special cases.

References

  1. G. Wirth, F. L. Kastensmidt and I. Ribeiro, “Single Event Transients in Logic Circuits – Load and Propagation Induced Pulse Broadening,” IEEE Transactions on Nuclear Science, 55(6), 2928 – 2935, 2008.

  2. C. E. Cummings, D. Mills and S. Golson, Asynchronous & Synchronous Reset Design Techniques – Part Deux, SNUG, 2003.

  3. W. J. Dally and J. W. Poulton, Digital Systems Engineering, Cambridge University Press, 1998.

  4. C. Dike and E. Burton, “Miller and noise effects in a synchronizing flip-flop,” IEEE Journal of Solid-State Circuits, 34(6), 849-855, 1999.

  5. vSync Circuits Vincent Platform, http://vsyncc.com/products

  6. Altera, Quartus-II, www.altera.com

  7. Quartus II Handbook Volume 1: Design and Synthesis, pp. 11-19 – 11-29, 2014.12.15

  8. K. Chapman, “Get Smart About Reset: Think Local, Not Global”, Xilinx, WP272 (v1.0.1), 2008.

  9. K. Chapman, “Get your Priorities Right – Make your Design Up to 50% Smaller,” WP275 (v1.0.1), 2007.

  10. K. Chapman, “Xilinx-Ken Chapman-That Dangerous Asynchronous Reset!-External Antenna – Need for de-bouncer”, PLD Blog, 2008.

  11. Xilinx, XST User Guide for Virtex-6, Spartan-6, and 7 Series Devices, UG687 (v 13.1), pp. 50, 95, 128, 2011.

  12. Yaniv Halmut, RESET architecture in Altera FPGAs: utilization effects, private communication, RAD, 2016.

  13. Chris Kwok, Priya Viswanathan and Ping Yeung, “Addressing the Challenges of Reset Verification in SoC Designs”, DVCon, 2015.


Rostislav (Reuven) Dobkin received a PhD degree in electrical engineering from the Technion, Israel Institute of Technology. Reuven is a co-founder and CTO of vSync Circuits LTD. (2010), a VLSI CAD company. In parallel, Reuven serves as a lecturer at the Technion. Reuven has held management positions in radiation-hardened VLSI technology for space applications, in communications chip development, and in research in C4I systems, signal processing, software systems engineering and VLSI. Reuven serves as a reviewer for numerous VLSI journals and conferences. His research interests are VLSI architectures, asynchronous logic, synchronization, GALS systems, SoC, NoC, many-core processors and parallel architectures.
