Dealing with structural and reset faults in embedded SoC designs - Part 2 - Embedded.com

Combinatorial logic in reset path
Using combinatorial logic in the reset path may produce glitches if the inputs of that logic change at about the same time, triggering a false reset in the design. Register transfer level (RTL) code of the following kind can produce such an unintentional reset.

// Reset decoded directly from bus signals: glitch-prone, shown here as the problem case
assign module_rst_b = !( (slave_addr[7:0] == 8'h02) && write_enable && (wdata[7:0] == 8'h00) );

always @(posedge clk or negedge module_rst_b)
   if (!module_rst_b) data_q <= 1'b0;
   else               data_q <= data_d;

In the above example, slave_addr, write_enable and wdata change their values with respect to the system clock. For a synchronous destination, static timing analysis lets the designer ensure that these signals are stable within one clock cycle, before the setup time window of the destination flop.

However, in this example these signals drive the asynchronous clear input of a flop, which responds to any pulse rather than sampling at a clock edge. Suppose slave_addr[7:0] changes logically from 00000110 to 01100000. Because of the propagation delay (net delay and cell delay) of the combinatorial logic, the individual bits can arrive at different times, so the bus may pass through the sequence 00000110 –> 00000010 –> 00000000 –> 01000000 –> 01100000.

Figure 6: Combo logic in reset path

If wdata[7:0] is already zero and write_enable is already asserted while slave_addr is momentarily 00000010, an unwanted pulse appears on module_rst_b, causing a false reset (Figure 6).

Solution: The solution to this problem is to register the combinatorial output before using it as a source of reset (Figure 7 ).

Figure 7: Combo logic in reset path Solution
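
The idea behind Figure 7 can be sketched in RTL roughly as follows. This is a minimal sketch rather than the authors' actual code: the module name reg_reset_decode and the internal wire rst_req are hypothetical, and the bus signal names are taken from the earlier example. The decode is captured by a flop on clk, so only the value that has settled at the clock edge can reach the asynchronous clear.

```verilog
// Sketch only: module and internal names are hypothetical.
module reg_reset_decode (
  input  wire       clk,
  input  wire [7:0] slave_addr,
  input  wire       write_enable,
  input  wire [7:0] wdata,
  input  wire       data_d,
  output reg        data_q
);
  // Combinatorial decode: may glitch while the bus signals settle.
  wire rst_req = (slave_addr == 8'h02) && write_enable && (wdata == 8'h00);

  // Registered copy: intermediate glitches between clock edges are not captured.
  reg module_rst_b;
  always @(posedge clk)
    module_rst_b <= !rst_req;

  // The asynchronous clear is now driven by a flop output.
  always @(posedge clk or negedge module_rst_b)
    if (!module_rst_b) data_q <= 1'b0;
    else               data_q <= data_d;
endmodule
```

Because module_rst_b is launched from clk, its assertion and de-assertion relative to data_q can be checked by static timing analysis like any other flop-to-flop path.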

Sometimes the solution shown in Figure 7 is not enough: if the inputs of the combinatorial logic change around the same time, a false reset can still be triggered in the design. Figure 8 illustrates how this kind of problem can arise.

However, if the changes in the input signals of the combinatorial logic are mutually exclusive, then it may not cause any problems. For example, test mode and functional mode are mutually exclusive. Hence the test mux in the reset path is a valid design practice. But in some cases static signals or signals whose changes are mutually exclusive can cause a false reset trigger in a design.

Figure 8: Combo logic in reset path

In the example in Figure 8, a MUX structure is used in the reset path in the RTL. Here 'mode' is a control signal that doesn’t change frequently, and mode0_rst_b and mode_1_rst_b are two reset events. However, when the RTL is synthesized, the MUX is broken into complex combinational and-or-invert (AOI) cells at the gate level. Logically this is still equivalent to a mux, but because of the different cell and net delays, final_rst_b produces a glitch whenever the signal mode changes from 1 –> 0.

Solution: Because MUX structures are less prone to glitches than other combinatorial logic, the best way to solve this problem is to preserve the MUX structure in the reset path during synthesis. A MUX pragma (data embedded in the RTL code to indicate some intention to the compiler) can be used in the RTL to help the synthesis tool preserve any MUXes in the reset path.
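
One common way to make a reset MUX easy to preserve is to isolate it in its own small module, so that the whole module can be mapped to a single library MUX cell and excluded from logic restructuring. The sketch below illustrates that idea only. The module name rst_mux2 is hypothetical, the port names are modelled on the signals of Figure 8, and the select polarity (mode = 1 picks mode1_rst_b) is an assumption. The pragma, attribute, or constraint that actually protects the instance is tool-specific and is deliberately not shown.

```verilog
// Sketch only: rst_mux2 is a hypothetical wrapper. The tool-specific
// directive that keeps this module as one MUX cell is not shown.
module rst_mux2 (
  input  wire mode,         // static or slowly changing select
  input  wire mode0_rst_b,  // reset used when mode = 0 (assumed polarity)
  input  wire mode1_rst_b,  // reset used when mode = 1 (assumed polarity)
  output wire final_rst_b
);
  // Written as a plain 2:1 mux so it maps one-to-one onto a MUX cell.
  assign final_rst_b = mode ? mode1_rst_b : mode0_rst_b;
endmodule
```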

Issues with synchronous reset in design
SoC designers often prefer to reset the design synchronously with respect to the global clock. One reason is to save die area (flops with asynchronous reset inputs are bigger than non-resettable flops). Another is to keep the system completely synchronous with respect to the clock.

For this type of design it is important to provide clock signals to the flops while the reset source is asserted. Otherwise these flops may remain un-initialized for some time. When such a module is plugged into a system, the system designer may choose to disable its clock if the module does not need to be active during the reset phase, in order to save dynamic power in the overall system. In that case the module remains un-initialized for some time even after reset de-assertion. If any output of the module is used directly in the system, the un-initialized, unknown value (X) will be captured, which can cause functional failure of the system (Figure 9).

Figure 9: Sync reset issue timing diagram

Solution: One way to deal with this situation is to enable the clock of the module during the reset phase for a minimum time, so that all flops inside the module are initialized during reset. There will then be no un-initialized value at the module output when the system reset is de-asserted (Figure 10).

Figure 10: Enabling the clock of the module during reset phase
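
A minimal sketch of one way to do this is shown below, under some loud assumptions: the module name clk_en_during_reset and the input clk_en_req are hypothetical, and the latch-based gate is only a behavioral stand-in for the integrated clock-gating cell a real design would take from its library. The sketch keeps the clock running for the whole time reset is asserted, which is the simplest case. A design that wants only the minimum number of cycles would need an additional counter, which is not shown.

```verilog
// Sketch only: names are hypothetical and the gate is a behavioral model.
module clk_en_during_reset (
  input  wire clk,
  input  wire sync_rst_b,  // synchronous reset, active low
  input  wire clk_en_req,  // clock-enable request from the system (hypothetical)
  output wire mod_clk
);
  // Force the enable on while reset is asserted so the synchronous-reset
  // flops in the module see clock edges and get initialized.
  wire clk_en = clk_en_req | !sync_rst_b;

  // Latch the enable while clk is low to avoid glitches on mod_clk.
  reg en_lat;
  always @(clk or clk_en)
    if (!clk) en_lat <= clk_en;

  assign mod_clk = clk & en_lat;
endmodule
```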

A more difficult problem in this category arises when two-flop synchronizers are used in a clock domain crossing path, which is common practice. Designers sometimes use synchronous resets for both of those flops, with RTL code similar to that below:

always @(posedge clk)
  if (!sync_rst_b) begin
    sync1 <= 1'b0; sync2 <= 1'b0;
  end
  else begin
    sync1 <= async_in; sync2 <= sync1;
  end

Synthesizing the above code generates hardware (Figure 11) with combinatorial logic inside the sync chain of the two-flop synchronizer, between sync1 and sync2. This is risky because the added logic delay leaves less time for metastability to settle before the input of the sync2 flop.

Figure 11: Introducing combo logic into the sync chain

Solution: To avoid combinatorial logic in sync chain, the RTL code can be written as follows:

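always @(posedge clk) begin
  // Only sync1 keeps the synchronous reset
  if (!sync_rst_b) sync1 <= 1'b0;
  else             sync1 <= async_in;
  // sync2 is assigned unconditionally, so neither reset nor enable
  // logic is inferred between sync1 and sync2
  sync2 <= sync1;
end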

No reset is used for the sync2 flop in the above code, hence no combinatorial cell is implemented in the sync chain. The sync2 flop takes one extra clock cycle to reach its reset value, because it receives it through sync1, but this should not cause any problems in a typical design.

Redundant reset synchronization
In an SoC circuit where multiple asynchronous clocks are used, the designer needs to ensure synchronous de-assertion of the asynchronous reset with respect to the clock used by the destination register. Otherwise it can cause timing violations at the destination flop, introducing metastability.

Reset synchronizers are used to make the reset de-assertion synchronous to the destination clock domain. Reset de-assertion timing violations can only happen if the destination clock is present at the time of system reset de-assertion (Figure 12).

If the clock is absent at the time of reset de-assertion then there won’t be any timing violation. So when designing a multi-clock domain module, the designer needs to provide compile-time options to bypass those reset synchronizers, allowing the system integrator to decide whether a reset synchronizer is needed, based on the clocks available to the module.
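
As an illustration of such a compile-time option, the sketch below shows a conventional reset synchronizer (asynchronous assertion, synchronous de-assertion) with a bypass parameter. It is a sketch under stated assumptions, not the authors' implementation: the module name rst_sync, the parameter BYPASS, and the port names are all hypothetical, and an active-low reset is assumed.

```verilog
// Sketch only: rst_sync, BYPASS and the port names are hypothetical.
module rst_sync #(
  parameter BYPASS = 0   // 1: integrator has decided no synchronizer is needed
) (
  input  wire clk,        // destination clock
  input  wire rst_in_b,   // asynchronous reset in, active low
  output wire rst_out_b   // reset out, de-asserted synchronously to clk
);
  generate
    if (BYPASS) begin : g_bypass
      // No flops are built: the reset passes straight through.
      assign rst_out_b = rst_in_b;
    end else begin : g_sync
      reg s1, s2;
      // Assert immediately, release only after two clk edges.
      always @(posedge clk or negedge rst_in_b)
        if (!rst_in_b) begin
          s1 <= 1'b0; s2 <= 1'b0;
        end else begin
          s1 <= 1'b1; s2 <= s1;
        end
      assign rst_out_b = s2;
    end
  endgenerate
endmodule
```

With BYPASS set to 1 the two flops are never generated, so the option also saves the gate count of a synchronizer that would otherwise be redundant.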

However, redundant synchronizers can create functional problems in a design in those cases where the ratio of the system clock frequency to the asynchronous clock frequency is very high.

As shown in Figure 12, the system reset, whose de-assertion is synchronous to sys clk, is fed into a reset synchronizer in the mod_clk domain before being used by the mod_clk domain logic. Assume a clock frequency ratio of sys clk : mod_clk greater than about 6 : 1.

By default, mod_clk is disabled in order to save dynamic power. When the system designer wants to enable the functionality of the mod_clk domain logic, the clock is switched on. Once the clock is enabled there is a latency of two mod_clk cycles during which the whole mod_clk domain logic is held in reset by the reset synchronizer. Any data transaction started from the sys clk domain during this period is not seen by the mod_clk domain and is lost.

Figure 12: Issue with redundant synchronizer

Solution: While it does not affect the operation of the SoC, this latency is not visible to the developer, causing some confusion when it is time to integrate the SoC into the overall system. It is advisable to remove such confusion by doing the following:

  • Bypass or remove redundant reset synchronizers in the design if the clock is not present during global reset de-assertion. This also saves some gate count.
  • If dynamic power dissipation is not a concern, enable mod_clk in the startup code before the mod_clk domain logic starts its operation. This gives the reset de-assertion enough time to propagate.
  • Handle the problem in software as well, by inserting a delay of two to three mod_clk cycles after mod_clk is enabled and before any valid operation.

Reset de-assertion timing due to uncommon clock paths
The appropriate SoC reset architecture varies from system to system. In some safety critical devices the complete reset state machine is required to use a safe clock, which is enabled by default. The same clock is also used as the default system clock.

Figure 13: Issue with un-common clock path

In Figure 13 the reset state machine (the flop R) is working on default_clk. During reset de-assertion default_clk is also the source of sys clk, so logically both clocks (clk1 and clk2) are synchronous during reset de-assertion. But because of the large uncommon path between clk1 and clk2, it is very difficult to balance these two clocks and treat them as operating synchronously. Hence it becomes challenging to meet the reset de-assertion timing of flop A.

Solution: The best way to deal with such situations is to treat clk1 and clk2 as operating asynchronously and insert a reset synchronizer before using the reset in flop A. This makes it possible for the reset de-assertion timing to be met from S2 –> A (Figure 14).

Figure 14: Solution
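
The arrangement of Figure 14 might look roughly like the following in RTL. This is a hedged sketch: the flop names s1, s2 and a_q mirror the S1, S2 and A labels of the figure, but the module name clk2_reset_domain, the port names rst_r_b and a_d, and the active-low polarity are assumptions. The reset from flop R is treated as asynchronous to clk2, and flop A only ever sees the output of s2, which is launched from clk2.

```verilog
// Sketch only: module and port names are hypothetical, polarity is assumed.
module clk2_reset_domain (
  input  wire clk2,
  input  wire rst_r_b,  // reset from flop R (clk1 side), treated as asynchronous
  input  wire a_d,
  output reg  a_q       // flop A
);
  // Reset synchronizer S1 -> S2, clocked by clk2.
  reg s1, s2;
  always @(posedge clk2 or negedge rst_r_b)
    if (!rst_r_b) {s1, s2} <= 2'b00;
    else          {s1, s2} <= {1'b1, s1};

  // Flop A: its reset de-assertion is now timed from S2 within the clk2 domain.
  always @(posedge clk2 or negedge s2)
    if (!s2) a_q <= 1'b0;
    else     a_q <= a_d;
endmodule
```

Since s2 and a_q share clk2, the S2 –> A recovery and removal checks no longer depend on balancing clk1 against clk2.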

Conclusion
In this article we have focused on faults in reset design and a few of the possible solutions, although these solutions may not work for all designs. They address broad categories and present some generic solutions, so the guidelines proposed may require some modification to fit particular situations.

Read Part 1

Arjun Pal Chowdhury is Lead Design Engineer at Freescale Semiconductor. He has 7 years of experience in SoC design and architecture and is involved in designing chips for the automotive, industrial, and multimedia markets.

Neha Agarwal is Senior Design Engineer at Freescale Semiconductor. She has been working at Freescale for the last 3 years in SoC design and architecture and is involved in designing chips for the automotive market. She graduated from Birla Institute of Technology, Mesra, in 2009.
