Reducing tester-based silicon debug effort & time: Part 2 – A check-list of best practices - Embedded.com

Editor's note: In the second of a two-part series on reducing tester-based silicon debug effort and time, the authors provide a detailed checklist of must-do practices to follow during verification for testing (VFT).

Considering that testing to ensure that a circuit is fully functional can cost several million dollars, it is essential to do as much as possible at the verification for test (VFT) stage, well before silicon is available, to ensure a high probability of functionally alive silicon in the shortest possible time and at the lowest cost.

This second part in the series focuses on rules derived from our experience testing and debugging SoC designs that can be applied while creating the tester-specific testbench, generating the patterns, and simulating the suite.

Using these practice-derived rules can reduce the functional pattern bring-up time on the tester and reduce the chances of failure due to pattern mismatch issues. More importantly, following these rules will reduce the total debug time and effort needed to resolve failures and issues observed in the tester environment.

Crucial VFT design and test practices
Must-do practice #1
While creating the tester patterns, restrict the design pads/ports used during simulation to those available across all modes of testing, and in particular to the minimum set of ports available across all packages. This applies to any communication between the core component of the testcase (the .c/.h) and the testbench side (the Verilog/SystemVerilog component), as well as to any data mailboxing or port toggling used to highlight execution stages.

Issue Analysis This practice seems basic, but it is often missed because the full suite of pads/ports is available during verification simulations of the tester pattern. The problems emerge only once the patterns are ported to the tester: unsuccessful communication, data mailboxing, or port toggling leads to unexpected pattern behavior on the production silicon and requires extensive, tedious debugging.

Solution The verification engineer should always check the Test Pin Muxing sheet available from the design-for-test team to learn which pads are available across all modes of testing, and from that create a tester-specific mode of the testbench in which only that restricted set of pads is available for flags, mailboxes, and so on.
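
As a rough illustration of this check, the following Python sketch computes the pads common to every test mode and flags any testbench pad outside that set. The mode names, pad names, and the dictionary layout are hypothetical stand-ins for whatever the real Test Pin Muxing sheet contains.

```python
# Hypothetical pad-availability data standing in for a Test Pin Muxing sheet.
# Mode names and pad names are illustrative only.
PAD_AVAILABILITY = {
    "functional_test_pkg_a": {"PA0", "PA1", "PA2", "PB0", "PB1"},
    "functional_test_pkg_b": {"PA0", "PA1", "PB0"},
    "low_pin_count_test": {"PA0", "PB0", "PB1"},
}


def common_pads(availability):
    """Return the pads present in every test mode/package, sorted by name."""
    pad_sets = list(availability.values())
    if not pad_sets:
        return []
    return sorted(set.intersection(*pad_sets))


def check_testbench_pads(used_pads, availability):
    """Return the pads used by the testbench that are not available everywhere."""
    allowed = set(common_pads(availability))
    return sorted(set(used_pads) - allowed)


if __name__ == "__main__":
    print("Pads safe for flags/mailboxes:", common_pads(PAD_AVAILABILITY))
    print("Offending pads:", check_testbench_pads(["PA0", "PB1"], PAD_AVAILABILITY))
```

A check of this kind could run as part of pattern generation so that a flag or mailbox placed on a package-specific pad is rejected before simulation starts.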

Must-Do practice #2
In the tester pattern environment at verification level, there should not be any back-door loading of any memory location.

Issue Analysis Backdoor loading of memories in the VFT environment can mask many potential issues that will ultimately emerge when uninitialized memory is accessed in the design once the pattern is run on the tester.

System RAM or any other memories used by the pattern may be getting initialized at zero-time through backdoor loading, as a legacy from the normal simulation pattern environment (where this is often done to ensure no corruption in pattern execution).

But if a read happens on an uninitialized memory location on the tester (for example, due to a burst access, even if the downloaded code doesn’t write to these locations), an ECC (error correcting code) error will be generated in the silicon. This causes pattern failures with unpredictable, intermittent behavior, because the ECC bits at uninitialized locations are random. The real issue is then very difficult to debug, since backdoor loading ensures it is never evident in the simulation environment.

Solution Never initialize the memory through backdoor loading at zero-time in the testbench. This will help to catch errors due to accesses of uninitialized memory as a result of burst access or problems with code jumps and similar situations in the simulation stage. When a porthole address needs to be initialized beforehand, it must be done through the startup CRT (constrained randomized testing) code itself.

Must-do practice #3
For tester patterns, the start address for code (where core jumps after the initialization code execution) should always be aligned according to width of the instruction bus fetch.

Issue Analysis For example, say the instruction bus of the SoC core reads data in 64-bit aligned format. But suppose that the infrastructure for tester patterns is such that the downloaded code starts at an address that is only 32-bit aligned, not 64-bit aligned (e.g., 0x40000104), and the standard initial code initializes 256 bytes of the memory.

The result is that location 0x40000100 holds a random value, since it was neither initialized nor written by the downloaded code. When the core jumps to memory for code execution, it reads data with 64-bit alignment, so fetching the first instruction at 0x40000104 also reads 0x40000100. Since the data at 0x40000100 is random, the read can generate a multi-bit ECC error, leading to an exception being sent to the core, which then hangs.

Since the probability of an ECC error depends on the random data in the uninitialized location as well as on its random ECC bits, it should come as no surprise that the pattern passes only inconsistently on the tester (for example, 1 run in 10), making debugging even more difficult.

Solution Always keep the start address of the downloaded code at a location that is aligned according to the width of the instruction bus.
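
The arithmetic behind this rule can be sketched in a few lines of Python. The helper names are hypothetical, and the 64-bit fetch width is taken from the example above. The sketch shows that a fetch at 0x40000104 also covers 0x40000100, and where the next aligned start address lies.

```python
def is_aligned(address, fetch_width_bits):
    """True if address is a multiple of the fetch width expressed in bytes."""
    fetch_bytes = fetch_width_bits // 8
    return address % fetch_bytes == 0


def align_up(address, fetch_width_bits):
    """Round address up to the next fetch-width boundary."""
    fetch_bytes = fetch_width_bits // 8
    return (address + fetch_bytes - 1) // fetch_bytes * fetch_bytes


def fetch_window(address, fetch_width_bits):
    """Return (first, last) byte addresses of the aligned fetch containing address."""
    fetch_bytes = fetch_width_bits // 8
    base = address - (address % fetch_bytes)
    return base, base + fetch_bytes - 1


if __name__ == "__main__":
    start = 0x40000104
    first, last = fetch_window(start, 64)
    print(f"aligned: {is_aligned(start, 64)}")
    print(f"fetch covers 0x{first:08X}..0x{last:08X}")
    print(f"next aligned start: 0x{align_up(start, 64):08X}")
```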

Must-do practice #4
If cache is enabled in the initial code and the core initiates burst operations for fetching data, steps should be taken to ensure that the start address is aligned to the total width of the burst transaction.

Issue Analysis For example, say that with cache enabled the core makes a 4-beat wrapping burst read for every burst operation, with each beat reading 64 bits. If the start address is not 256-bit aligned, the first burst fetch will again read uninitialized memory and produce ECC errors, for the reasons discussed in Must-do practice #3.

Solution Choose the start address alignment to account for the core's burst operation and the enablement of its cache.
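
To see why a misaligned start address pulls in memory below the code, the sketch below lists the addresses touched by a 4-beat, 64-bit wrapping burst. It assumes the common wrapping rule in which the burst wraps at a boundary equal to beats times beat size. The actual behavior of a given core or bus may differ, and the function names are hypothetical.

```python
def wrapping_burst_addresses(start, beats=4, beat_bits=64):
    """List the beat addresses of a wrapping burst that begins at start.

    Assumes the burst wraps at a boundary of beats * beat size (32 bytes here).
    """
    beat_bytes = beat_bits // 8
    burst_bytes = beats * beat_bytes
    base = start - (start % burst_bytes)          # wrap boundary below start
    first_beat = start - (start % beat_bytes)     # beat containing start
    offset = first_beat - base
    return [base + (offset + i * beat_bytes) % burst_bytes for i in range(beats)]


def addresses_below_start(start, beats=4, beat_bits=64):
    """Beat addresses the burst reads that lie below the code start address."""
    return [a for a in wrapping_burst_addresses(start, beats, beat_bits) if a < start]


if __name__ == "__main__":
    for start in (0x40000108, 0x40000100):
        touched = ", ".join(f"0x{a:08X}" for a in wrapping_burst_addresses(start))
        below = addresses_below_start(start)
        print(f"start 0x{start:08X}: burst reads {touched}; below start: {len(below)}")
```

With a start address of 0x40000108 the burst wraps around and reads 0x40000100, which the downloaded code never wrote. With a 256-bit aligned start address nothing below the code is touched.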

Must-do practice #5
The porthole base memory location must be initialized through the startup constrained random testing (CRT) code.

Issue Analysis We use portholes for printing information from the patterns. Specific system memory locations are reserved for this, and printing involves an 8-bit write to the porthole location. On some core-platform systems, an 8-bit or 16-bit write to System RAM is always performed as a Read-Modify-Write (RMW). Since the read part of the RMW is 32 bits wide, an 8-bit write will produce ECC errors unless the porthole base locations have been initialized beforehand through the startup CRT code.

Solution The porthole base memory location must be initialized through the startup CRT code. Generally these information prints are only required for debugging and development. As a result, the testbench must have provision for a switch that can be passed specifically for tester patterns, which will suppress the printouts but still allow use of the macros (such as for register access). This will eliminate the problem altogether.

Must-do practice #6
The linker files used for making the hex code to be loaded and executed should be such that the hex does not have any address holes in it.

Issue Analysis If there are holes, that portion of the memory remains uninitialized/unwritten. There is then a high probability of code execution jumping to these empty locations, or of their being read as part of a burst access, leading to exceptions and ECC errors.

Solution Ensure that the testbench infrastructure for tester patterns at simulation level does not leave any holes in the generated hex.
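
One way to automate this check is sketched below in Python. It assumes the hex has already been parsed into (start address, length) segments. The parsing step depends on the hex format in use and is not shown, and the function name is hypothetical.

```python
def find_holes(segments):
    """Return the address gaps between loaded segments.

    segments: iterable of (start_address, length_in_bytes) pairs.
    Returns a list of (hole_start, hole_end_exclusive) pairs.
    """
    ordered = sorted(segments)
    holes = []
    if not ordered:
        return holes
    covered_end = ordered[0][0] + ordered[0][1]
    for start, length in ordered[1:]:
        if start > covered_end:
            holes.append((covered_end, start))
        # Overlapping or adjacent segments simply extend the covered range.
        covered_end = max(covered_end, start + length)
    return holes


if __name__ == "__main__":
    # Illustrative segments: the third one leaves a 0x80-byte gap.
    image = [(0x40000000, 0x100), (0x40000100, 0x80), (0x40000200, 0x40)]
    for lo, hi in find_holes(image):
        print(f"hole: 0x{lo:08X}..0x{hi - 1:08X} ({hi - lo} bytes)")
```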

Must-do practice #7
Always use zero-padding or fillers at the end of the hex so that no uninitialized locations are accessed during execution as a result of long instruction pipelines, instruction bus fetch width constraints, or burst accesses.

Issue Analysis PowerPC cores have long instruction pipelines, so they prefetch some additional data while fetching an instruction. The core instruction bus also always fetches a minimum read width and may have burst accesses enabled. So if the end of the hex is not aligned, the core may, while fetching a valid instruction, also fetch data from uninitialized locations with random ECC values. This may result in ECC errors, which may cause the core to take exceptions and stop executing altogether.

Solution During the linker and make process, care should be taken to put filler code at the end of the hex so that the code placed in memory ends on an aligned boundary. This ensures that no uninitialized memory locations are accessed because of long pipelined accesses, burst accesses, or the minimum width of instruction bus fetches. This can be done by using scripts to zero-pad the end of the hex.
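
A zero-padding script of the kind described might look like the following minimal sketch. It operates on a raw byte image rather than on a specific hex file format. The 32-byte default alignment is an assumption based on the 4-beat, 64-bit burst example in practice #4, and the function name is hypothetical. Extra padding to cover pipeline prefetch depth would have to be added on top.

```python
def pad_image(image, base_address, align_bytes=32, fill=0x00):
    """Return image extended with fill bytes so that it ends on an aligned address.

    image: bytes-like object holding the code to be loaded at base_address.
    align_bytes: required alignment of the end address (assumed burst size).
    """
    end_address = base_address + len(image)
    pad_len = (-end_address) % align_bytes
    return bytes(image) + bytes([fill]) * pad_len


if __name__ == "__main__":
    code = bytes(range(10))                      # 10 bytes of stand-in code
    padded = pad_image(code, 0x40000100)
    print(f"{len(code)} bytes padded to {len(padded)} bytes")
```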

Must-do practice #8
While doing Resim (resimulation), loadings should be kept to a minimum in the VFT environment. This can be done by replicating the earlier functional gate-level simulations (GLS) in the resim testbench.

Issue Analysis For some analog models and memories, certain behaviors need to be forced in the functional testbench, for example forcing x-generation logic, initializing porthole memories, or programming test rows of flash. Currently the functional testbench isn’t ported as-is to the resim environment. But the same design is used to create the VCD, so X-corruption can occur there too.

Solution Before starting Resim, all such necessary forcing and loading must be replicated from the functional testbench in the VFT environment to the resim environment.

Must-do practice #9
Care must be taken to ensure that no valid pad can be designated x/z for any portion of the VCD for the tester pattern. Also all unused pads must be masked off in the VCD.

Issue Analysis For patterns that test analog behaviors and analog pads, the unused pads should be masked in the resimulation environment. If they are not, they get looped back, and a high-z state occurs on these pads for some time due to the delay in the do->pad path. This leads to x-corruption in the design.

Also, valid pads being probed or driven in the VCD must never be allowed to be x/z for any portion of the VCD, since that can cause the same issue.

Solution Use an automated script to find all pads that take an x/z state, and either mask them off if they are unused or drive them with a safe value when they are not actively driven.
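
A minimal sketch of such a script is shown below in Python. It is not a complete VCD parser: it assumes one declaration or value change per line, ignores scopes and real-valued signals, and the pad names in the sample are hypothetical. A production flow would more likely rely on an existing VCD library or the simulator's own tooling.

```python
def find_xz_pads(vcd_text, pads):
    """Return {pad_name: [timestamps]} for each listed pad that takes an x or z value."""
    id_to_name = {}
    hits = {}
    now = 0
    in_header = True
    for raw in vcd_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_header:
            parts = line.split()
            # Declaration form: $var <type> <width> <id> <name> ... $end
            if parts[0] == "$var" and len(parts) >= 6:
                id_to_name[parts[3]] = parts[4]
            elif parts[0] == "$enddefinitions":
                in_header = False
            continue
        if line.startswith("#"):
            now = int(line[1:])                  # timestamp line
            continue
        if line.startswith("$") or line[0] in "rR":
            continue                             # keywords and real values are skipped
        if line[0] in "bB":
            value, _, ident = line[1:].partition(" ")   # vector: b<bits> <id>
        else:
            value, ident = line[0], line[1:]            # scalar: <bit><id>
        name = id_to_name.get(ident.strip())
        if name in pads and any(ch in "xXzZ" for ch in value):
            hits.setdefault(name, []).append(now)
    return hits


SAMPLE_VCD = '''
$timescale 1ns $end
$scope module top $end
$var wire 1 ! pad_a $end
$var wire 1 " pad_b $end
$var wire 4 # pad_bus $end
$upscope $end
$enddefinitions $end
#0
$dumpvars
0!
x"
b0000 #
$end
#10
1!
1"
#20
z!
b01x0 #
'''

if __name__ == "__main__":
    for pad, times in find_xz_pads(SAMPLE_VCD, {"pad_a", "pad_b", "pad_bus"}).items():
        print(f"{pad}: x/z at times {times}")
```

The resulting report can then be split into unused pads, which should be masked, and valid pads, which need a safe drive value at the reported times.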

Other important VFT practices
Some other practices that ensure smooth porting and direct mapping of the simulation behavior of the VCD/patterns with the behavior observed on tester include:

  • Allow the maximum possible time for analog modules to become stable and report their status (for example, the status from the voltage regulator for power-on-reset de-assertion), and for the scanning of initialization data from on-chip flash and similar operations at startup. Accordingly, you should delay programming in the VCD, regardless of the shorter times visible during simulation due to the use of behavioral models.
  • Make sure that the main clock on which the VCD is based is frozen (for example, before taking the system into reset) and restarted (once out of reset) at exact multiples of the cycle period, never on a half clock period. That is, the time from the positive edge of the clock before freezing to the positive edge of the clock on restarting should be an integral multiple of the clock period. This maintains cycle synchronization, which aids smoother operation on the tester (since its cycle-based nature needs to be maintained).
  • To finish testcase execution in a synchronized manner, the pattern may expect a return value (to signal pass/fail) from the core-side code, sent out on the same pads as those used for downloading the code and probed by the Verilog component. In that case, the probing must be enabled only after the download has finished. Otherwise, random code/data being downloaded can trigger a false pass/fail and finish the testcase before actual code execution even starts.
  • Test mode entry schemes and their startup and shutdown sequences must be developed such that the VCDs are compliant for running back-to-back on tester. Doing this will enable faster execution on the tester by trimming the start and end of the VCDs and joining them end-to-end, running one after the other, without needing a power-off/on in between.
  • If the design is such that the peripheral clocks are disabled by default during and after power-up, then each testcase should start with a clock-enabling macro to enable the clock of each module. Otherwise, the pattern may get stuck while accessing registers of modules whose clocks are not enabled.
  • The VCD should not have any internal signal forcing/probing or any hard delays.
  • As far as possible, the cache should not be enabled by default since it can cause unpredictable failures, making debugging difficult. 
  • As far as possible and as allowed by memory constraints, hex patterns should always be loaded into the backup-system RAM (which is active even in low-power modes) so that all patterns can be tuned to run into low-power modes if required.
  • The source code for the tester should be highly optimized to reduce memory and time during download. 
  • The pads used to signal an output on the tester should always have their slew rates programmed to values that ensure their fastest behavior. Otherwise you may end up debugging an unnecessary slow-pad issue.
  • For tester cases that produce a signature, no pad should toggle at the frequency of the pattern's input clock. In other words, the input clock should be the fastest clock in the VCD (apart from the PLL/DDR output clocks) to ensure proper sampling of all transitions.
  • INFO/DEBUG statements should never be used in tester patterns. They unnecessarily increase the code length, which strains the already limited memory on the tester, and they lead to unnecessary debugging of porthole issues (such as those involving the porthole locations). The scheme also doesn’t work reliably when implemented in silicon.
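
The clock freeze/restart rule in the list above lends itself to a simple automated check. The following Python sketch assumes edge times and the clock period are given as integers in picoseconds. The function names and the example numbers are hypothetical.

```python
def is_cycle_aligned(freeze_edge_ps, restart_edge_ps, period_ps):
    """True if the gap between the two positive edges is a whole number of periods."""
    gap = restart_edge_ps - freeze_edge_ps
    return gap > 0 and gap % period_ps == 0


def next_aligned_restart(freeze_edge_ps, earliest_restart_ps, period_ps):
    """Earliest restart edge at or after earliest_restart_ps that keeps cycle alignment."""
    gap = max(earliest_restart_ps - freeze_edge_ps, period_ps)
    cycles = -(-gap // period_ps)                # ceiling division on integers
    return freeze_edge_ps + cycles * period_ps


if __name__ == "__main__":
    period = 10_000                              # 100 MHz clock, in picoseconds
    print(is_cycle_aligned(50_000, 255_000, period))     # restart on a half period
    print(next_aligned_restart(50_000, 253_000, period))
```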

A stitch in time saves nine
These best-practice recommendations are the result of insights gained by generating the VFT testbench infrastructure, creating and simulating the tester patterns, and debugging the failures observed in simulation as well as on silicon.

Through the use of such “learning from experience” procedures, we've avoided unnecessary iterations of the simulation pass/silicon failure/pattern or infrastructure modification cycle for most of the patterns. Instead, issues are caught early in simulation itself, resulting in the delivery of pattern VCDs that are correct by construction.

This has also reduced functional tester pattern bring-up time, as well as tester time and debug effort. This combination ultimately brings down the overall cost of chip design and test while also resulting in higher quality performance at the customer end. We consider these techniques generic and capable of improving the overall efficiency of VFT engineers.

Part 1: Testing modes

Neha Srivastava is a lead design engineer at Freescale Semiconductor (Noida, India Design Centre), where she has worked in the Automotive and Industrial Solution Group (AISG) for over 5 years. She has a Bachelor of Engineering (B.E.) degree from Birla Institute of Technology and has worked on multiple SoCs in front-end verification and the Verification for Testing domain, with areas of interest being low-power designs, safety architectures, and high-performance systems.

Aashish Mittal is a principal design engineer at Freescale Semiconductor (Noida, India Design Centre), where he has worked in the Automotive and Industrial Solution Group (AISG) for over 12 years. He has a Master of Technology from Banaras Hindu University and has worked on multiple SoCs in front-end verification, testbench integration, and the Verification for Testing domain, with areas of interest being dual-core, security, debug, and low-power architectures.

Nitin Goel is a senior design engineer at Freescale Semiconductor India Pvt. Ltd, where he has worked in the Automotive Microcontroller Group (AMCG) for over 6 years. He graduated from Netaji Subhash Institute of Technology, Delhi, in 2006. Since then, he has been working in the front-end verification domain. Along with experience in IP-level, SoC-level, and core verification, he has been working in tester pattern development and debugging.
