Accelerating server-based system architecture compliance testing through emulation

System-level verification objectives raise a concern that validation might only be possible after the design and drivers are largely assembled and debugged, but waiting for the design to be finished before checking compliance isn’t helpful either.

Paul Cunningham (SVP/GM of System Verification Group at Cadence) recently wrote an article on how advanced server design is creating new challenges for successful system bring-up and how Cadence has been working with Arm to accelerate this process, in line with the Arm SystemReady program. Here, I’d like to share a little more detail on how, for our own hardware development, we have been able to get system-level test suites up and running within half a day.

New Challenges in Validating Server Platforms

New System-Level Verification Needs. (Source: Cadence Design Systems)

There are plenty of challenges with validating any large SoC, but two stand out as unique to this domain. The first is ensuring SystemReady compliance. SystemReady is a program established by Arm to ensure out-of-the-box compatibility for mainstream OSes. It started out targeting Arm-based servers (with ServerReady) and has since expanded into other devices and markets, including edge servers and embedded/IoT. Meeting this objective is a silicon designer’s responsibility, since significant elements of compatibility depend on how Arm cores are integrated in the SoC. To simplify and better ensure system compatibility for integrators, Arm developed the SystemReady package, which defines a set of checks to run on the UEFI layer that integrators must pass to meet this goal.

The second challenge is to prove PCI Express (PCIe) compatibility for the integration. PCIe plays a central role in servers for communication with now much smarter peripherals, for inter-chiplet communication and for remote boot. We tend to think of testing around I/O blocks as protocol-centric, but two factors push a need for system-level testing. First, the loose ordering memory model in Arm-based designs makes it possible to enhance performance under certain workloads. The tradeoff is that integrators can inadvertently create the possibility of deadlocks in PCIe packet processing under high-traffic conditions. Second, PCIe has evolved considerably from 3.0 onwards to support system-level optimizations such as TLP Processing Hints (TPH) and address translation services (ATS). These features demand broader system-level validation.

Such system-level verification objectives raise a concern that validation might only be possible after the design and drivers are largely assembled and debugged. How else, for example, could SystemReady tests running on top of the UEFI, or traffic-dependent tests, be exercised? But waiting for the design to be finished before checking compliance isn’t helpful either.

The System VIP Approach

Verification IPs (VIPs) are commonly used to generate external traffic or as a fast replacement for an IP to accelerate verification cycles. Comprehensively tested against well-defined standards, VIPs act as standard-compliant elements in component-level roles. System verification can benefit from similar concepts but with a different spin. Take, for example, coherency checking as a system-level verification objective. Though not directly related to our SystemReady and PCIe deadlock objectives, coherency checking is a common component of SoC testplans and is a good introduction to the System VIP concept.

System VIP Applied to Coherency Testing. (Source: Cadence Design Systems)

In this application, a System VIP will draw on a combination of components—traffic libraries, use cases, a Portable Stimulus Standard (PSS) model and a scoreboard. Some of these will be pre-proven and some developed by the SoC verification team for their unique objectives. The VIP principle behind this methodology is that system verification development should not require reinventing what is already defined by standards or best practices. Also important, it should be usable in guiding design refinement before the full system design is complete.

Unlike a conventional VIP, System VIP isn’t one component but a library of application-specific test components, which must be integrated together to build a system-level testbench (components may also include conventional VIP). The primary goal that all these components and the testbench integration should support is fast generation of application-targeted tests that are ready to apply to an SoC design with minimal or no testbench debug. Components include traffic libraries, a performance analyzer, a verification scoreboard and, pulling it all together, a testbench generator.

System VIP Testbench Construction

Traffic libraries, which may include coherency, performance and PCIe libraries among many others, encapsulate content to enable the rapid creation of scenarios built on top of the Accellera PSS. For example, a coherency library would contain a set of packaged use-cases that can be applied to coherent systems, such as false sharing scenarios using a configurable number of threads to deliver a complex coherency test case.
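
To make the false sharing use case concrete, here is a minimal Python sketch of how such a scenario could be parameterized by thread count. It is not PSS and not the library's actual content: the 64-byte line size and the names ThreadAccess, false_sharing_scenario and shares_line_without_overlap are all assumptions made for illustration.

```python
from dataclasses import dataclass

CACHE_LINE_BYTES = 64  # assumed line size; real designs vary


@dataclass(frozen=True)
class ThreadAccess:
    thread_id: int
    address: int  # byte address the thread repeatedly writes
    size: int     # access width in bytes


def false_sharing_scenario(base, num_threads, size=4):
    """Give each thread its own word, all packed into one cache line."""
    if base % CACHE_LINE_BYTES:
        raise ValueError("base must be cache-line aligned")
    if num_threads * size > CACHE_LINE_BYTES:
        raise ValueError("accesses do not fit in a single cache line")
    return [ThreadAccess(t, base + t * size, size) for t in range(num_threads)]


def shares_line_without_overlap(accesses):
    """True when all accesses hit one line but no two touch the same byte."""
    lines = {a.address // CACHE_LINE_BYTES for a in accesses}
    touched = [b for a in accesses for b in range(a.address, a.address + a.size)]
    return len(lines) == 1 and len(touched) == len(set(touched))


# Hypothetical base address, chosen only because it is line aligned.
scenario = false_sharing_scenario(base=0x8000_0000, num_threads=8)
```

Because no two threads touch the same bytes, any coherency traffic the scenario provokes comes purely from sharing the line, which is what makes false sharing a useful stress case.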

Building a test around such libraries requires configuration to the target SoC design. We have found it easiest to work with a methodology (System VIP) in which we can choose a use case, drag and drop it to a PSS canvas, then solve it. With PSS, we can reuse such a test on any SoC with any number of cores. Tests can use multiple System Traffic Libraries, such as the PCIe Library, to create I/O-coherent system tests.

Another important component is system performance analysis, for visualizing and analyzing the performance at key points across a typical SoC. On-chip bus performance provides valuable insight into the operation of the SoC. Typically, high-stress scenarios are used to explore the limits of system performance, looking for out-of-balance bandwidth sharing or deadlocks, for example.

A key component of a System VIP must be application-specific system-level checking and scoring. We offer a scoreboard that has plug-ins for a range of scenarios, covering on-chip buses and DDR memory models that enable it to track transactions as they traverse the entire SoC.
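
As a rough illustration of tracking a transaction across the SoC, the Python sketch below records each observation of a transaction at a monitor point and then checks the route and payload. The class name Scoreboard, the monitor point names and the two checks are assumptions for this example; they are not the product's plug-in interface.

```python
from collections import defaultdict


class Scoreboard:
    """Records where each transaction was observed and checks its route."""

    def __init__(self, expected_route):
        self.expected_route = list(expected_route)
        self.seen = defaultdict(list)  # txn_id -> [(point, data), ...]

    def observe(self, txn_id, point, data):
        """Called by a monitor each time it sees the transaction."""
        self.seen[txn_id].append((point, data))

    def check(self, txn_id):
        """Compare the observed hops against the expected route."""
        hops = self.seen.get(txn_id, [])
        points = [point for point, _ in hops]
        payloads = {data for _, data in hops}
        if points != self.expected_route:
            return f"route mismatch: {points}"
        if len(payloads) != 1:
            return "data changed in flight"
        return "ok"


# Hypothetical monitor points along a write path to DDR.
sb = Scoreboard(["cpu_port", "interconnect", "ddr"])
for point in ("cpu_port", "interconnect", "ddr"):
    sb.observe(txn_id=1, point=point, data=0xCAFE)
sb.observe(txn_id=2, point="cpu_port", data=0xBEEF)  # never reaches memory
```

Here transaction 1 checks as "ok", while transaction 2 is reported as a route mismatch because it was only seen at the first monitor point.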

Example Generated Testbench Architecture. (Source: Cadence Design Systems)

Finally, these components must be assembled, together with the device under test (DUT), into a testbench. For our purposes, using comma-separated values (CSV) or IP-XACT plus topology details, the generator creates either SystemVerilog testbenches in UVM for simulation or C-based testbenches for emulation. The testbenches are composed using VIP or Accelerated VIP (AVIP) for the various interfaces, based on the target execution engine.
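
The selection logic described above can be sketched in a few lines of Python. This is only an assumed shape for the idea: the CSV columns (interface, protocol), the function name plan_testbench and the output format are invented for illustration, and a real generator would emit testbench code rather than a plan.

```python
import csv
import io

# Hypothetical topology description; the column names are assumptions.
TOPOLOGY_CSV = """interface,protocol
pcie0,PCIe
ddr0,DDR
"""


def plan_testbench(csv_text, engine):
    """Pick the testbench language and VIP flavor for each interface."""
    if engine not in ("simulation", "emulation"):
        raise ValueError(f"unknown engine: {engine}")
    # Simulation uses VIP in a UVM testbench; emulation uses AVIP with C.
    flavor = "VIP" if engine == "simulation" else "AVIP"
    language = "SystemVerilog/UVM" if engine == "simulation" else "C"
    rows = csv.DictReader(io.StringIO(csv_text))
    components = [f"{row['protocol']} {flavor} on {row['interface']}" for row in rows]
    return {"language": language, "components": components}


emulation_plan = plan_testbench(TOPOLOGY_CSV, "emulation")
simulation_plan = plan_testbench(TOPOLOGY_CSV, "simulation")
```

The point of the sketch is that one topology description drives both targets, so the same interfaces get a simulation or an emulation testbench without being described twice.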

Application to SystemReady and PCIe Verification

As mentioned earlier, the Arm SystemReady test suite runs on top of the UEFI layer. We worked with Arm to define a map of these tests down to a bare metal (BM) abstraction layer, which was then captured in a System VIP SystemReady library. Using this map, we are now able to run the complete suite of 120 self-checking tests in minutes on our emulation platform. This path speeds up compliance testing and debug in development so effectively that we use these tests ourselves in validating new generations of our emulation hardware.

BSA+SBSA Pre-Silicon Certification. (Source: Cadence Design Systems)

For PCIe testing, we recommend that users first run the SystemReady test library (which includes basic PCIe tests) to make sure they are ready to run more comprehensive tests. Once that passes, the next aspect of testing is for potential deadlocks. PCIe is a packet-based protocol in which a packet may be split into multiple transactions. Some systems require a strict ordering for transactions to make sure that they are reassembled correctly at the requestor. Systems supporting a loose ordering can offer higher performance in some cases, but they also run the risk of deadlocks in high-traffic situations, where a requestor may be waiting for a packet and find itself effectively blocked.
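
One generic way to reason about a blocked requestor is a wait-for graph: each agent points at whatever it is waiting on, and a cycle means no agent in it can make progress. The Python sketch below shows that textbook technique only. It is not presented as how the System VIP detects deadlocks, and the agent names are hypothetical.

```python
def find_wait_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of agents, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so we closed a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(waits_for):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical agents: each entry lists what that agent is waiting on.
blocked = {
    "requestor": ["completion_queue"],
    "completion_queue": ["posted_write"],
    "posted_write": ["requestor"],
}
healthy = {"requestor": ["completion_queue"], "completion_queue": []}
```

In the blocked example the three agents wait on each other in a ring, so find_wait_cycle reports the loop; in the healthy example the chain ends and it returns None.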

As we move to multi-chiplet design, latencies to memory can jump significantly. The path is no longer just from a core MMU through a coherent interconnect to memory, but from the MMU through the coherent interconnect to memory mapped on another chiplet. Delays increase and the potential for deadlocks may again become non-negligible. System designers must detect and protect against this possibility.

In addition, from PCIe 3.0 onwards, the standard has added support for system-level services. One example is TLP processing hints (TPH), with which an endpoint sourcing a transaction hints at how the target processor should handle that transaction to optimize throughput. One such hint can accelerate processor performance by suggesting that a transaction be written directly to a cache line for a processor expected to need that data soon (cache stashing), without needing to write to backing store.

Another optimization supports remote distribution of address translation services (ATS). As endpoints become more intelligent, virtual-to-physical address translation can be offloaded to those endpoints, reducing the translation load on the system MMU. Here, validation would want to test for correct address translation, as well as correct invalidation and update of the address translation caches in those endpoints.
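
The checks named above (correct translation, then correct invalidation and update) can be illustrated with a toy model in Python. Everything here is an assumption made for the sketch: the 4 KB page size, the class names SystemMMU and Endpoint, and the simplification of invalidation to a direct method call rather than PCIe messages.

```python
PAGE_BYTES = 4096  # assumed page size


class SystemMMU:
    """Holds the authoritative virtual-page to physical-page mapping."""

    def __init__(self):
        self.table = {}
        self.endpoints = []

    def map_page(self, vpage, ppage):
        self.table[vpage] = ppage
        # A remap must invalidate any copy cached in an endpoint.
        for endpoint in self.endpoints:
            endpoint.invalidate(vpage)

    def translate(self, vpage):
        return self.table[vpage]


class Endpoint:
    """Endpoint with its own address translation cache (ATC)."""

    def __init__(self, mmu):
        self.mmu = mmu
        self.atc = {}
        mmu.endpoints.append(self)

    def translate(self, vaddr):
        vpage, offset = divmod(vaddr, PAGE_BYTES)
        if vpage not in self.atc:  # miss: ask the system MMU once
            self.atc[vpage] = self.mmu.translate(vpage)
        return self.atc[vpage] * PAGE_BYTES + offset

    def invalidate(self, vpage):
        self.atc.pop(vpage, None)


mmu = SystemMMU()
endpoint = Endpoint(mmu)
mmu.map_page(vpage=5, ppage=100)
first = endpoint.translate(5 * PAGE_BYTES + 8)   # fills the ATC
mmu.map_page(vpage=5, ppage=200)                 # remap invalidates the ATC
second = endpoint.translate(5 * PAGE_BYTES + 8)  # must see the new mapping
```

If the invalidation step were skipped, the second lookup would still return the old physical page from the cache, which is exactly the stale-translation failure a system-level test needs to catch.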

Validating correct PCIe behavior at the system level (against deadlocks, with TPH, for ATS and for other system-level PCIe services) obviously requires much more than protocol checks. In the System VIP traffic library, we have components to model endpoints, from which we can drag and drop endpoint-to-memory tests. These will drive traffic from outside the chip, into memory and back again. They can be used for performance testing and, more generally, simply to test the correct operation of PCIe and related paths. Similarly, we have examples of how to build TPH and ATS scenarios and how to very quickly create tests to validate these use models. Together, these features leverage traffic libraries and generation, scoreboarding and performance testing to quickly build system-level PCIe tests.


Historically, system-level validation has been viewed as the endpoint of verification, certainly leveraging libraries and some reusable components, but not in itself reusable. However, we now find that these system-level methods are essential to keep verification and validation in step with the advanced design techniques emerging in server architecture, in wireless infrastructure, automotive and many more applications. We have found that the System VIP concept can be an important accelerator for building tests of this type by simplifying construction of common tests through configurable generators.

Nick Heaton is Distinguished Engineer at Cadence Design Systems with specific responsibility for SoC Verification and is the architect of the SystemVIP product range. He graduated with First Class Honours in Special Engineering from Brunel University and has worked in SoC Design and Verification for more than 35 years both from the SoC developer side and the EDA side.
