Optimizing pre-silicon software development - Embedded.com

In today’s fast-moving technology era, the most common approach for handling the needs of the market is a system-on-chip (SoC). An SoC is essentially a processor surrounded by function accelerators and extensive I/O for the peripherals it supports. Since the mobile data revolution in 2002, SoCs have become a prerequisite for delivering the key features that define a smartphone. In the same way, SoCs have since become the go-to device for creating “smart” consumer products like TVs, cars, and the ever-expanding Internet-of-Things (IoT) market.

The growing demand for SoCs has created a highly competitive market. As a result, SoCs are getting more complex, their peripherals are continuously evolving, and the time to market is shrinking. The availability of software is crucial to keeping pace with the complexity of SoC development. There is little room for mistakes, and software must be ready as soon as possible. To meet this challenge, software development must start before SoC silicon is available.

SoC software development

Traditionally, software development would begin after the first silicon sample arrived from manufacturing. When SoC samples arrived, software and validation teams would start their development activities, and a big SoC bring-up effort would kick off. Teams working on the SoC would converge from around the globe to be under one roof for a limited amount of time to support the SoC bring-up.

Software development generally took months after the first sample arrived before it was ready for production. Meanwhile, silicon validation would complete, which would give a limited amount of confidence to initiate mass production for the associated products.

Due to the increasing complexity of SoC designs, however, what once took months of software development could now stretch into years before the software was ready for production. The growing number of supported peripherals, and the evolution of those peripherals, also created gaps in subject matter expertise. Software teams had to fill those gaps by bringing in new developers with expertise in the relevant domains (audio, video, USB, Ethernet, etc.).

To deliver production-ready software early in the project, software development can’t wait until the first silicon sample is available. A shift-left approach is needed, in which software development starts as early as possible and, ideally, at the same time as the SoC hardware design. Pre-silicon software development may also help identify SoC implementation bugs and potentially reduce the cost of metal fixes or full-mask tapeouts. Several methodologies can be considered to meet these requirements.

Pre-silicon development approaches

To start software development before SoC tapeout, developers can use approaches such as software prototyping, an RTL testbench, FPGA boards, or hardware emulators. Each approach has its own challenges, largely because several of them focus on individual modules while the objective is to develop software for bring-up of the whole SoC. If we break the problem into smaller modules, the first thing needed before driver development can begin is knowledge of each processor, accelerator, or peripheral under development.

System C models

C behavioral models can be built for each IP block of the SoC, and standalone software drivers can be tested on these behavioral models. But this approach has two problems. First, it requires a huge software effort, which means a large software team or a dedicated model team is needed to implement the models themselves. Hence the development of models would not be cost-effective. Second, the accuracy of a behavioral model depends on the interpretation of its developer. Any communication gap between the IP design owner and the model developer can result in inaccurate behavior, and a lot of effort is then wasted fixing issues that stem from misinterpreting the design.
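As an illustration of the idea, here is a minimal sketch of a behavioral model and a driver exercised against it. It is written in Python for brevity rather than C, and everything in it is hypothetical: the UART, its register offsets, the status bit, and the `busy_polls` timing are invented for the example and do not describe any real IP block.

```python
"""Behavioral model of a hypothetical UART, plus a driver tested against it."""

# Hypothetical register map (assumed for illustration only).
REG_DATA = 0x00          # write: byte to transmit
REG_STATUS = 0x04        # read: bit 0 set when the transmitter is ready
STATUS_TX_READY = 0x1


class UartModel:
    """Behavioral (not cycle-accurate) model of the transmitter.

    After each byte is written, the model reports "busy" for a fixed
    number of status reads. That number is the model developer's
    interpretation of the hardware, which is exactly where inaccuracy
    can creep in.
    """

    def __init__(self, busy_polls=2):
        self.busy_polls = busy_polls
        self._busy = 0
        self.transmitted = bytearray()

    def read(self, offset):
        if offset == REG_STATUS:
            if self._busy > 0:
                self._busy -= 1
                return 0
            return STATUS_TX_READY
        raise ValueError(f"read from unmapped offset {offset:#x}")

    def write(self, offset, value):
        if offset == REG_DATA:
            if self._busy > 0:
                raise RuntimeError("DATA written while transmitter busy")
            self.transmitted.append(value & 0xFF)
            self._busy = self.busy_polls
            return
        raise ValueError(f"write to unmapped offset {offset:#x}")


def uart_send(bus, payload, max_polls=1000):
    """Driver under test: poll STATUS until ready, then write one byte."""
    for byte in payload:
        for _ in range(max_polls):
            if bus.read(REG_STATUS) & STATUS_TX_READY:
                break
        else:
            raise TimeoutError("transmitter never became ready")
        bus.write(REG_DATA, byte)


if __name__ == "__main__":
    model = UartModel()
    uart_send(model, b"hello")
    print(model.transmitted.decode())
```

Because the driver only talks to an object with `read` and `write` methods, the same driver logic could in principle be pointed at a different backend later. If the real transmitter stays busy longer than the model assumes, however, the driver may pass here and still fail on silicon.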

RTL testbench

To address this inaccuracy issue, another approach is to use a Verilog testbench. The testbench is typically developed and maintained by the SoC design team for verification. It is based on the register-transfer level (RTL) description of the SoC and represents the full SoC, not just some IP blocks. Consequently, it is cycle-accurate. As the RTL develops, the testbench moves in lockstep with it, which ensures that it is the most up-to-date and accurate representation of the SoC while it is under development. The same testbench can also be used to develop software drivers.

Software developed using this method is accurate and can help reduce software bring-up time when SoC samples arrive from fabrication. But there is a problem with this approach. Because the Verilog testbench is cycle-accurate, it is very slow. Developing software in such an environment is possible, but development and debugging will be extremely slow, and it may take months to develop a driver. One option is to start much earlier, which lengthens the pre-silicon phase to compensate for the slow speed (and depends on the testbench being available that early). Alternatively, a separate software team can be dedicated to pre-silicon development with this methodology, which increases the number of resources needed. Either way, the cost problem remains, much as it does with the C behavioral model method.

In practice, we cannot accept inaccurate models or long development cycles, nor can we accept the additional cost of duplicating or increasing resources to keep a normal design cycle timeline. Consequently, we have to consider another approach to pre-silicon software development: emulating each SoC IP block on a field-programmable gate array (FPGA).

FPGA prototypes

Modern FPGAs are fairly fast, and since FPGA prototypes are built from RTL, they are cycle-accurate. With increasing design complexity, IP blocks have far more gates than they did years ago. FPGAs were once limited in the number of ASIC gates they could hold, which meant larger logic blocks could not fit into a single FPGA. Now it’s possible to build an FPGA prototype for each block and develop drivers on it quickly and accurately.

This methodology is faster and doesn’t require software teams to commit resources as early. However, because it works with each IP block separately rather than with the entire integrated SoC design, it prevents full SoC-level software development. It omits the integration details of how the various IP blocks function together. Hence, although this method lowers the bring-up effort, gaps remain. It could be an acceptable approach for derivative SoCs, which have a limited number of changes, but it doesn’t provide the full coverage required for SoC software development.

Figure 1. Synopsis of pre-silicon software development solutions. (Source: Nitin Garg)

SoC emulators

To address accuracy, speed, and coverage together, a more robust approach is to use SoC emulators. Many commercially available SoC emulators can emulate very large and complex SoCs. SoC emulators are based on RTL, so they are accurate, and they are around 100 times faster than a Verilog testbench, which makes them much better suited to software development. Because they are fairly fast, full OS porting and driver development can be performed in a reasonable amount of time. SoC emulators can scale to the whole SoC, so the software developed on them is better matched to the final production SoC.
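To make the speed gap concrete, the arithmetic can be sketched as follows. Only the 100x ratio comes from the text above. The workload size and the testbench simulation rate are assumed, illustrative numbers, not measurements of any particular tool or SoC.

```python
"""Rough wall-clock estimate for running one workload on each platform."""

# Assumed, illustrative values (not measured data).
WORKLOAD_CYCLES = 2_000_000_000   # hypothetical boot-style workload, in SoC clock cycles
TESTBENCH_HZ = 1_000              # hypothetical RTL simulation rate, cycles per second

# Ratio taken from the article: emulators run about 100x faster.
EMULATOR_SPEEDUP = 100


def wall_clock_hours(workload_cycles, cycles_per_second):
    """Hours needed to execute workload_cycles at the given effective rate."""
    return workload_cycles / cycles_per_second / 3600.0


if __name__ == "__main__":
    testbench_hours = wall_clock_hours(WORKLOAD_CYCLES, TESTBENCH_HZ)
    emulator_hours = wall_clock_hours(
        WORKLOAD_CYCLES, TESTBENCH_HZ * EMULATOR_SPEEDUP
    )
    print(f"Verilog testbench: {testbench_hours:.1f} h "
          f"({testbench_hours / 24:.1f} days)")
    print(f"SoC emulator:      {emulator_hours:.1f} h")
```

Under these assumptions the same workload moves from weeks of simulation to a matter of hours, which is the difference between a run that can only be attempted occasionally and one that fits into a daily debug loop.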

Using SoC emulators for pre-silicon software development and design reduces software bring-up time and effort, since it can eliminate or reduce overall development gaps. Software can also be debugged using standard JTAG tools on an SoC emulator. Emulators can be used for multiple tasks, such as ROM development and verification, firmware and OS development, and IP or SoC-level verification. Another interesting feature of SoC emulators is that they can interface the SoC to real components like those found on a development board. For instance, it’s possible to connect a real or virtual NAND device to the SoC in an emulator and develop ROM code, OS drivers, and the like.

SoC emulators offer far more possibilities than other software development approaches. Emulators can interface the SoC simultaneously to UART, I2C, various displays, storage devices, PCIe devices, connectivity devices like Ethernet and Wi-Fi, and capture devices like cameras and sensors. In other words, SoC emulators can represent an actual development board, so one can bring up a complete framework like Android and run a complete use case before taping out the SoC. For example, booting Android and decoding a few frames of video on an SoC emulator may take a few hours but could be very useful in analyzing SoC performance.

Given the growing number of peripherals on an SoC, SoC emulation is also very useful for performance benchmarking, which can highlight weaknesses in the design before tapeout. This can reduce the risk of subsequent tapeouts caused by performance shortcomings that would otherwise go unidentified. SoC emulators also make it possible to interface the SoC to a third-party FPGA or soft model if needed for third-party IP.

An emulator is also helpful for debugging problems after the SoC sample arrives, because it runs the same OS, drivers, and framework as the real hardware. Often there is a need to replicate issues observed in silicon on the emulator so that they can be investigated at the signal level. Using the same software on the emulator and on silicon provides faster and more accurate reproduction of the issues, with full access to the details inside the chip.

Comparing the different SoC software development approaches, SoC emulators are the better choice from both a pre-silicon development and a post-silicon debug perspective. The cost for software teams to run SoC emulators may seem high. But the contributions SoC emulators make, by making production software available sooner and helping reduce risks and costs, may prove invaluable when considering the impact on time-to-market targets. Other software development approaches don’t have the same coverage, which is risky, and may require larger software teams. All factors considered, using approaches other than SoC emulators may prove to be much more costly.


Figure 2. Comparative execution speed of each solution. (Source: Nitin Garg)

Per Moore’s law, the transistor count of an integrated circuit (IC) doubles roughly every two years, which allows ever more functionality to be integrated into the IC. Most Arm-based 64-bit SoCs today have 100 to 300 million logic gates. Of the current SoC software development approaches, SoC emulators have proven able to scale and support the needs of software development teams facing the increasing complexity of SoCs in today’s competitive market.
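The doubling rule above can be turned into a quick projection. This is only a sketch: Moore’s law is stated for transistor counts, so applying the same two-year doubling to the 100 to 300 million logic-gate range is an assumption made here for illustration, and the resulting figures are not forecasts.

```python
"""Project a gate or transistor count forward under a fixed doubling period."""


def projected_count(current_count, years, doubling_period_years=2.0):
    """Count after `years`, assuming it doubles every doubling_period_years."""
    return current_count * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Starting points taken from the range quoted in the text.
    for start in (100e6, 300e6):
        for years in (2, 4, 6):
            gates = projected_count(start, years)
            print(f"{start / 1e6:.0f}M gates today -> "
                  f"{gates / 1e6:.0f}M after {years} years")
```

Whatever the exact rate, the point for pre-silicon planning is that the design a platform must hold keeps growing, so capacity has to be weighed alongside speed and accuracy.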

References

  1. Trimberger, Stephen M. “Three Ages of FPGAs.” IEEE Xplore, 2015, ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7086413.
  2. Brunet, Jean-Marie. “Why Modern SoC Designs Embrace Emulation.” Embedded Computing Design, 5 Sept. 2018, embedded-computing.com/embedded-computing-design/why-modern-soc-designs-embrace-emulation.
  3. “SoC Emulation.” Aldec, 2019, www.aldec.com/en/solutions/hardware_emulation_solutions/co-emulation–soc-emulation.
  4. Moore, Gordon E. “Cramming More Components onto Integrated Circuits.” www.cs.utexas.edu, 2006, cs.utexas.edu/~fussell/courses/cs352h/papers/moore.pdf.

Nitin Garg is a Principal Engineer at NXP Semiconductors USA, Inc. with over 20 years of experience in the field of embedded system software. Nitin is the Global Kernel BSP lead for i.MX software and is responsible for pre-silicon development via emulation, SoC bring-up, validation, embedded software development, and the delivery and support of the Kernel BSP.

 
