The design of low-power chips has taken on a fundamental role in recent years, driven by the growing demand for electronic devices that are ever more miniaturized and consume ever less power to support battery operation. The use of artificial intelligence (AI) — increasingly present in wearable devices, IoT devices, and embedded systems in general — is imposing arduous challenges on designers committed to developing low-power chips with denser, more innovative architectures and manufacturing processes.
Appropriate power-analysis techniques and tools are needed to assist engineers in the design of advanced AI chips, addressing their specific requirements such as overall functionality, manufacturability, cost, and reliability.
The aim of low-power design is to reduce the overall dynamic and static power consumption of an integrated circuit (IC), a crucial aspect for enabling next-generation applications. Dynamic power comprises switching and short-circuit power, whereas static power mainly consists of leakage current. The power equation, which includes the three contributions mentioned above, is shown in Figure 1.
Figure 1: Power components and equation (Source: Synopsys)
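The power equation in Figure 1 can be sketched numerically. The helper below is illustrative only: the function name and the example component values are assumptions for demonstration, not part of any Synopsys tool.

```python
def total_power(alpha, c_eff, vdd, freq, i_short_avg, i_leak):
    """Sum the three power components of an IC.

    switching:     alpha * C_eff * Vdd^2 * f  (capacitive charge/discharge,
                   scaled by switching activity factor alpha)
    short-circuit: Vdd * I_short_avg          (both transistors briefly on
                   during a transition)
    leakage:       Vdd * I_leak               (static; flows even when idle)
    """
    p_switching = alpha * c_eff * vdd ** 2 * freq
    p_short = vdd * i_short_avg
    p_leakage = vdd * i_leak
    return p_switching + p_short + p_leakage

# Hypothetical values: 20% activity, 1 nF effective capacitance,
# 0.8 V supply, 1 GHz clock, 1 mA average short-circuit current,
# 5 mA leakage current.
p = total_power(0.2, 1e-9, 0.8, 1e9, i_short_avg=1e-3, i_leak=5e-3)
```

With these numbers the switching term (0.128 W) dominates the short-circuit (0.8 mW) and leakage (4 mW) terms, which matches the dynamic-power-dominated regime the article describes for FinFET nodes.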
In the years when the IC manufacturing process was based on technologies from 90 nm down to 16 nm, designers' attention was focused on reducing leakage power, as it carried a greater weight (85% to 95%) than dynamic power (10% to 15%). With the subsequent transition from 16 nm to 14 nm, the power equation changed: leakage power was largely under control, while dynamic power became the more important issue. That was due, above all, to the transition from the planar to the FinFET transistor architecture, a multigate device built on a substrate in which the gate is placed on two, three, or four sides of the channel, or wrapped around it, forming a double- or even multi-gate 3D structure.
In the next few years, continuous advances in electronic manufacturing will lead to process nodes at 7, 5, or even 3 nm, bringing the importance of leakage power back to the fore.
The new challenges of AI
The increasingly widespread use of AI in electronic applications introduces new types of power challenges. The performance, power, and area (PPA) paradigm remains a target to be achieved by designers. The difference is that, with the introduction of the AI chip, it becomes more difficult to maximize performance without sacrificing power. Today, performance is actually limited by power, and it's very hard to deliver power reliably to every part of the chip without worrying about dissipated heat and thermal management.
The quality of the vectors, defined as the realistic activity seen when the SoC is working in a real system, is crucial for dynamic power analysis and optimization.
“The biggest problem is to estimate the workload, especially when the SoC is running in the field, on a real system,” said Godwin Maben, low-power architect and fellow for Synopsys Design Group. “We need to know the workload for measuring and optimizing the dynamic power. When it comes to AI, there are no predefined benchmarks. We need to identify these workloads, make sure they are captured and that power is debugged earlier.”
Designing for low power means understanding power's ramifications across software development, hardware design, and manufacturing. It is not a single-step activity; it should run throughout the entire chip design process, with the aim of reducing overall dynamic and static power consumption.
As shown in Figure 2, the design and verification methodology is divided into the following main phases:
- Static power verification and exploration
- Dynamic power verification and analysis
- Software-driven power analysis
- Power implementation
Figure 2: Design and verification phases (Source: Synopsys)
The role of emulation
Estimating the power consumption of an SoC is a difficult task, requiring designers to set up testbenches that reproduce real operating conditions as faithfully as possible. The system best suited to meeting these requirements is emulation.
Running a power analysis for an AI chip requires suitable tools able to acquire and process hundreds of gigabytes of data spanning billions or trillions of clock cycles. Power profiling within an emulation system helps solve this issue, as it can identify just the windows of interest for power analysis.
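To illustrate the window-of-interest idea, the sketch below scans a per-cycle toggle-count trace (such as one dumped from an emulator) for high-activity regions worth detailed power analysis. The function and its thresholding scheme are hypothetical, not part of the ZeBu flow.

```python
def find_power_windows(toggles, window, threshold):
    """Return (start, end) cycle ranges whose average toggle count
    exceeds `threshold` -- candidate windows of interest for a
    detailed (and expensive) power analysis run."""
    windows = []
    for start in range(0, len(toggles) - window + 1, window):
        chunk = toggles[start:start + window]
        if sum(chunk) / window > threshold:
            windows.append((start, start + window))
    return windows

# A toy trace: quiet, a burst of activity, then quiet again.
trace = [1] * 10 + [50] * 10 + [2] * 10
hot = find_power_windows(trace, window=10, threshold=10)
```

Only the middle burst is flagged, so the downstream power tool would process 10 cycles instead of 30.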
“With AI chips, two new concepts came in,” said Maben. “The first one is that verification debug is challenging, because it takes a long time. The second one is how to develop application software that can be ready by the time the chip is up. This is where the concept of emulation and prototyping came into the picture.”
Thanks to its unique Fast Emulation architecture, the most advanced commercial FPGAs, and innovations in FPGA-based emulation software, Synopsys's ZeBu Server is the industry's fastest emulation system, delivering 2× the performance of legacy emulation solutions. ZeBu software provides users with valuable tools such as a fast compiler, advanced debug (including native integration with Verdi), simulation acceleration, hybrid emulation, and power analysis.
When an application is run on an emulator, it eventually gets translated into vectors for the SoC. These vectors can then be used to run a simulation, thus validating the functionality of the chip in the emulator. Emulation is the right platform to get the workload, as it generates the vectors targeted for power-analysis optimization. As shown in Figure 3, ZeBu EmPower vectors are used by PrimePower RTL to provide useful information to the designers.
Figure 3: Synopsys software-driven SoC activity (Source: Synopsys)
AI chips make heavy use of mathematical functions, mainly multiplication and matrix manipulation, performed by dedicated and optimized combinational logic.
“The moment we go into these compute-intensive applications, the new concept that designers are worried about is the glitch power at a lower geometry,” said Maben. “Glitch power is more than 25% of the total power, and we know glitch power means a waste of power.”
The amount of glitch activity is proportional to the number of operations executed by the SoC, making glitch an important problem to address for AI accelerators. There are two types of glitches: inertial and transport glitches. Inertial glitches can be addressed architecturally, whereas transport glitches are caused by differing delays through logic cells, which produce different arrival times at the inputs of downstream gates. Glitches are becoming a very big topic, as they are very hard to measure and optimize.
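A transport glitch can be illustrated with a toy gate-level trace: both inputs of an XOR gate settle to the same value, but one arrives later because of a longer combinational path, so the output briefly pulses high and wastes switching energy. This is a minimal sketch under a simple sampled-time model, not a real simulator.

```python
def xor_waveform(a_events, b_events, horizon):
    """Sample z = a XOR b at integer time steps, applying input
    transitions from {time: value} schedules as they occur."""
    a = b = 0
    wave = []
    for t in range(horizon):
        a = a_events.get(t, a)  # apply any scheduled transition on input a
        b = b_events.get(t, b)  # apply any scheduled transition on input b
        wave.append(a ^ b)
    return wave

# Both inputs go 0 -> 1, but b lags a by two steps (longer path):
# the output should stay 0, yet it pulses high for two cycles.
glitch = xor_waveform({3: 1}, {5: 1}, horizon=8)

# Each spurious output toggle dissipates dynamic power for no
# functional reason -- that waste is what glitch analysis measures.
wasted_toggles = sum(glitch[i] != glitch[i - 1] for i in range(1, len(glitch)))
```

Delaying the early input (or retiming the logic so both paths match) removes the pulse, which is the kind of fix a glitch-ranking report enables.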
Synopsys offers an end-to-end RTL-to-gate glitch power analysis and optimization solution. At RTL, PrimePower RTL (see Figure 4) can compute and report glitch per hierarchy, and it can also point to the RTL source line of code generating the highest level of glitch. The PrimePower solution also offers delay-/glitch-aware vector generation using RTL simulation and can perform glitch power analysis using zero-delay gate-level simulation or timing-aware simulation correlating closely to SPICE power numbers.
“Glitches are becoming dominant, especially in AI chips and at a lower geometry,” said Maben. “There are tools like PrimePower RTL, which can tell the designer which blocks are more glitchy and rank them. Architects can then change the architecture to make it less glitchy.”
Figure 4: PrimePower RTL glitch power analysis (Source: Synopsys)
>> This article was originally published on our sister site, Power Electronics News.